[Edited Dec 6, 2010 to mention another solution based on zip and iter.]
Suppose you want to divide a Python list into sublists of approximately equal size. Since the number of desired sublists may not evenly divide the length of the original list, this task is (just) a tad more complicated than one might at first assume.
One Python Cookbook entry is:
def slice_it(li, cols=2):
    start = 0
    for i in xrange(cols):
        stop = start + len(li[i::cols])
        yield li[start:stop]
        start = stop
which gives the exact number of subsequences, while varying the length of the subsequences a bit if necessary. It uses Python's slicing feature to get the lengths.
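For readers on Python 3, here is a minimal sketch of the same recipe, assuming only that xrange is swapped for range; the docstring, comments, and sample data are added for illustration and are not part of the Cookbook entry.

```python
def slice_it(li, cols=2):
    """Yield `cols` contiguous slices of `li` whose lengths differ by at most one."""
    start = 0
    for i in range(cols):
        # li[i::cols] holds the items at positions i, i+cols, i+2*cols, ...
        # Only its length is used, to decide how long this slice should be.
        stop = start + len(li[i::cols])
        yield li[start:stop]
        start = stop


parts = list(slice_it(list(range(1, 11)), 4))
print(parts)  # [[1, 2, 3], [4, 5, 6], [7, 8], [9, 10]]
```

The longer pieces come first because the strided slices starting at the lowest offsets are the ones that pick up the extra elements.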
That recipe was written in response to an earlier Cookbook entry, which had the following one-liner:
[seq[i:i+size] for i in range(0, len(seq), size)]
I like that it's a one-liner, but I don't like a couple of things about it. If your goal isn't a particular sublist length but rather to divide the list into a given number of pieces, you need another line to compute the size. And then it doesn't turn out too well. Suppose you want to divide a list of length 10 into 4 sublists:
>>> size=10/4
>>> size
2
>>> seq = [1,2,3,4,5,6,7,8,9,10]
>>> [seq[i:i+size] for i in range(0, len(seq), size)]
[[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
This leaves us with one sublist more than desired.
Try setting size to 3 to get fewer substrings:
>>> size=3
>>> [seq[i:i+size] for i in range(0, len(seq), size)]
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
This leaves us with dissimilar lengths.
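The two attempts above can be put side by side in a short Python 3 sketch. The helper name fixed_size_chunks and the variable names are hypothetical, chosen only for this illustration. Note that in Python 3 the size must be computed with // (or math.ceil), because 10/4 is the float 2.5 and range() rejects a float step.

```python
import math


def fixed_size_chunks(seq, size):
    # The earlier Cookbook one-liner, wrapped in a function (hypothetical name).
    return [seq[i:i + size] for i in range(0, len(seq), size)]


seq = list(range(1, 11))
wanted = 4

floor_size = len(seq) // wanted            # rounds down to 2
ceil_size = math.ceil(len(seq) / wanted)   # rounds up to 3

by_floor = fixed_size_chunks(seq, floor_size)
by_ceil = fixed_size_chunks(seq, ceil_size)

print(len(by_floor), [len(c) for c in by_floor])  # 5 [2, 2, 2, 2, 2]
print(len(by_ceil), [len(c) for c in by_ceil])    # 4 [3, 3, 3, 1]
```

Rounding down gives one piece too many; rounding up gives the right count here, but with a lopsided last piece.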
Here's a briefer one-liner using the slice idea, which doesn't require you to compute the length in advance, and does give the exact number of subsequences you want and with lengths that are more appropriately divided:
[seq[i::num] for i in range(num)]
The drawback here is that the sublists are not contiguous runs of seq; each one takes every num-th element, so seq is sliced and diced. But in many situations that doesn't matter. In any case, all the elements are in the output and the sublists are as close as possible to the same length:
>>> seq = [1,2,3,4,5,6,7,8,9,10]
>>> num = 4
>>> [seq[i::num] for i in range(num)]
[[1, 5, 9], [2, 6, 10], [3, 7], [4, 8]]
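As a quick check of those two claims, here is a small Python 3 sketch. The function name stride_split is hypothetical; it simply wraps the one-liner so the result can be inspected.

```python
def stride_split(seq, num):
    # Piece i takes the elements at positions i, i+num, i+2*num, ...
    return [seq[i::num] for i in range(num)]


seq = list(range(1, 11))
parts = stride_split(seq, 4)
print(parts)  # [[1, 5, 9], [2, 6, 10], [3, 7], [4, 8]]

lengths = [len(p) for p in parts]
print(max(lengths) - min(lengths) <= 1)              # True: lengths differ by at most one
print(sorted(x for p in parts for x in p) == seq)    # True: no element is lost
```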
Update: I just read about a clever and interesting solution involving zip and iter that works in Python 2.7:
>>> items, chunk = [1,2,3,4,5,6,7,8,9], 3
>>> zip(*[iter(items)]*chunk)
[(1, 2, 3), (4, 5, 6), (7, 8, 9)]
Read a full explanation at Go Deh! The disadvantage, from a practical point of view, is that if the list of items is not evenly divisible into chunks, some items get left out. But I still like it because it's illuminating about the nuances of iterators.
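To make the mechanism and the drawback concrete, here is a hedged Python 3 sketch (in Python 3, zip returns an iterator, so it is wrapped in list). The variable names shared, dropped, and padded are illustrative only. The second half uses itertools.zip_longest from the standard library, which pads the final tuple instead of dropping the leftovers.

```python
from itertools import zip_longest

items, chunk = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3

# [iter(items)] * chunk is a list holding the SAME iterator object three times,
# so zip builds each tuple by pulling three consecutive items from it.
shared = [iter(items)] * chunk
dropped = list(zip(*shared))
print(dropped)  # [(1, 2, 3), (4, 5, 6), (7, 8, 9)]  (the 10 is silently lost)

# zip_longest keeps going until the iterator is exhausted, filling gaps with None.
padded = list(zip_longest(*[iter(items)] * chunk))
print(padded)   # [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, None, None)]
```

Whether padding with None is acceptable depends on the data; if None can be a legitimate item, zip_longest's fillvalue argument can supply a different placeholder.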
Thanks for posting this... I was looking for an elegant way of partitioning python lists.
Posted by: Dave Dash | October 17, 2008 at 04:38 PM
Thanks for the list comprehension info. I was looking for that solution exactly, thanks for posting it :)
Posted by: Eric Pavey | September 28, 2009 at 01:26 PM
Very nice.
Posted by: Dale | August 17, 2010 at 07:18 PM
Elegant and uses the language the way it was meant to be used. I looked at the other items in the ActiveState cookbook and thought, "clunky! I just want a way to parse a long list into smaller lists of a given length."
Thanks for the insight.
mp
Posted by: Michael Powe | August 24, 2010 at 06:27 PM
As I tried to say in the comment section of the Go Deh! blog (and failed), you can get the slicing behaviour without losing any items by using the zip_longest function instead of zip.
Posted by: nagisa | December 06, 2012 at 02:43 PM
Posted by: Tom Lynn | December 06, 2012 at 02:50 PM
nagisa, thanks for the note about zip_longest, but I'm not sure what it is. Do you mean izip_longest from itertools? That gives a different result, because it puts the remainder items into a separate tuple, filled out with Nones. Could be useful in some cases, so it's good to know about.
>>> from itertools import izip_longest
>>> items, chunk = [1,2,3,4,5,6,7,8,9,10], 3
>>> list(izip_longest(*[iter(items)]*chunk))
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, None, None)]
Posted by: Gary | December 06, 2012 at 03:07 PM
Or an integer equivalent:
Posted by: Tom Lynn | December 06, 2012 at 03:21 PM
wonderful solution. thanks for sharing.
Posted by: rodney | February 20, 2013 at 07:49 AM
Thank you for posting this. I had been looking for a way to do this, but everything else I found was how to generate chunks of size n, rather than n chunks.
Posted by: Roger Iyengar | June 29, 2016 at 12:53 PM
Thanks for the post! It has no doubt helped many.
I realize the post is 10+ years old. Did you know numpy now has a function that does this?
Posted by: List Splitter | June 10, 2019 at 05:44 PM
Nope, didn't know that about the numpy function. Thanks for commenting, it's amazing to see this post still reaching anyone after 10+ years!
Posted by: Gary Robinson | June 10, 2019 at 06:18 PM