Saturday, May 31, 2008

Combinations

Ok. Now that we got started, here's a short post about one of my favorite snippets of python.
Haven't seen this anywhere. Let me know if you have...

A lot of times you've got several groups and want to go over all combinations of items (for example, for testing all combinations of a set of parameters).

Doing this for two groups is a one liner, thanks to list comprehensions:
>>> A = [True,False]
>>> B = [1,2,3]
>>> [(a,b) for a in A for b in B]
[(True, 1), (True, 2), (True, 3), (False, 1),
(False, 2), (False, 3)]
This is a great start, but I want something that works for any number of groups. So we just need to apply this one liner as a building block iteratively until we're left with a single group, which is exactly what the built in function reduce does! well, almost...
>>> def combinations(*seqs):
... def comb2(A,B):
... return [(a,b) for a in A for b in B]
... return reduce(comb2,seqs)
...
>>> A = [True,False]
>>> B = [1,2,3]
>>> C = ['yes','no']
>>> combinations(A,B,C)
[((True, 1), 'yes'), ((True, 1), 'no'), ...]
The problem is that the result is nested. Instead of getting (True,1,'yes'), we get ((True,1), 'yes').
The solution is to change the building block so it treats the two arguments differently. The second argument will still be a regular sequence of items. The first argument will now be a sequence of combination groups built so far.
Our building block now becomes:
//for each group and item, append the item to the group
def comb2(A,b):
return [a+[i] for a in A for i in b]
But now we need to handle start conditions, since we don't have any "group of groups" when we start. And this is the fun part - after a few iterations with the interactive shell, I ended up with this, which I think is quite cute:
>>> def combinations(*seqs):
... def comb2(A,b):
... return [a+[i] for a in A for i in b]
... // prepend a sequence with a single empty group
... return reduce(comb2,seqs,[[]])
...
>>> combinations(A,B,C)
[[True, 1, 'yes'], [True, 1, 'no'], ...]
And that's that.

At the time I was coding mainly in C++. Doing this in C++ is going to be much more work and end up being much less elegant. But what really blew me away at the time was this:
Suppose I wanted to handle cases where the number of combinations was very big. In that case generating them all up front could take up too much memory, and I'd just want to generate them on the fly as I iterate over them. You can do this in python by replacing the comb2 list comprehension with a generator expression:
def comb2(A,b):
return (a+[i] for a in A for i in b)
Now I can happily iterate over the returned generator and python manages the stack of nested generators, each with it's own state in the iteration.
Try that in C++! (you can do it, but it's going to hurt, which in reality means that if it's not very important for you, you won't do it).

I remember hearing GvR at PyCon 2006, when he talked about the evolution of python. One of the things he said was that it was a pretty lousy functional programming language. I haven't learned any good functional language yet (want to try haskell or F# sometime), and I trust he knows what he's talking about, but still, this beats the hell out of C++, Java or C# (although C# 3.0 now has some new features that could help. Would be interesting to see how easy it is to do this now).

On the same note, take a look at this, which is also neat, and comes in handy quite often.

and btw, if someone has a tip on how to format code sections for the blog, let me know :-(

8 comments:

Fuzzyman said...

Hey Ronnie, great to see you're blogging. Nice entry. I always find that nesting list comprehensions mangle my mind - at least I can come back to this as a reference.

You *may* find that some of the stuff in the itertools module can help make your examples even more elegant.

Michael Foord

Ronnie Maor said...

Hey Michael - thanks for the encouragement and the tip!

I actually wasn't expecting anyone that's not a coworker or family member to be reading this yet :-)

I'll go over itertools. At the worst it will be a good opportunity to refresh myself with what's in there - I read it once, but never used it.

Ronnie

Michael Brand said...

Hi, Ronnie,
Yes, it's a cute way to use list comprehensions. I used a similar solution the other week, when wanting to enumerate over all SUBSETS of a list.

BTW, regarding the "sorting for humans" post you link to, you should see my take on this in
http://brand.site.co.il/riddles/200803q.html

Cheers,
M.

Ronnie Maor said...

I still like this code, but for practical purposes you should now use itertools.product which is part of the standard library. no point in writing your own anymore. This was added in Python 2.6

They also threw in permutations and combinations (their combinations is named after the mathematical construct from combinatorics - all subsets of size k)

Dov Grobgeld said...

Hi Ronnie,

Just found your blog. A shame I didn't find it before, so that I got more feeling of your problems... How well.

I also toyed around with a similar problem in my beginning Python days, but I did it through generators, which has the advantage that it doesn't fill up the memory with the complete list. Here is my permutation solution:

def permute(a):
__if len(a)==1:
____yield a
__else:
____for p in permute(a[1:]):
______for i in xrange(len(a)):
________yield p[0:i]+[a[0]]+p[i:]
__
for p in permute([1,2,3]):
__print p

(You should definitely look for a blogging platform that allows embedding code... Meanwhile do s/^_/ /g.)

But you are absolutely correct that today we should be using itertools and not crafting solutions like this on our own.

Ronnie Maor said...

Dov - the final implementation above already uses generators (it changes the list comprehension to generator expressions).

Dov Grobgeld said...

Hi Ronnie,

Sorry I missed that. I skimmed through your article too quickly, and I wasn't familiar with the (i for i in whatever) generator construct. Neat!

Regarding syntax highlightening I am not familiar with bogger at all, but I found the following link on the web that might help:

http://yacoding.blogspot.com/2008/05/how-to-add-syntax-highlight-to-blogger.html

Dov Grobgeld said...

s/bogger/blogger/