Saturday, May 31, 2008

Combinations

Ok. Now that we got started, here's a short post about one of my favorite snippets of python.
Haven't seen this anywhere. Let me know if you have...

A lot of times you've got several groups and want to go over all combinations of items (for example, for testing all combinations of a set of parameters).

Doing this for two groups is a one liner, thanks to list comprehensions:
>>> A = [True,False]
>>> B = [1,2,3]
>>> [(a,b) for a in A for b in B]
[(True, 1), (True, 2), (True, 3), (False, 1),
(False, 2), (False, 3)]
This is a great start, but I want something that works for any number of groups. So we just need to apply this one liner as a building block iteratively until we're left with a single group, which is exactly what the built in function reduce does! well, almost...
>>> def combinations(*seqs):
... def comb2(A,B):
... return [(a,b) for a in A for b in B]
... return reduce(comb2,seqs)
...
>>> A = [True,False]
>>> B = [1,2,3]
>>> C = ['yes','no']
>>> combinations(A,B,C)
[((True, 1), 'yes'), ((True, 1), 'no'), ...]
The problem is that the result is nested. Instead of getting (True,1,'yes'), we get ((True,1), 'yes').
The solution is to change the building block so it treats the two arguments differently. The second argument will still be a regular sequence of items. The first argument will now be a sequence of combination groups built so far.
Our building block now becomes:
//for each group and item, append the item to the group
def comb2(A,b):
return [a+[i] for a in A for i in b]
But now we need to handle start conditions, since we don't have any "group of groups" when we start. And this is the fun part - after a few iterations with the interactive shell, I ended up with this, which I think is quite cute:
>>> def combinations(*seqs):
... def comb2(A,b):
... return [a+[i] for a in A for i in b]
... // prepend a sequence with a single empty group
... return reduce(comb2,seqs,[[]])
...
>>> combinations(A,B,C)
[[True, 1, 'yes'], [True, 1, 'no'], ...]
And that's that.

At the time I was coding mainly in C++. Doing this in C++ is going to be much more work and end up being much less elegant. But what really blew me away at the time was this:
Suppose I wanted to handle cases where the number of combinations was very big. In that case generating them all up front could take up too much memory, and I'd just want to generate them on the fly as I iterate over them. You can do this in python by replacing the comb2 list comprehension with a generator expression:
def comb2(A,b):
return (a+[i] for a in A for i in b)
Now I can happily iterate over the returned generator and python manages the stack of nested generators, each with it's own state in the iteration.
Try that in C++! (you can do it, but it's going to hurt, which in reality means that if it's not very important for you, you won't do it).

I remember hearing GvR at PyCon 2006, when he talked about the evolution of python. One of the things he said was that it was a pretty lousy functional programming language. I haven't learned any good functional language yet (want to try haskell or F# sometime), and I trust he knows what he's talking about, but still, this beats the hell out of C++, Java or C# (although C# 3.0 now has some new features that could help. Would be interesting to see how easy it is to do this now).

On the same note, take a look at this, which is also neat, and comes in handy quite often.

and btw, if someone has a tip on how to format code sections for the blog, let me know :-(

Getting Started

About a year ago a coworker asked me whether I wanted to write a blog. I took a full 2 seconds to make sure I wasn't just reacting and just said - "no".

It was that clear. I am a private person by nature. Why would I want to talk to a bunch of people I don't even know? It made no sense. I mean, I could understand why some other people would want to, and even enjoyed reading some blog posts (most of which I got from friends by email), but me blogging was an idea that made no sense to me.

So what's this, then?
Well, to be honest, I'm not 100% sure myself :-)

One thing that happened is that for the last year I finally got to work in Python (IronPython to be exact). Before that I was mainly writing in C++ and in the last few years also dabbling with python for side projects. I think the effect of the community is much more important in the python ecosystem than it is in C++. I've got some thoughts on why this might be so, but that's not important for this post. Maybe it's not even true, but the fact is that I've started reading some blogs, subscribing to the ironpython mailing list, and in general I got a lot of help and ideas from other people's work.
And when I had some cool stuff I'd written, or some insight, I sometimes wanted to be able to share it.

Luckily I also started working with a friend that blogs. I really learned a lot from watching and talking to him. In this respect, I learned that not every blog has to have a capital B. I could pay something forward by sharing the little things I do. And sometimes someone will google them and it will help him. Cool!

He also sent me this post which I really liked. Well, he forgot to mention "I have no time", but apart from that it's spot on. Time still is a real limitation. Working at a startup and having two (cute as hell) little kids at home doesn't leave a lot of spare time. So blogging will have to compete for the time slot before I go to sleep (hence the blog's name). But that's easy - if I have time and something to say, I'll write. If not, I don't have to.

The last reason to write is because I don't feel comfortable with it. Being a perfectionist I'm worried that I might say something silly, or trivial, or maybe just that no one will care.
All these things will probably happen, but as a father I keep trying to teach my kids that making mistakes is ok. If you're not making mistakes you're obviously not trying things that are hard for you, so you're not learning as quickly as you could. Damn those Genes! the least I can do is set an example by getting better at making mistakes :-)

So now we're done with static friction and this meta-post, I'll try and write a short programming one soon.