Midnight Oil: surprises while porting to IPy 2

It's amazing how many assumptions your code can gather when you're not looking...

I'm now in the middle of porting my code from IronPython 1.1.1 to version 2.0 (beta 5). So far it's been more work than I expected and less work than it would have been without the excellent support from the IPy team and the community.

Since the hosting API has changed completely, I expected most of the work to be around getting the C# hosting code working again. I then expected some bugs in the new version since it's still beta, but not too many, since it's already beta number 5 (next is supposed to be RC1).

So that's basically what happened, except there was one more thing - I expected my code itself to be basically correct.

What surprised me was the last phase where my unit tests stopped failing because of hosting problems, bugs in IPy2 or incompatibilities between the versions, and started failing because of bugs in the original code that surfaced because the "environment" had changed in subtle ways.

Here's one example. Have a look at the following code:
print list(set([1,0,3,2,4]))

The output is consistent in both versions, but different:
IPy 1.1.1: [1, 0, 3, 2, 4] # same as the input sequence's order
IPy 2.0b5: [0, 1, 2, 3, 4] # ordered

now, sets and dictionaries don't promise anything about order, so I didn't assume anything. I'll rephrase - I didn't think I assumed anything...

One place that was affected by this is our control decision logic. That module looks at what the current state is, compares it to how it wants things to look and performs actions to bring it closer to the desired state. It contains many smaller "checks" that look at specific parts of the state and are in charge of affecting specific actions.
Many checks are independent of each other, so it doesn't matter which one runs first. I basically keep them in a dictionary and iterate over it to perform all checks. Sometimes checks or actions do depend on one another, so I need to add code to synchronize them (e.g. don't perform this action until that other action is finished)

well, you can guess what happened - the change to IPy 2.0 ran my checks in a different order and exposed some hidden "race conditions".

There are other examples, but post is already longer than I intended. I'll just say that I learned the difference between __builtin__ and __builtins__

My conclusions from this are:

If it's not tested it doesn't work. And when you change the environment under which your code runs you need to retest.

The unit tests paid off again, since they allowed me to find many problems in places I didn't expect.

I need to do more to flush out these hidden assumptions. One way is to add a randomized test that runs over longer periods and plays with some variables. I could easily randomize the order in which I go over the checks, inject random errors (the system is supposed to be self healing) or delays in some strategic points, etc. The not-so-easy part is verifying the system behaved correctly, and being able to reproduce problems once they surface.