Evolution in action :: Giles' blog

Evolution in action

Posted on 3 October 2008 in Programming, Resolver One

At Resolver Systems, we've recently split into two teams; about two thirds of us work on the core Resolver One platform that is our main product (this group is inventively called the Platform team), and the other third build new spreadsheet/Python programs, using Resolver One, for specific clients' custom needs (the Apps team). This is great, because we are now not only building business solutions for people, as well as a generic platform (which means more money for us), but we are also dogfooding -- so we can be sure we're adding features and fixing bugs which really do help our users.

The problem with doing this is that everyone in the company has different preferences about how much time they want to spend in each team. Some people really like writing programs to fix business problems, and others are keener on abstract algorithms. We could have just said "stuff it" and swapped people around so that everyone was doing a 1:2 rotation, but it was much more fun to solve the problem in software :-) My aim was to somehow generate, for each of the next twelve iterations (the two-week development cycles we work in), a list of people who would form that iteration's Apps team and the people who'd form the iteration's Platform team.

So I put together a spreadsheet: an evolutionary algorithm for team scheduling. ~~If you're using Windows, you can download it and take a look (you can get a free version of Resolver One to run it on if you haven't already).~~ You enter your team's preferences -- in terms of the percentage of time they'd like to spend on the Apps team -- in the "Preferences" sheet (which also shows some results from the last run), and then some numbers to guide the evolution (number of generations, population size, etc) in the "Parameters" sheet, and then get the best schedule it can generate in "Rota" sheet.

To be honest, it's using the spreadsheet more as a display mechanism than anything else. But it's a fun bit of code, although I'm sure that anyone who actually works on evolutionary algorithms would find it trivially simple (and probably broken :-). The function GenerateSchedule in the pre-formulae user code (for Resolver newbies: in the box below the grid - the section with a green background) is the interesting bit -- everything below there is just presentation logic. Here's how it works:

We generate a random set of schedules, each of which is created by picking three random people from our team and putting them into the Apps team, leaving the remainder in the Platform team.
We then run through as many generations as the user specified. In each generation:
- Every schedule in our population is assigned a weight. This is generated by a function called WeightSchedule, which is what people who study evolutionary algorithms would call a fitness function. Basically, the higher the number it returns, the less good the schedule is.
- We sort the schedules by their weights, and then we kill off the worst of them.
- We then create a new generation comprising the survivors from the cull, and a set of new schedules that are "parented" by those survivors, using the function MutateSchedule. We apply a slight bias so that the fitter schedules have a better chance of reproducing than the others.
- And on we go for another generation.

WeightSchedule was the most difficult function in the code to get right. (This is in keeping with what I've heard about evolutionary algorithms in general.) Its job is to return a number that is high for bad schedules, and low for good ones. I found I got the best results by returning an arbitrary "high" value for any schedule that failed to meet certain must-have criteria, and then working out, for each person, the difference between the amount of time they wanted to spend in a given team and the actual amount of time they spent there in the current schedule. I then raised those per-person errors to the power of four (to make it clear that three people 5% out is better than one person 15% out) and then summed the results. This seemed to work just fine.

For MutateSchedule I had a bit of fun. It's purpose is to generate a new child from a single parent schedule (I chose to use an asexual reproduction model because, in my experience, sexual reproduction and spreadsheets rarely mix well). My initial implementation just switched one pair of people around for every iteration -- that is, one person who was originally on the Apps team was now on the Platform team, and vice versa. I then made the number of such swaps a user-settable parameter, so that people could increase the extent of mutations. This sounded like a good idea, but didn't help much -- indeed, increasing the number of swaps invariably made the system less likely to produce a good schedule. My "background radiation" level was clearly too high. So I then changed things so that you could specify a fractional number of swaps. A swap level of 0.1 meant that each iteration has a one in ten chance of a having someone swapped around. This seemed to work well -- indeed, 0.1 seemed pretty close to the sweet spot for the number of swaps. I suppose this makes sense -- you can imagine that a schedule with twelve iterations in it that is almost perfect is more likely to be improved if you switch around two people in just one of its iterations than if you make a swap for every iteration.

So that's it -- a simple evolutionary algorithm in a spreadsheet. I've deliberately not over-tidied the code in the version you can download above -- I've just sanitised the data so that no-one on the team's privacy is harmed, and then added a few comments for the more impenetrable bits of code. But it should all be pretty easy to understand, and I'd love to hear from anyone with comments (especially if they know more about this kind of thing than me...)