Genetic Programming with Wolfram Engine¶
I've been using various forms of evolutionary programming or genetic algorithms ever since my post-doc days way back in the late '90s. In the Wolfram Engine I've mostly been writing my own every time I want to optimize something, but I thought I'd put the DifferentialEvolution method for NMinimize through its paces to see if it could become my general-purpose tool for this. I'm pretty pleased with the results!
Differential Evolution works a little differently from how I've usually built genetic algorithms. It maintains a population of candidates, all of which are evaluated each generation. To make the next generation, it creates new candidates through randomly weighted linear combinations of four current candidates. If a new candidate is better than the first of those four, it replaces it; otherwise the original is kept.
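Since NMinimize hides those mechanics, here's a minimal Python sketch of one generation of the scheme described above (the classic DE/rand/1 variant, simplified to skip crossover; the function names and the 0.8 weight are my own choices, not anything NMinimize exposes):

```python
import random

def de_step(population, fitness, f_weight=0.8):
    """One generation of differential evolution (minimization).

    For each target vector, pick three other distinct members a, b, c,
    build the trial vector a + F*(b - c), and keep whichever of
    target/trial has the lower fitness.
    """
    new_pop = []
    for i, target in enumerate(population):
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = random.sample(others, 3)
        trial = [ai + f_weight * (bi - ci) for ai, bi, ci in zip(a, b, c)]
        new_pop.append(trial if fitness(trial) < fitness(target) else target)
    return new_pop
```

Because each slot only ever swaps in an improvement, iterating de_step drives the population's best fitness monotonically downward.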
Here's a simple use case for NMinimize:
NMinimize[Sin[0.23 x],x]
And here's how to make it optimize on just the integers:
NMinimize[{Sin[0.23 x],x>0}, Element[x,Integers]]
Here's some fancier code to capture all the generations, using Reap and Sow along with StepMonitor, and accessing the very useful Optimization`NMinimizeDump` variables:
{fit, intermediates} =
Reap[NMinimize[{Sin[0.23 x], x>0}, Element[x,Integers],
MaxIterations -> 40,
Method -> {"DifferentialEvolution"},
StepMonitor :>
Sow[{Optimization`NMinimizeDump`vecs,
Optimization`NMinimizeDump`vals}]]];
fit
The intermediates variable has all the tries for all the generations.
Here I avoid a couple of nagging warnings by defining the function to work only on integers (the first line below). What's interesting is that if you check the intermediates variable you'll see non-integer tries, but after checking carefully, I'm now sure that NMinimize just applies Round[] to any non-integer candidates when you require integer solutions.
f[x_Integer]:=Sin[0.23 x]
{fit, intermediates} =
Reap[NMinimize[{f[x], x>0}, Element[x,Integers],
MaxIterations -> 40,
Method -> {"DifferentialEvolution"},
StepMonitor :>
Sow[{Optimization`NMinimizeDump`vecs,
Optimization`NMinimizeDump`vals}]]]
This shows a plot of the solutions (black dots) for a bumpier test function, with time running through the generations. Remember that we're solving for integers here; that's why it doesn't find the bottom-most trough.
baseplot=Plot[Sin[4.342 x]-Exp[-(x-13.3)^2],{x,0,20}];
sets=Flatten/@Round[intermediates[[1,All,1]]]; (* candidate x values in each generation *)
frames=Table[Show[baseplot,Graphics[Point[{#,0}&/@set]]],{set,sets}];
ListAnimate[frames]
Optimizing student group choices¶
One of the problems I've used evolutionary programming to tackle in the past is letting students rank the projects they want to work on and then assigning students to project groups to optimize the "happiness" of the whole class. Obviously, if everyone can get their first choice, you're done, but often that leads to uneven group sizes or other logistical challenges. If you can quantify those challenges, you can use an approach like this to solve the problem.
Here's a random sampling of 12 fictitious students. Each row represents a student who has ranked four projects 1-4, where 1 means it's their top choice:
rankings=Table[RandomSample[Range[4]],{12}];
rankings//TableForm
Here I define a fitness function to test a particular assignment of students. That assignment looks like an array of 12 integers from 1 to 4; the ith number in the array represents the group that the ith student is assigned to. We check how each student ranked their assigned group and use that to (partially) determine the fitness of the whole assignment. For this fitness function I do three things:
- Find how happy each student is
- Check to make sure the groups are all of equal size (that's the {3,3,3,3} part on line 3)
- For fun, give a penalty of 10 points if the first two students aren't in the same group (the firstTwo part)

Then I just add all of those up. The minimum possible score is 12 (all 12 students in their first choice, with 4 evenly sized groups and the first two students in the same group). Looking above at the rankings, you'll note that we can't put the first two students in the same group without making one of them mad.
Clear[fitness]
fitness[try : {__?NumberQ}] := (
  placement = Total[Table[rankings[[i, try[[i]]]], {i, Length[try]}]];
  distribution = Total[(Sort[Count[try, #] & /@ Range[4]] - {3, 3, 3, 3})^2];
  firstTwo = Piecewise[{{10, try[[1]] != try[[2]]}}, 0];
  placement + distribution + firstTwo);
fitness[tmp=RandomInteger[{1,4},{12}]]
51
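For readers who don't speak Wolfram, the same three-term fitness function can be sketched in Python (a hedged translation of the block above — the 0-based indexing, parameter names, and group_sizes argument are my own):

```python
def fitness(assignment, rankings, group_sizes=(3, 3, 3, 3)):
    """Score one assignment of 12 students to 4 groups (lower is better).

    assignment[i] is the group (0-3, 0-based here) for student i;
    rankings[i][g] is student i's rank (1 = top choice) for group g.
    """
    # 1. total "unhappiness": sum of each student's rank for their group
    placement = sum(rankings[i][g] for i, g in enumerate(assignment))
    # 2. squared deviation of the sorted group sizes from the targets
    counts = sorted(assignment.count(g) for g in range(4))
    distribution = sum((c - s) ** 2
                       for c, s in zip(counts, sorted(group_sizes)))
    # 3. flat 10-point penalty if the first two students are split up
    first_two = 10 if assignment[0] != assignment[1] else 0
    return placement + distribution + first_two
```

A perfect assignment (everyone in their first choice, even groups, students 0 and 1 together) scores exactly 12, matching the minimum discussed above.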
Now we use NMinimize to find the assignment of students that minimizes the fitness function, i.e., maximizes the class happiness:
groupsx=Array[groupings,12];
{fit, intermediates} =
Reap[NMinimize[{fitness[groupsx], 1<=#<=4&/@groupsx}, Element[groupsx,Integers],
MaxIterations -> 100,
Method -> {"DifferentialEvolution",
"InitialPoints" -> RandomInteger[{1,4},{50,12}]},
StepMonitor :>
Sow[{Optimization`NMinimizeDump`vecs,
Optimization`NMinimizeDump`vals}]]];
fit
NMinimize::cvmit: Failed to converge to the requested accuracy or precision within 100 iterations.
Here's a plot showing two things. In orange you see the best (lowest) fitness value as a function of generation. In blue you see the range of fitness values (max minus min) for each generation. That's a decent measure of how much diversity still exists in that generation.
ListLinePlot[{Max[#]-Min[#]&/@intermediates[[1,All,2]], Min/@intermediates[[1,All,2]]},PlotRange->All]
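Those two series are easy to compute in any language. Here's a small Python helper (my own naming, not part of the Wolfram code) that derives the same (range, best) pairs from per-generation lists of fitness values:

```python
def generation_stats(vals_per_generation):
    """Condense each generation's fitness values into two numbers:
    max - min as a rough diversity measure, and min as the best score."""
    return [(max(v) - min(v), min(v)) for v in vals_per_generation]
```

When the range collapses toward zero, the population has converged and further generations mostly stop helping.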
Here I write a function that'll help us see the progress. Each student gets a vertical bar and in that bar colors represent how often (in that generation) they're in their 1st, 2nd, 3rd, or 4th choice. After that I show an animation of the 100 generations.
displayGeneration[generation_]:=(
happiness=Table[rankings[[i,#[[i]]]], {i,12}]&/@generation;
t=Table[Count[#,i],{i,4}]&/@Transpose[happiness];
BarChart[t,ChartLayout->"Stacked"])
frames=displayGeneration/@Round[intermediates[[1,All,1]]];
ListAnimate[frames]
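The tallying inside displayGeneration translates directly. Here's a Python sketch (0-based group indices and my own function name, unlike the 1-based Wolfram code) of the same per-student choice histogram:

```python
def choice_histogram(generation, rankings):
    """For each student, count how often across this generation's
    population they landed in their 1st/2nd/3rd/4th choice.

    generation: list of candidate assignments; cand[i] is the group
    (0-based here) given to student i.
    rankings: rankings[i][g] is student i's rank (1 = top) for group g.
    """
    happiness = [[rankings[i][cand[i]] for i in range(len(cand))]
                 for cand in generation]
    per_student = list(zip(*happiness))
    # one row per student: counts of ranks 1-4 across the population
    return [[col.count(r) for r in (1, 2, 3, 4)] for col in per_student]
```

Each row of the result is one stacked bar in the animation: as the population converges, each student's bar collapses toward a single rank.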
Note how the second student ends up in their 3rd choice. Also note that the 10th student ends up in their 2nd choice. Both of those students are "taking one for the team" to optimize the happiness of the whole class.
Your thoughts?¶
I'd love to hear your thoughts. Here are some starters for you:
- This is cool! Have you thought of . . .?
- This is dumb! Haven't you ever heard of . . .?
- If the intermediates contain non-integers, does that mean the linear combination of Differential Evolution failed? (I don't think so; I think the Round[] function is a fine fix)
- The Round[] function is NOT a fine fix!
- I carefully tracked the animation with the plot above it and I think you ran this with two different random seeds! Loser.
- I still don't think you should switch from saying Mathematica to Wolfram Engine
- What if a student wanted to rank two groups the same?
- What if you wanted to guarantee a student a particular group (maybe because it was their idea)?
- I would never have found those Optimization`NMinimizeDump` variables! (thank you StackExchange)