The “Weasel” Saga — With Math (Part 2)
In Part 1, I introduced the “weasel” program, gave an overview of its role as a teaching tool, and gave some analysis of random search, the procedure that is contrasted with cumulative selection. This time, I’ll be going into the behavior of “weasel” and some math relevant to “weasel” itself.
The original description of “weasel” by Richard Dawkins in “The Blind Watchmaker” laid out how the program operated. The essential elements of a “weasel” program are as follows:
- Use a set of characters that includes the upper case alphabet and a space.
- Initialize a population of -character strings by copying with mutation a parent string formed by random assignments of characters from our character set.
- Identify the string closest to the target string in the population.
- If a string matches the target, terminate.
- Base a new generation population of size n upon copies of the closest matching string or strings, where each position has a chance of randomly mutating, based upon a set mutation rate.
- Go to step 3.
Note that I said “program” above and not “algorithm”. There is no guarantee that the program as described will halt, or terminate. While Dawkins did specify that and , he did not share precisely what values he chose to use for and . Because Dawkins did not see the need to discuss alternative methods for ensuring that the program would come to an end, one can reasonably infer that in his own runs of the program, he was using parameters of and that would result in termination of the program. He reported in “The Blind Watchmaker” that in three runs, the target string was matched in 43, 64, and 41 generations. Later, I’ll see about doing some forensics to see if a range of parameters can be estimated for Dawkins’ original runs. It might be the case that I can exclude whole ranges of parameters.
For now, it is more important that you come to a clear understanding of what does go on in a run of “weasel”. To that end, I’ve written a minimalist “weasel” program in Python. Python is an open source interpreted language that is available for a great many different platforms, so you should be able to install Python and run the following program pretty much on any common computer platform. If installing anything is just too much, there is an interactive “weasel” program of mine available here that you can run from any graphical browser with Javascript turned on.
The following code is just 34 lines, three of them output statements, and does nothing particularly fancy, so it should be understandable with small effort on the reader’s part. It’s just straight-up structured programming, so there isn’t even object-oriented abstraction going on here.
[python]
import random # Get Pseudo-Random Number Generator (PRNG)
random.SystemRandom() # Seed PRNG
n = 250 # Set population size
t = “METHINKS IT IS LIKE A WEASEL” # Set target
b = ” ABCDEFGHIJKLMNOPQRSTUVWXYZ” # Set base pool
u = 1.0 / len(b) # Set mutation rate
print “PopSize=%d, MutRate=%f, Bases=%s, Target=%s” % (n,u,b,t)
p = “” # Initialize parent randomly
for ii in range(len(t)): # Make parent the same length as target
p += random.choice(b) # Add a randomly selected base to parent
print ” Parent=%s” % (p)
done = False # Assume we haven’t matched the target; we’ll be wrong once in 1e40 times
g = 0 # Initialize the generation count
while (done == False): # Keep going until a match is found or forever
pop = [] # Previous population is cleared out
bmcnt = 0 # Initialize best match count
bc = “” # Initialize best candidate holder
for ii in range(n): # Over the population size, do this:
pop.append(“”) # Append a new blank candidate
mcnt = 0 # Initialize the match count
for jj in range(len(t)): # Over the candidate length, do this:
if (u >= random.random()): # Test for whether mutation happens here
pop[ii] = pop[ii][0:jj] + random.choice(b) # Add a mutant base
else:
pop[ii] = pop[ii][0:jj] + p[jj] # Copy base from parent
if (pop[ii][jj] == t[jj]): mcnt += 1 # If matched to target, increment
if (mcnt > bmcnt): # If candidate matches more bases than best so far
bmcnt = mcnt # Set the best match count to current match count
bc = pop[ii] # Set the best candidate to the current candidate
if (mcnt == len(t)): # Check to see whether all bases match the target
done = True # When all match up, we are done
g += 1 # Increment the generation count
print “Gen=%05d, %02d/%d matched, Best=%s, Total=%06d” % (g,bmcnt,len(t),bc,g*n)
p = bc # Parent for next gen. is best candidate from this gen.
[/python]
Here’s a sample output from a “weasel” run, generated by the Python code above:
PopSize=250, MutRate=0.037037, Bases= ABCDEFGHIJKLMNOPQRSTUVWXYZ, Target=METHINKS IT IS LIKE A WEASEL Parent=WAPXSPETTBEOUNRCUE AEDT BJPH Gen=00001, 01/28 matched, Best=WAPXSPETTBEOUSRCUE AEDT BJPH, Total=000250 Gen=00002, 02/28 matched, Best=WAPXSPETTBE USRCUE AEDT BJPH, Total=000500 Gen=00003, 03/28 matched, Best=WATXSPETTBE USRCGE AEDT BJYH, Total=000750 Gen=00004, 04/28 matched, Best=WATXSPKTTBE USRCGE AEDT BJYH, Total=001000 Gen=00005, 05/28 matched, Best=DATXSPKTTBE USRCGE AADT BUYH, Total=001250 Gen=00006, 06/28 matched, Best=DATXSPKTTBE USRCGE AALT BSYH, Total=001500 Gen=00007, 07/28 matched, Best=GATXSPKTTBE USRCGE AALTEBSYH, Total=001750 Gen=00008, 08/28 matched, Best=WATXSPKTTBE USRCGE AALWEBSYX, Total=002000 Gen=00009, 09/28 matched, Best=WATXSPKTTBE USRCGEEAALWEBSYX, Total=002250 Gen=00010, 10/28 matched, Best=WATXSPKT BE USRCGEEAALWEBSYX, Total=002500 Gen=00011, 13/28 matched, Best=WATHSPKT BE US CGEE ALWEBSYX, Total=002750 Gen=00012, 14/28 matched, Best=WATHSPKT BE IS CGEE ALWEBSNX, Total=003000 Gen=00013, 15/28 matched, Best=WATHSPKT BE IS CGEE A WEBSNX, Total=003250 Gen=00014, 16/28 matched, Best=WATHIPKT BE IS CDEE A WEWSNX, Total=003500 Gen=00015, 17/28 matched, Best=WATHIOKT BT IS CDEE A WEWSVW, Total=003750 Gen=00016, 18/28 matched, Best=WATHIOKT BT IS ZDEE A WEWSVL, Total=004000 Gen=00017, 19/28 matched, Best=WATHIOKT BT IS LDEE A WEWSVL, Total=004250 Gen=00018, 20/28 matched, Best=WATHIOKT BT IS LIEE A WEWSVL, Total=004500 Gen=00019, 20/28 matched, Best=WATHIOKT BT IS LIEE A WEWSVL, Total=004750 Gen=00020, 20/28 matched, Best=WATHIOKT BT IS LIEE A WEWSVL, Total=005000 Gen=00021, 21/28 matched, Best=WATHINKT BT IS LIEE A WEWSVL, Total=005250 Gen=00022, 21/28 matched, Best=WATHINKT BT IS LIEE A WEWSVL, Total=005500 Gen=00023, 22/28 matched, Best=WATHINKT BT IS LIKE A WEWSVL, Total=005750 Gen=00024, 23/28 matched, Best=WETHINKT BT IS LIKE A WEWSVL, Total=006000 Gen=00025, 24/28 matched, Best=WETHINKT IT IS LIKE A WEWSVL, Total=006250 Gen=00026, 24/28 matched, Best=WETHINKT IT IS LIKE A WEWSVL, Total=006500 Gen=00027, 25/28 matched, Best=WETHINKS IT IS LIKE A WEWSVL, Total=006750 Gen=00028, 25/28 matched, Best=WETHINKS IT IS LIKE A WEWSVL, Total=007000 Gen=00029, 25/28 matched, Best=WETHINKS IT IS LIKE A WEWSVL, Total=007250 Gen=00030, 25/28 matched, Best=WETHINKS IT IS LIKE A WEWSVL, Total=007500 Gen=00031, 25/28 matched, Best=JETHINKS IT IS LIKE A WEWSVL, Total=007750 Gen=00032, 25/28 matched, Best=JETHINKS IT IS LIKE A WEWSVL, Total=008000 Gen=00033, 25/28 matched, Best=JETHINKS IT IS LIKE A WEWSVL, Total=008250 Gen=00034, 26/28 matched, Best=METHINKS IT IS LIKE A WEWSDL, Total=008500 Gen=00035, 26/28 matched, Best=METHINKS IT IS LIKE A WEWSDL, Total=008750 Gen=00036, 26/28 matched, Best=METHINKS IT IS LIKE A WEHSDL, Total=009000 Gen=00037, 26/28 matched, Best=METHINKS IT IS LIKE A WEHSDL, Total=009250 Gen=00038, 26/28 matched, Best=METHINKS IT IS LIKE A WEHSDL, Total=009500 Gen=00039, 27/28 matched, Best=METHINKS IT IS LIKE A WEHSEL, Total=009750 Gen=00040, 27/28 matched, Best=METHINKS IT IS LIKE A WEHSEL, Total=010000 Gen=00041, 28/28 matched, Best=METHINKS IT IS LIKE A WEASEL, Total=010250
The “Total” reported is the total number of candidate strings evaluated in the generations leading up to some candidate string matching the target at all bases. The thing to note is that this didn’t take the stupendous numbers of “tries” that we would expect for the random search case; it shows a relative improvement over random search of over thirty-six orders of magnitude in efficiency. The program runs in just a couple of seconds on my computer; I did not have to wait for the lifetimes of a great many universes to go by. The question of interest is just how “weasel” manages to improve things over random search.
I’ll recap the parameters used by “weasel” for handy within-post reference. Much of what I will discuss is precisely how the parameters change the behavior of the “weasel” program.
The number of alternative forms that any position, or base, can take is . For the case where the bases are capital letters or a space, .
The length of the target string is , and in the case where the target is, “METHINKS IT IS LIKE A WEASEL”,
The per-base mutation rate is how often a base is changed to a random alternative during copying from a parent string to a daughter string, and is .
The number of candidate strings that make up the population at any one time is .
We will also be interested in the number of correct and incorrect bases seen in the best candidate from a generation, and these will be and , respectively.
Now we start looking at just how “weasel”, or cumulative selection, differs from random search or “single-step selection”, as Richard Dawkins termed it. Where, exactly, does “weasel”‘s potential for better efficiency come from?
The first striking difference lies in “weasel”‘s use of inheritance. With single-step selection, whatever properties any individual try might have that were somehow adaptive do not carry over to further tries. This is not so when inheritance is used. But how much difference does that make?
When a parent string is copied with mutation, there are correct bases already. We can determine the expected change in the value of for the daughter string following copying the parent with mutation as
expected change in number of correct bases after copy with mutation
It is good to have a mathematical expression, but what is that telling us?
Here’s a graph showing the results of the equation for all mutation rates and all values :
If our mutation rate is zero, that becomes
That corresponds to the bottom edge of the graph, where you can see that a mutation rate of zero produces an expectation of zero bases changing to correct throughout the range.
The random search of single-step selection is equivalent to setting . One need not do anything special to a “weasel” program to also do single-step selection; just change the mutation rate so that . That corresponds to the top edge of the graph, and shows that the expected change in correct bases is almost as many bases becoming incorrect as were correct in the parent when one is randomly changing all the bases.
So on the one hand, we already know that random search () is ineffective: that much change destroys any chance that inheritance can make a difference. And on the other, perfect copying () is both ineffective and boring: we persist with evaluating exactly the same string over and over.
For the case of “weasel”, the equation tells us that it is easy and expected to obtain correct bases when is close to zero, depending on what the mutation rate is. However, that quickly changes as increases. For any particular candidate string, the expectation for in the newly generated candidate as the of the parent approaches only remains near zero, and thus preserves the correct bases from the parent, when the mutation rate is very small. Because close to means that there only a small number of incorrect bases remaining, the value of the positive term in the equation decreases, while the value of the negative term in the equation increases. A higher mutation rate, , makes the equation lean heavily in the negative direction, because it is a lot more likely that one or more of the many correct bases will change to an incorrect base than it is that one of the few incorrect bases will be changed to a correct base. So we see that the most interesting results come about when 0.0 < \mu \ll 1.0[/latex], that is, a small non-zero mutation rate far away from the behavior one gets with complete randomness. Unsurprisingly enough, this is also the situation that we see in biology, where mutation rates are small, non-zero values. While Richard Dawkins did not discuss mutation rate specifically with respect to his "weasel" program, he did discuss it in the context of his "biomorphs" program, where a mutation rate high enough to cause one mutation in each daughter was said to be "very high" and "unbiological". With a small mutation rate, most of the time a daughter candidate or organism will be pretty much like the parent, with a somewhat higher chance that something will go wrong with what already works than that something will go right in what isn't working currently. Increasing the mutation rate substantially will, at least from the standpoint of an individual daughter candidate, quickly add to the risk of lots more going wrong than going right. And that's the bottom line from the graph. This is all pretty simple and straightforward so far. The next step, though, causes a lot of people trouble. And that step is to go beyond thinking in terms of the changes that occur between individuals in successive generations, and to start thinking in terms of the dynamics of populations. It is a common theme in evolutionary science that the primary shift in view between the biology that pre-dated general acceptance of evolution and after that acceptance was the adoption of the view that populations were the primary objects of biological interest. We'll take that next step in the next part.