Tuesday, April 5, 2016

Announcing SLiM 2.0: flexible, fast, interactive forward genetic simulations

After finishing my Ph.D. with Andrew at McGill, I more or less vanished from view.  For almost two years, I have toiled in the darkness, chained to my desk, unseen by any in the land of the living.  But now – now, I emerge into the light!  And today, I am announcing the fruit of my labors: SLiM 2.0!


OK, possibly this comparison would be taking things a little bit too far.  :->

SLiM is a forward genetic simulation package.  The old SLiM, which ended at version 1.8, was written by Philipp Messer of Cornell University.  Then Philipp hired me into his lab, and – cue quick-cut sequence of toil, chains, and darkness – I have rewritten it and greatly extended it.  We're very excited about it, because it brings a new level of flexibility, power, and usability to the realm of forward simulations.

The first thing to note about SLiM 2.0 is that it has an interactive user interface called SLiMgui, for Mac OS X.  You can run SLiM at the command line, too, so it can be used on Linux, on computing clusters, and so forth; but when you're developing a new model, the easiest way to do it is in SLiMgui, which looks like this:

[Screenshot: SLiMgui running a six-subpopulation model]

This shows a six-subpopulation model midway through a run.  The six subpopulations have been set up as an "island model", connected by migration to form a chain with gene flow predominantly in one direction (like sites along a river, with gene flow tending to go downstream, for example).  SLiMgui is showing the population structure in the subwindow at lower right, where arrows of different widths indicate the strength of migration between the subpopulations.  SLiM is an individual-based simulation package; every individual in the six subpopulations is shown in the six arrays of colored blocks at top.  The colors indicate fitness; in this model, a mutation has swept through the population, but there is spatial variation in selection that makes that mutation beneficial in some subpopulations (green) but deleterious in others (red).  Running a simulation like this in SLiMgui is completely interactive; we can step forward a generation at a time, inspect individual organisms or even individual mutations, execute commands that modify the simulation while it's running, and pull up graphs showing things like the evolutionary trajectories of mutations.

The second thing to note about SLiM 2.0 is that it's scriptable.  The old SLiM ran a model that was specified in a structured input file that listed things like the genetic details and the population structure.  Anything not explicitly supported by the input file was impossible to do (without going in and editing the C++ code in which it was written, anyway).  In SLiM 2.0, by contrast, you write a simulation script in a language called Eidos (which is very similar to R, so if you know R the learning curve is gentle), and SLiM runs the script.  This means that SLiM 2.0 is now almost completely open-ended.  We've written a manual for SLiM that consists, in large part, of "recipes" for different sorts of simulations, to show what's possible.  We've got recipes for such diverse topics as:
  • complex genetic structure – different types of mutations, different genomic regions, variation in the recombination rate along the chromosome, etc.
  • complex population structure – subpopulations, migration, exponential growth, cyclical changes, context-dependent changes
  • complex selection – spatial and temporal variation in selection, frequency-dependent selection, epistasis, polygenic selection, kin selection
  • complex mating systems – modeling hermaphrodites versus organisms with separate sexes, selfing, cloning, sex ratios, assortative mating, sequential mate search, even gametophytic self-incompatibility
  • complex temporal structure – modeling the introduction of mutations at particular times, detecting the establishment or completion of selective sweeps, making particular events trigger in particular generations or upon particular conditions, and producing customizable model output at particular times.
Since you can do anything you want in your Eidos script, the sky is really the limit; we even have a recipe for a model of social learning of cultural traits that influence fitness, and a model of a "gene drive" powered by the CRISPR/Cas9 genetic modification mechanism.

The third thing to note is that because all this flexibility is controlled through a simple scripting language, all of the recipes mentioned above are less than a page of code.  SLiM models are generally very short and easy to understand.  For example, to set up a six-subpopulation island model like the one shown in the screenshot above, we could do this:

initialize() {
   initializeMutationRate(1e-7);
   initializeMutationType("m1", 0.5, "f", 0.0);
   initializeMutationType("m2", 0.5, "f", -0.1);
   initializeGenomicElementType("g1", m1, 1.0);
   initializeGenomicElement(g1, 0, 99999);
   initializeRecombinationRate(1e-8);
}
1 {
   for (i in 0:5)
      sim.addSubpop(i, 500);
   for (i in 1:5)
      sim.subpopulations[i].setMigrationRates(i-1, 0.001);
   for (i in 0:4)
      sim.subpopulations[i].setMigrationRates(i+1, 0.1);
}

The first code block sets up basic simulation parameters like the mutation rate and the genomic structure.  The second block uses for loops to create the six subpopulations and then connect them through migration.  This model simulates only neutral mutations.  To then add a new mutation that is under selection, we can add this code:

100 late() {
   mut = p0.genomes[0].addNewDrawnMutation(m2, 10000);
   p0.genomes[1:49].addMutations(mut);
}

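That creates a new mutation in generation 100, and then adds 49 more copies of it to other genomes in p0 (each diploid individual carries two genomes), perhaps simulating gene flow from an external source that introduced the new allele into the system.  We can track the fate of our new allele and print a message if it fixes or is lost.  When a mutation fixes, SLiM removes it from the population and records it as a substitution, so a count of zero means the allele has either fixed or been lost, and checking the substitutions tells us which: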

100:10000 late() {
   if (sim.countOfMutationsOfType(m2) == 0)
   {
      fixed = any(sim.substitutions.mutationType == m2);
      cat(ifelse(fixed, "FIXED\n", "LOST\n"));
      sim.simulationFinished();
   }
}

Finally, we can set up spatial variation in selection for the introduced mutation, making it beneficial toward the lower (downstream) end of our subpopulation chain, and deleterious toward the higher (upstream) end:

fitness(m2) {
   return 1.5 - subpop.id * 0.15;
}

This all pretty much follows a recipe in the SLiM manual's "cookbook" (in section 12.3).  That recipe then goes on to add a CRISPR/Cas9 "gene drive" that drives the introduced mutation through the whole system despite the fact that it is strongly deleterious in the upstream subpopulations.  That scenario is actually what is pictured in the screenshot above; that is why the introduced mutation is fixing even in the subpopulations where it is deleterious.  Without the "gene drive" – as modeled by the code given here – you get a nice simulation of migration-selection balance in an island model.
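
To put rough numbers on that migration-selection balance, here is a small sketch in plain Python (not Eidos, and not part of the SLiM manual's recipe) that tabulates what the snippets above imply: the immigration rates set by the two setMigrationRates() loops, the value returned by the fitness(m2) callback in each subpopulation, and the starting frequency of the introduced allele in p0.  The function names are made up for illustration.  The sketch assumes that setMigrationRates() sets the fraction of the target subpopulation's offspring drawn from the named source subpopulation, and that each of the 500 individuals carries two genomes.

```python
# Plain-Python sketch (not Eidos) of the numbers implied by the model above.
# All function names here are hypothetical helpers, not SLiM or Eidos APIs.

N_SUBPOPS = 6       # p0..p5, as created by sim.addSubpop(i, 500)
SUBPOP_SIZE = 500   # diploid individuals per subpopulation


def migration_matrix(n=N_SUBPOPS, from_lower=0.001, from_higher=0.1):
    """Return m where m[target][source] is the assumed fraction of the
    target subpopulation's offspring drawn from the source subpopulation."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(1, n):       # mirrors: for (i in 1:5) ... setMigrationRates(i-1, 0.001)
        m[i][i - 1] = from_lower
    for i in range(0, n - 1):   # mirrors: for (i in 0:4) ... setMigrationRates(i+1, 0.1)
        m[i][i + 1] = from_higher
    return m


def m2_fitness_effect(subpop_id):
    """Value returned by the fitness(m2) callback for a given subpop.id."""
    return 1.5 - subpop_id * 0.15


def initial_frequency(copies=50, subpop_size=SUBPOP_SIZE):
    """Starting frequency of the introduced allele within p0:
    50 copies (genomes 0 through 49) out of 2 * 500 genomes."""
    return copies / (2 * subpop_size)


if __name__ == "__main__":
    m = migration_matrix()
    for i in range(N_SUBPOPS):
        w = m2_fitness_effect(i)
        label = "beneficial" if w > 1.0 else "deleterious"
        print(f"p{i}: callback value {w:.2f} ({label}), "
              f"total immigration {sum(m[i]):.3f}")
    print(f"initial m2 frequency in p0: {initial_frequency():.3f}")
```

Under those assumptions the callback value is above 1 in p0 through p3 and below 1 in p4 and p5, and migration from higher-numbered into lower-numbered subpopulations (0.1) is a hundred times stronger than in the other direction (0.001), so the upstream subpopulations where the allele is deleterious mostly send migrants downstream rather than receive them.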

Of course many details of this model are probably unclear to you now; you can check out SLiM's manual for further details.  Here, the point is to illustrate how simple it is to write a model like this – with complex population structure, both neutral and introduced mutations, monitoring of a selective sweep, spatial variation in selection, and custom output! – in just a few lines of code.  Remarkably, you pay almost no speed penalty for all of this added power and flexibility.  In fact, SLiM 2.0 is faster than SLiM 1.8 was, across the board, and it is often faster than the competition, too.  (We will soon publish a paper with speed comparisons to SLiM 1.8 and a couple of other forward genetic simulation packages, but – spoiler alert – that will be the punch line of the comparison.)

I hope this whets your appetite for more!  If so, surf to the SLiM 2.0 home page at http://messerlab.org/slim/ and download SLiM itself (as an installer app, if you're on Mac OS X, or as a source code archive if you're on Linux), manuals for SLiM and Eidos, and some other goodies.  There are links there to mailing lists that you can join to get announcements (slim-announce) or to ask us questions, start discussions, log bug reports, and so forth (slim-discuss).  SLiM is open-source and free to use (we ask only that you cite our soon-to-be-published paper), so enjoy it – and let us know how it goes for you!

6 comments:

  1. Curious about the speed-up compared to previous SLIM. Is it several times faster or just a few % faster?

    Replies
    1. It depends on the model you're running, of course, but generally it is approximately 2-3 times faster in our testing.

  2. This is really awesome! I am going to try this out for a simulation I'm about to tackle :)

    Replies
    1. Great! I would love to hear how that goes. :->

