A decade ago, I began my PhD at
Vanderbilt University in Nashville, Tennessee, where I was interested in
studying the evolutionary process of speciation (or how new biological species
evolve). I was very lucky during my PhD to be surrounded by great people.
First, I shared an office for part of the year with a visiting
collaborator, Patrik Nosil, who studied speciation in a group of stick insects
called Timema. Second, my PhD advisor
encouraged me to invite great thinkers on speciation to be part of my
dissertation committee – enter Jeff Feder from the University of Notre Dame,
who studied speciation in a group of fruit-feeding flies called Rhagoletis and served as my external
committee member. These connections made during the beginning of my PhD last to
this day.
Figure 1. The Pancake Pantry in Nashville, TN, USA.
During a fateful visit to a common grad
student hangout (circa 2007), the Pancake Pantry (Fig. 1), Patrik Nosil, a group of graduate students, and I
started discussing the age-old debate about the number of genes involved in
adaptation (and speciation): few versus many? Are the traits
responsible for adaptation and speciation polygenic, or do they have
a simple genetic basis? One way we thought to test this was to survey as many
molecular markers as we could, distributed across the genome, and ask
the question: how many of these gene regions exhibit significant population differentiation,
but only between populations adapting to different environments? We came
up with ideas of how to test it, and what type of tools we would need, right
over our plates of pancakes! I think we even had a budget by the time we walked
back in our calorie coma from lunch. My
major takeaway from this lunch was that I now considered the genome as an
active player, not a passive mediator, in the speciation process and I would
never think about speciation in the same way again!
What emerged initially from this pursuit were two comparative AFLP genome scans of two different study systems, each
undergoing speciation driven by divergent ecology, that were published in the
journal Evolution (Nosil et al. 2008; Egan et al. 2008). These studies were very informative in
highlighting the proportion of gene regions (AFLPs) in the genome exhibiting strong
differentiation between divergent populations, and possibly addressed the
repeatability of gene regions associated with adaptation to two environments
(in our case, host plants). But we were
also left with many more questions than answers. How were these divergent loci
distributed and arrayed across the genome? And were the loci exhibiting strong differentiation
driven by selection or other evolutionary phenomena?
Fast-forward to 2010 – I finished my PhD and
I was awarded a Faculty Fellowship at the University of Notre Dame, which came
with some seed money for research and the chance to work more closely with my
external committee member, Jeff Feder. Almost immediately upon arriving in
South Bend, IN, Patrik (now in Sheffield, UK), Jeff, and I had a set of
conference calls and email exchanges that started the project that would result
in the Ecology Letters MS I will summarize below. (Jeff and Patrik had just
finished a sabbatical in Berlin the year before where they spent much of their
time ruminating on the genome-level phenomena influencing the speciation
process.) We recruited other evolutionary biologists well trained in Rhagoletis biology (Tom Powell, Glen
Hood, and Greg Ragland), as well as two computer scientists (Scott Emrich and
his PhD student Lauren Assour) with the ability to process the large amount of
data we would gather.
Our interests were to better understand
the role the genome might play in the evolution of new species. We were
inspired by a paper published over 30 years ago by Joe Felsenstein (1981),
where he described the difficulty of building up many-locus differences between
populations if gene flow was ongoing and recombination was breaking up
associations. This conflict between selection and gene flow would form the
basis for our project. How is it that populations can diverge in the face of
ongoing gene flow? What properties of the taxa suspected of
speciation-with-gene-flow facilitated their divergence?
Figure 2. Rhagoletis pomonella exploring the fruit of the hawthorn tree (Crataegus mollis). Photo credit: Hannes Schuler
Rhagoletis
pomonella offered a
great study system to test these ideas, as it is a well-documented case of
speciation-with-gene-flow (Fig. 2). Rhagoletis
pomonella is a member of a sibling species complex containing numerous geographically
overlapping taxa proposed to have radiated in sympatry by adapting to many new
host plants from several different plant families. Rhagoletis flies infest the fruits of their host plants. Host
fruits are typically available for a discrete window of time during the growing
season, and each fly species completes one generation per year. Adult flies meet
exclusively on or near the host fruits to mate; females oviposit into the host fruit;
larvae consume the fruit, then burrow into the soil to pupate, entering a
pupal diapause that lasts until the following year. Thus, phenological matching
of fly to host-plant fruiting is critical to fly fitness.
The most recent example of a host shift
driving speciation is the shift of R.
pomonella from its native host hawthorn to introduced, domesticated apple,
which occurred in the mid-1800s in the eastern United States. Genetic and
field studies have shown that apple and hawthorn flies represent partially
reproductively isolated host races and that gene flow has been continuous
between the fly races since their origin. One key trait that differs between
the races is the timing of diapause termination, which varies between the races
to match the 3–4 week earlier fruiting time of apple versus hawthorn trees (Fig.
3). Rhagoletis emerge from their
fruits as late-instar larvae and overwinter in the soil in a facultative pupal
diapause. The earlier fruiting time of apples therefore results in apple flies
having to withstand warmer temperatures for longer periods prior to winter. As
a result, natural selection favors increased diapause intensity, or greater
recalcitrance to cues that trigger premature diapause termination in apple
flies.
Jeff had the perfect experiment stored in
his freezer from 20 years earlier. His lab had reared the ancestral haw
race of Rhagoletis under the phenological
conditions of both host plants it attacks in nature. At the time, he had looked
at changes in a set of allozymes and microsatellites, but did not have the ability to look across
the genome at tens of thousands of SNPs. Specifically, he exposed ancestral hawthorn
fly pupae to warm temperatures for a short 7-day (‘hawthorn-like’ control) vs.
long 32-day (‘apple-like’ experimental) period prior to winter (Fig. 4).
We also had a
specific hypothesis we wanted to test that integrated Jeff’s selection
experiment with sampling from natural populations. We tested whether the
changes across the genome induced by the lab experiment on divergent host-plant
phenology would predict the genome-wide differences observed at these same loci
between natural sympatric populations. In this experiment, we stressed that we
were quantifying the total genome-wide impact of selection, which involves both
direct effects, where natural selection favors the causal variants underlying
selected traits, and indirect effects, where additional loci respond because
they are correlated due to linkage disequilibrium with these causal variants. Thus,
the ‘total’ impact of divergent selection (i.e. direct + indirect effects) that
we quantify here can involve changes at many loci (Gompert et al. 2014;
Soria-Carrasco et al. 2014).
Quantifying the
impact of selection genome-wide is important because, as populations diverge,
the effects that individual genes have on reproductive isolation (RI) can
become coupled, strengthening barriers to gene flow and promoting speciation (Barton
1983; Bierne et al. 2011). If predicated solely on new mutations, this
transition could take a long time and populations could go extinct or
conditions change without speciation, which may explain why sympatric speciation
is difficult to observe and test. Thus, a prediction for systems with the
potential for speciation-with-gene-flow is that they exhibit large stores of
standing variation and consequently, show extensive, genome-wide responses to
selection when challenged by divergent ecology.
In our selection
experiment, about 6% of the SNPs showed significant frequency shifts between
the short and long prewinter periods. However, because of extensive linkage
disequilibrium (LD) in Rhagoletis,
these SNPs did not provide an estimate of the independent number of gene
regions influenced by selection. Thus, we assessed the pattern of LD between
SNPs to delimit independent sets of loci.
We determined that the 6% of responding SNPs represented 162 different
sets whose members were in LD with each other, but in equilibrium with all
other SNPs. After accounting for the table-wide null expectation of 52
significant sets due to type I error, using a modeling approach we detail in our
Supplemental material, we obtained a lower-bound estimate of 110 gene regions that responded to
selection (162 - 52 = 110). To determine how physically widespread the response was across the
genome, we constructed a recombination linkage map for Rhagoletis that contained 2,352 SNPs. About 13% of mapped SNPs
showed significant frequency shifts in the selection experiment and were
dispersed widely across the five major chromosomes of the R. pomonella genome (Fig. 5). Thus, numerous independent gene
regions responded to selection and they were distributed throughout the genome.
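For readers who like to see the bookkeeping, here is a minimal Python sketch of one way to group SNPs into LD sets and then subtract the null expectation. It is not the modeling approach from our Supplemental material: the pairwise r² measure, the 0.5 threshold, the connected-components grouping, the toy genotypes, and the function names are all assumptions made for illustration. Only the counts 162 and 52 come from the text above.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no association signal
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def ld_sets(genotypes, threshold=0.5):
    """Group SNP indices into connected components linked by r^2 >= threshold.

    The threshold is an arbitrary illustrative choice, not a value from the study.
    """
    parent = list(range(len(genotypes)))

    def find(i):
        # Follow parent pointers to the set representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(genotypes)), 2):
        if r_squared(genotypes[i], genotypes[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(genotypes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def lower_bound_regions(observed_sets, null_sets):
    """Observed significant sets minus the number expected from type I error alone."""
    return observed_sets - null_sets


# Toy genotypes for six flies at four SNPs (made up, not data from the study).
toy_genotypes = [
    [0, 1, 2, 0, 1, 2],  # SNP 0
    [0, 1, 2, 0, 1, 2],  # SNP 1: perfectly associated with SNP 0
    [2, 0, 0, 0, 1, 2],  # SNP 2: uncorrelated with SNPs 0 and 1
    [2, 0, 0, 0, 1, 2],  # SNP 3: perfectly associated with SNP 2
]

print(ld_sets(toy_genotypes))        # [[0, 1], [2, 3]]
print(lower_bound_regions(162, 52))  # 110
```

The pairwise loop is quadratic in the number of SNPs, so a real data set of tens of thousands of SNPs would need something more efficient; the sketch is only meant to show what "sets in LD with each other but in equilibrium with everything else" means in practice.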
Now we tested
our main hypothesis: does the genomic response in the selection experiment
reflect nature? The answer is yes. The
direction and magnitude of allele frequency changes for all 32,455 SNPs in the
selection experiment were highly predictive of genetic differences between the
sympatric hawthorn and apple host races at the Grant, MI, site (r = 0.39, P <
10⁻⁶). Most strikingly, for the 154 SNPs showing significant responses in
both our selection experiment and host divergence in nature, the allele that
increased in frequency in the hawthorn race after selection was the same
allele found at higher frequency in the apple race in nature (P = (½)¹⁵⁴ =
4.4 × 10⁻⁴⁷).
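The second P-value above is simple arithmetic and can be checked in a couple of lines of Python. The sketch assumes a sign-test-style null in which each of the 154 SNPs independently has a 1/2 chance of shifting toward the apple race; the variable names are mine.

```python
from fractions import Fraction

n_snps = 154  # SNPs significant in both the selection experiment and in nature

# Assumed null: each SNP independently shifts toward the apple race with
# probability 1/2, so all 154 agreeing has probability (1/2)^154.
p_all_same_direction = Fraction(1, 2) ** n_snps

print(f"{float(p_all_same_direction):.1e}")  # 4.4e-47
```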
To what extent
did the single bout of selection on hawthorn flies genetically create the
derived apple race? The answer is a good
deal. For all 32,455 SNPs, the mean SNP frequency for hawthorn flies surviving
the long prewinter treatment shifted 38.9% of the difference between the host races
toward apple flies. For the 154 SNPs showing significant responses in
the selection experiment and host divergence, the shift was 84.1%.
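To make the idea of "shifting X% of the way toward apple flies" concrete, here is a small Python sketch. The exact formula used in the paper is not given in this post, so the definition below (total signed shift toward the apple race divided by the total host-race difference) is an assumption, as are the function name and the toy allele frequencies.

```python
def fraction_of_divergence_recovered(p_haw, p_selected, p_apple):
    """Shift toward the apple race as a fraction of the host-race difference.

    Each SNP's shift counts as positive when the selected hawthorn flies moved
    in the direction of the apple race and negative when they moved away.
    """
    toward_apple = 0.0
    total_difference = 0.0
    for haw, selected, apple in zip(p_haw, p_selected, p_apple):
        difference = apple - haw
        direction = 1.0 if difference >= 0 else -1.0
        toward_apple += direction * (selected - haw)
        total_difference += abs(difference)
    if total_difference == 0:
        raise ValueError("host races do not differ at any SNP")
    return toward_apple / total_difference


# Made-up allele frequencies at three SNPs (not data from the study).
haw_control = [0.20, 0.60, 0.50]         # hawthorn flies, short prewinter
haw_long_prewinter = [0.30, 0.45, 0.60]  # hawthorn flies surviving long prewinter
apple_race = [0.40, 0.30, 0.70]          # apple race in nature

shift = fraction_of_divergence_recovered(haw_control, haw_long_prewinter, apple_race)
print(f"{shift:.1%}")  # 50.0%
```

Summing before dividing, rather than averaging per-SNP ratios, keeps SNPs with tiny host-race differences from dominating the result.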
Why is the
impact of divergent ecological adaptation so pronounced and pervasive in Rhagoletis? One contributing factor is the extensive LD
in the fly, some of which is due to inversions, requiring additional DNA
sequence analysis to resolve. A second factor is the presence of substantial
standing genetic variation in R.
pomonella, which supports the hypothesis that such stores may define taxa
having a greater capacity for speciation-with-gene-flow. Finally, when
ecological adaptation involves traits like diapause that can be highly
polygenic, selection may more often have genome-wide consequences. In this
regard, microarray studies of R.
pomonella have revealed hundreds of loci varying in expression during
diapause breakage that are potential targets of selection (Ragland
et al. 2011).
Interestingly,
this work shares some important similarities and differences with other recent
studies combining selection experiments with surveys of genome-wide genetic
variation in natural populations, including the Timema ecotypes that are the mainstay of the Nosil lab. In both a
within-generation (Gompert et al. 2014; similar to the Rhagoletis study here) and a between-generation study of selection
in the field (Soria-Carrasco et al. 2014), a genome-wide response involving
many loci was observed. However, LD was much lower in the Timema ecotypes, and thus the genetic differences
induced in those selection experiments did not match natural genetic variation
as closely as in the Rhagoletis
experiment.
In summary,
divergent ecological selection can have genome-wide effects even at early
stages of speciation. Large stores of standing variation in Rhagoletis flies may
potentiate the evolution of genome-wide reproductive isolation and their
adaptive radiation with gene flow. As the study of speciation genomics expands,
it will be possible to test the degree to which other taxa prone to ecological
sympatric speciation share characteristics with R. pomonella, and to assess the relationship
between standing variation and clade richness.
That was one
productive plate of pancakes!
References:
Barton, N.H. 1983. Multilocus clines. Evolution 37: 454-471.
Bierne, N., J. Welch, E. Loire, F. Bonhomme, & P. David. 2011. The coupling hypothesis:
why genome scans may fail to map local adaptation genes. Molecular Ecology 20: 2044-2072.
Egan, S.P., P. Nosil,
& D.J. Funk. 2008. Selection and genomic differentiation during ecological
speciation: isolating the contributions of host-association via a comparative
genome scan of Neochlamisus bebbianae leaf beetles. Evolution 62: 1162-1181.
Egan, S.P., G.J. Ragland, L. Assour, T.H.Q. Powell, G.R. Hood, S. Emrich, P. Nosil, & J.L.
Feder. 2015. Experimental evidence of genome-wide impact of ecological selection
during early stages of speciation-with-gene-flow. Ecology Letters, online
early. (doi: 10.1111/ele.12460)
Felsenstein, J. 1981. Skepticism towards Santa Rosalia, or why are there
so few kinds of animals? Evolution 35: 124-138.
Gompert, Z., A.A. Comeault, T.E. Farkas, J.L. Feder, T.L. Parchman, C.A. Buerkle, & P. Nosil. 2014. Experimental evidence
for ecological selection on genome variation in the wild. Ecology Letters 17:
369-379.
Nosil, P., S.P. Egan, & D.J. Funk. 2008.
Divergent selection plays multiple roles in generating heterogeneous genomic
differentiation between walking-stick ecotypes. Evolution 62: 316-336.
Ragland, G.J., S.P. Egan, J.L. Feder, S.H. Berlocher, & D.A. Hahn. 2011. Developmental
trajectories of gene expression reveal regulatory candidates for diapause termination, a key
life history transition in the apple maggot fly, Rhagoletis pomonella. Journal
of Experimental Biology 214: 3948-3960.
Soria-Carrasco,
V., Z. Gompert, A.A. Comeault, T.E. Farkas, T.L. Parchman, J.S. Johnson, C.A.
Buerkle, J.L. Feder, J. Bast, T. Schwander, S.P. Egan, B.J. Crespi, & P. Nosil. 2014. Stick insect
genomes reveal natural selection's role in parallel speciation. Science 344: 738-742.