Sunday, July 26, 2020

Advice to young academics like myself: Be resilient.

Guest Post by Romain Villoutreix

Who am I?

I am currently holding a postdoctoral position in Patrik Nosil's lab at CEFE (Montpellier, France). I am interested in the origin and maintenance of intraspecific phenotypic variation over long evolutionary time. I am in the critical stage of my career where I need to secure a permanent position and funding of my own (7 years past my PhD), and I hope that the recent publication of a first-author paper in Science is a good step in that direction.

Context, the paper.

The existence of multiple morphs of the same species co-existing in natural populations is a common feature of many taxa. These morphs often show discrete variation for multiple traits, all associated with a single chromosomal inversion. Because chromosomal inversions impede recombination between sequences with opposite orientation, it is often assumed that inversions shield multiple selected genes from recombination (supergene hypothesis). But inversions can also harbor adaptive mutations at one of their breakpoints, leading to their rise in frequency without any role for recombination suppression (breakpoint hypothesis). Therefore, a role for recombination suppression in inversion evolution needs to be demonstrated rather than assumed.

Color variation of Timema. Photo credits: Romain Villoutreix

In this paper (here), we show evidence for both hypotheses (supergene and breakpoint) in the cryptic body coloration of a genus of stick insects (Timema). Most species of the genus show intraspecific discrete variation for body color, with green and brown morphs. Green morphs are cryptic on the leaves and brown morphs on the bark of the host plants these stick insects live on. We showed that body coloration is associated with a genomic region of reduced recombination (likely an inversion) in both Timema cristinae and Timema bartmani. Given the size of the genomic regions identified (~10 megabase pairs and ~1 megabase pair, respectively), it was nearly impossible to identify the number and identity of genes involved in this polymorphism. Fortunately, a species in the genus (Timema chumash) displayed more continuous variation for body coloration. The study of T. chumash thus allowed us to fine-map color and demonstrate that multiple genes are involved in body color variation in this species. Interestingly, these genes are clustered in a ~1 megabase pair (Mbp, hereafter) region of the genome. This region is deleted in the green allele of Timema cristinae and located at one edge of the 10 Mbp region of suppressed recombination (a putative inversion breakpoint). In Timema bartmani this 1 Mbp region is the sole region of the genome associated with body color variation and, as mentioned above, shows reduced recombination and no sign of deletion in either morph. This suggests that in Timema cristinae, the mutational event that generated a chromosomal inversion also generated the deletion of body coloration genes at one of the inversion breakpoints, and that this deletion contributes to the evolution of this inversion polymorphism.

Since the news of acceptance in Science spread, I have had many discussions with PhD students and postdocs about the journey leading to the eventual acceptance of this paper. Many told me they would not have had the patience for this endeavor. This pushed me to share this journey with a broader audience, as I think it is a common one for papers published in high-profile journals.

The making of the paper.

I started collecting samples for this project, with the help of colleagues, in April-May 2015. We collected and photographed in a standardized manner around 2500 individuals from 10 Timema species. Library preparation and sequencing followed, along with more sampling and sequencing for a new reference genome in April-May 2016. By the end of 2016, I had run GWAS for all species sequenced and was struck by how strong and clear the results were. Body color always maps to the same linkage group (LG8). The signal is different for each species, but the only region always associated with color in this linkage group is the 1 Mbp region which is deleted in green Timema cristinae. At this point we had not yet placed the results in a broader context or understood their significance, but I was already convinced of one thing: I won't often come across results that clear; they deserve patience and special treatment.

October 2016, Patrik's ERC grant runs out. I am out of a job but am still working on this data. Despite my finances running low, I am really having a blast analyzing it. April 2017, I start a second postdoc in Liverpool with Ilik Saccheri on recent adaptation in different species of Lepidoptera in the UK. The project is big and challenging, but I still keep working on the Timema data to some degree on at least half of my days. I remember being really uneasy and ashamed doing so, but am very grateful to Ilik for being so kind to me.

October 2017, we 'crack' the story. We understand the difference between Timema chumash and the other species. The results are placed into context and we have a narrative. I start reading like crazy on supergenes and inversion breakpoints (review to come) and we start writing a manuscript. I move to Montpellier in October 2018 to work with Patrik again. Possibly not a very good way to show independence, but I really enjoy working with him and I think this will make my life (and Patrik's) easier in order to get this story out. The 'easy' fun part is over, now starts the long and painful process of submissions followed by rejections...

Resilience in the face of rejection. 

...October 2018. Submission to Science followed by a rejection without review. November 2018. Same story at Nature and Nature Genetics. Submission to PNAS. The manuscript is sent for reviews... but rejected. We revise the manuscript accordingly to make multiple points clearer. February 2019. Submission to Current Biology. The manuscript is sent for reviews... but rejected again. March 2019. Submissions to PLoS Biology and PLoS Genetics followed by rejections without review. This is a huge blow for me. I really start having doubts. Was my feeling about the awesomeness of these results right? Is it not enough of an advance in the field? But I never previously had results that clear in my short career, so I decide to keep trying, accepting the potential consequences of this decision: publishing fewer papers, and having an article I spent years trying to publish possibly go unnoticed.

The end of the rejection tunnel

We start revising the manuscript again. The previous version was too dense. We remove the whole ecological side of the story to focus solely on the genetics of color. A meeting with our colleague and friend Mathieu Joron is crucial for our revision. Discussing with him, we understand the breakpoint mutation angle is the most interesting aspect of our results. We remove further results to center the story on this find. September 2019. The manuscript is ready and so different that Patrik is confident we should try Science another time. Rejection… but with encouragement to resubmit! February 2020. We resubmit, adding another reference genome and whole genome re-sequencing data to answer the reviewers’ points (not a small feat… long live the ERC!). May 2020. Final acceptance... I can't believe it, we actually made it! With the Covid-19 lock-down, the epic party I promised Patrik upon acceptance will have to wait... until public release, haha!

Things I learned 

Having a clear paper with limited points (one or two maximum) is key! At the end of the day, we (myself included) are all extremely busy reading enormous amounts of emails and papers. It is not surprising that we find heavy papers very challenging to read. We just don't have the time and mental energy available for many of them. While it certainly sucks from an academic and scientific perspective, I think it is part of the Science landscape nowadays...

Resilience is key! When you really believe in a set of results or a project, just go for it and expect rejection. And expect a lot of rejection. It is part of the publishing/grant process of course, but also of life in general. Nothing really good is given easily! When I look at the final print of this paper, I see my resilience in the face of years of struggle and doubts. It is a positive and empowering feeling.

Friday, July 10, 2020

Unravelling little mysteries in the genome of Atlantic salmon

Salmon have always been part of my life. I grew up along the Miramichi River in New Brunswick, Canada, which is a river that is famous for its Atlantic salmon fishing. Even my high school mascot was a salmon named Samoo the Plamu (Mi’kmaq for Atlantic salmon). People travel from far and wide for the opportunity to catch a salmon on the Miramichi, and I have been lucky enough to catch at least one grilse (a salmon that spends only one winter at sea) when fly fishing with my dad. Although my grad studies took me to the west coast to study Pacific salmon, I was glad to have had the opportunity to move back to the east coast and work on a species that has always been close to my heart.

Catching my first salmon on the Miramichi River.

In 2017, I moved to Halifax, Nova Scotia, and started a postdoc with the Bradbury lab. I was excited to get some experience in genomic projects involving Atlantic salmon. However, I was quickly reminded that Atlantic salmon are complicated. They exhibit a wide range of life history strategies (see Fig. 1), and unlike Pacific salmon, they don’t die when they spawn (they are iteroparous), making things even more complex (in my opinion). Nonetheless, the amount of diversity that exists makes them an exciting species for exploring never-ending evolutionary questions.

Fig. 1. The wonderful and complicated life of Atlantic salmon (Fig. 1 from Gibson and Haedrich, 2006)


One interesting part of the Atlantic salmon story is that Atlantic salmon occupy rivers on both sides of the Atlantic Ocean – in Europe and North America. Salmon from these continents diverged >600,000 years ago, and this divergence has occurred primarily in allopatry, although opportunities for gene flow have occurred. In one of my recent papers, we investigated genomic differences between Atlantic salmon from these two continents.


Atlantic salmon returning to a river to spawn. Copyright: Nick Hawkins Photography

As populations begin to diverge from each other, the genome can start to show variable levels of differentiation, resulting in peaks and valleys of differentiation. Regions of high differentiation between populations can occur through different mechanisms, one being divergent selection acting in opposite directions in each population. These regions, often called ‘genomic islands of speciation’, have attracted a lot of attention as they may be important for initiating speciation.
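To make "differentiation" concrete, here is a minimal sketch of Wright's FST for a single biallelic locus in two populations, computed from allele frequencies as (HT - HS) / HT. It assumes equal weighting of the two populations and ignores sample-size corrections; it is a textbook illustration, not the estimator used in the paper, and the function name is made up for this example.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Wright's F_ST at one biallelic locus for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if both populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # The locus is monomorphic overall, so there is nothing to partition.
        return 0.0
    # Mean expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Identical frequencies give no differentiation; opposite fixation gives 1.
print(fst_two_pops(0.5, 0.5))  # 0.0
print(fst_two_pops(1.0, 0.0))  # 1.0
```

Plotting a value like this along a chromosome is what produces the peaks and valleys described above.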

But some question whether these islands of speciation are real…


It is now clear that other mechanisms can produce the same signals of differentiation without divergent selection (Wolf and Ellegren, 2017). This can include purifying background selection, which acts to remove deleterious mutations. In regions of low recombination, this type of background selection can reduce diversity and lead to signals of increased differentiation between populations.


This means that two very different processes can lead to similar signals, so it’s important to consider what mechanisms might be operating in the genome to better understand the speciation process. In our study, we attempted to do just that with Atlantic salmon.


I will admit that the role of purifying background selection is not something that I had given much attention to before this paper. Luckily, another postdoc in the lab at the time, Tony Kess, had spent some time thinking about it already. Tony and I spent a lot of time drinking a lot of coffee and discussing the role of background selection in salmon. Admittedly, as I read more papers and became more caffeinated, I sometimes got more confused. One question that I kept coming back to is ‘how will we know if it’s divergent selection or if it’s just background selection?’. One answer we seemed to settle on was maybe we won’t know for sure, but by using multiple approaches, we can provide evidence that is more consistent with one of these processes.


Tony and me at Moominworld for ESEB 2019 asking Moominpapa about his thoughts on background selection. 

Fortunately, a lot of other scientists have focused their efforts on understanding and identifying the signals associated with background selection, and their work has been a tremendous help for understanding how such processes can shape the genomic landscape. For a nice example on the role of linked selection and recombination in driving regions of high differentiation, I suggest Burri et al. (2015), which investigates this across flycatcher species.


To try to disentangle these different mechanisms in Atlantic salmon, we took note of methods used in other studies. We expected regions under divergent selection (rather than just background selection) to show: 

1) high differentiation

2) high linkage disequilibrium

3) no reduction in recombination rate

4) no increase in gene density

5) signals of positive selection
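The checklist above can be written down as a simple rule. This is only a sketch: the class and field names are hypothetical, and each boolean stands in for a statistical test whose thresholds are not specified here.

```python
from dataclasses import dataclass


@dataclass
class RegionSummary:
    """Hypothetical per-region summary; each flag is the outcome of its own test."""
    high_differentiation: bool
    high_linkage_disequilibrium: bool
    reduced_recombination: bool
    increased_gene_density: bool
    positive_selection_signal: bool


def consistent_with_divergent_selection(region: RegionSummary) -> bool:
    """True only if all five expectations in the checklist are met."""
    return (
        region.high_differentiation
        and region.high_linkage_disequilibrium
        and not region.reduced_recombination
        and not region.increased_gene_density
        and region.positive_selection_signal
    )


# A region with low recombination fails the checklist even if it is highly
# differentiated, because background selection could explain the signal.
cold_spot = RegionSummary(True, True, True, False, True)
print(consistent_with_divergent_selection(cold_spot))  # False
```

Passing the checklist does not prove divergent selection; it only means the evidence is more consistent with it than with background selection alone.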


The question about these differences between European and North American salmon is interesting from an evolutionary perspective, but also important for conservation and management. Atlantic salmon are moved all around the world for aquaculture purposes. The historical use of European salmon for aquaculture in eastern North America has posed problems to some recent conservation efforts in Canada (see CBC article).


In our study (recently published in Molecular Ecology), we utilized genomic data for 26 populations in North America and 54 populations in Europe, which cover a wide range of latitudes within each continent (Fig. 2). The study was ‘spawned’ partly out of curiosity when my postdoc supervisor, Ian Bradbury, asked if any loci were fixed between Europe and North America in our dataset. Upon a quick inspection of the genome, I found over a hundred loci that showed almost fixed differences between continents. But what really got our attention was that a large number of these loci (almost 40%) were localized in one large genomic region. This was news to us, and this led to a more formal investigation of where these large regions of differentiation were located in the genome, and what processes were shaping them. Identifying these regions would also be useful for developing markers that can be used to detect salmon of European origin (or with recent European ancestry) in Canadian waters.
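A quick inspection of this kind can be sketched as follows. The example flags loci whose allele frequency differs between continents by at least some cutoff, then reports what share of those loci sits on each chromosome. The 0.95 cutoff, the record layout, the chromosome labels, and the toy frequencies are all invented for illustration; they are not values from the study.

```python
from collections import Counter


def nearly_fixed_loci(loci, min_diff=0.95):
    """Keep loci whose allele frequency differs between continents by >= min_diff."""
    return [
        locus for locus in loci
        if abs(locus["p_europe"] - locus["p_north_america"]) >= min_diff
    ]


def share_by_chromosome(loci):
    """Fraction of the given loci found on each chromosome."""
    counts = Counter(locus["chrom"] for locus in loci)
    total = sum(counts.values())
    return {chrom: n / total for chrom, n in counts.items()}


# Toy data (made up): frequency of the same allele in each continent.
toy_loci = [
    {"id": "snp1", "chrom": "chrA", "p_europe": 0.99, "p_north_america": 0.02},
    {"id": "snp2", "chrom": "chrA", "p_europe": 0.01, "p_north_america": 0.98},
    {"id": "snp3", "chrom": "chrB", "p_europe": 0.97, "p_north_america": 0.01},
    {"id": "snp4", "chrom": "chrB", "p_europe": 0.60, "p_north_america": 0.40},
]

hits = nearly_fixed_loci(toy_loci)
shares = share_by_chromosome(hits)
print([locus["id"] for locus in hits])
print(shares)
```

A chromosome holding a disproportionate share of the hits is the kind of pattern that would prompt a closer look at that region.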


Fig. 2. Location of Atlantic salmon sampling sites in (A) North America and (B) Europe.

In our study, we first found large genomic regions (>1 to 3 million base pairs) showing consistent signals of high differentiation across multiple methods. These were found on four chromosomes in the salmon genome.

Next, for these four chromosomes, we went back to our check list to see if these regions showed patterns consistent with divergent selection. With these data, we confirmed in these regions (see Fig. 3):

1) high differentiation: yes, we found highly divergent regions

2) high linkage disequilibrium: yes, we showed high linkage disequilibrium 

3) no reduction in recombination rate*: yes, we found no significant reduction in recombination rate relative to the rest of the chromosome

4) no increase in gene density: yes, we found no significant increase in gene density

5) signals of positive selection: yes, we found signals of positive selection

Fig. 3. Example of one region showing high differentiation (high FST) between continents on chromosome Ssa06. This region showed no significant reduction in recombination rate and no significant increase in gene count relative to other regions of the chromosome, lending support to the role of divergent selection in driving these differences

Together, these results suggest that differentiation is not likely due to background selection alone, which is more likely to produce signals of differentiation in regions of low recombination.


*Side note: Originally, I may have calculated recombination rate incorrectly. Before the paper was published, I uploaded my R scripts for the analyses to my GitHub. Arne Jacobs (postdoc at Cornell University – who I’ve met a few times at conferences) kindly reached out to let me know that my calculations were wrong. Turns out, it is not as simple as just centimorgans divided by base pairs. Who would have thought? (probably everyone else!) But, to be fair, there are many different ways to calculate recombination rate. While this was a bit embarrassing, I was happy to have the chance to correct this mistake before the paper was officially published. I was even happier to find out that this did not change the results/interpretation of the paper. So thank you to Arne for reaching out in a kind and respectful manner; we can always use more kindness in academia!
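To illustrate why a single ratio can mislead, the sketch below contrasts a cumulative ratio (a marker's genetic position divided by its physical position) with a local rate taken as the slope between adjacent markers on a linkage map. The post does not say which calculation was used before or after the correction, so both functions and the toy map are assumptions made purely for illustration; real analyses usually also smooth the map before taking slopes.

```python
def cumulative_ratio(cm: float, bp: int) -> float:
    """Genetic position divided by physical position, in cM/Mb."""
    return cm / (bp / 1e6)


def local_rates(markers):
    """Slope between adjacent markers, as (midpoint_bp, cM/Mb) pairs.

    markers: list of (bp, cM) tuples sorted by physical position.
    """
    rates = []
    for (bp0, cm0), (bp1, cm1) in zip(markers, markers[1:]):
        if bp1 <= bp0:
            continue  # skip duplicate or out-of-order positions
        rates.append(((bp0 + bp1) / 2, (cm1 - cm0) / ((bp1 - bp0) / 1e6)))
    return rates


# Toy map (made up): a recombination-rich start followed by a cold stretch.
toy_map = [(1_000_000, 0.0), (2_000_000, 4.0), (3_000_000, 4.5), (4_000_000, 5.0)]

print(cumulative_ratio(5.0, 4_000_000))       # 1.25
print([rate for _, rate in local_rates(toy_map)])  # [4.0, 0.5, 0.5]
```

In this toy map the cumulative ratio at the last marker is 1.25 cM/Mb, which hides the fact that the final two intervals recombine at only 0.5 cM/Mb.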


Overall, our results were consistent with the role of divergent selection acting to drive patterns of differentiation between continents rather than just purifying background selection. One question remains as to what traits/genes may be under selection at the continental level. As I think about salmon from each of these continents, I think about the diverse landscapes that they live in and the different conditions that they encounter. But we know salmon from Europe and North America are not morphologically distinct, and generally populations are expected to be adapted to local river conditions rather than at a large scale. So one question that weighed on me was ‘what could be showing adaptive signals across such a broad scale?’. We found genes and biological processes that could potentially relate to differences in ocean navigation/migration and immunity. One hypothesis could be that while salmon from each continent migrate to shared feeding grounds in the ocean, they have to travel in different directions to get there, so perhaps differences in ocean navigation may have evolved. I think this is a cool idea that would be interesting to study in the future.


Of course, there are caveats to any study, and we address these limitations in our paper. Future studies using genome sequencing and experimental work would help to better understand the adaptive differences between continents. 


Our study found differences between European and North American Atlantic salmon that may be contributing to early stages of speciation. These differences may explain some partial incompatibilities that exist between continents, and highlight the potential risk associated with the trans-Atlantic movement of salmon given the currently limited data, high genome-wide differentiation, and largely unknown consequences. More focus on understanding these differences may help inform management decisions in the future as more plans develop to move salmon across the ocean. Luckily, I have recently started a job as a scientist with Fisheries and Oceans and can continue to concentrate on questions related to Atlantic salmon management and conservation.


Our recent paper:

Lehnert, S.J., Kess, T., Bentzen, P., Clément, M. and Bradbury, I.R. (2020) Divergent and linked selection shape patterns of genomic differentiation between European and North American Atlantic salmon (Salmo salar). Molecular Ecology 29:2160-2175.


Burri, R. et al. (2015) Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers. Genome Research 25:1656-1665.

Gibson, J. and Haedrich, R. (2006) Life history tactics of Atlantic salmon in Newfoundland. Freshwater Forum 26:38-45.

Wolf, J.B. and Ellegren, H. (2017) Making sense of genomic islands of differentiation in light of speciation. Nature Reviews Genetics 18:87-100.

A 25-year quest for the Holy Grail of evolutionary biology

When I started my postdoc in 1998, I think it is safe to say that the Holy Grail (or maybe Rosetta Stone) for many evolutionary biologists w...