Tuesday, December 27, 2022

Cuentos-Contos Week 3

For the final week of Cuentos/Contos, I present Dr. Bryan Juarez, Dr. Stepfanie Aguillon, Dr. Raul Diaz, Kiersten Formoso and a special interview with Dr. Melissa Guzman. 

Southeast LA and San Bernardino born and raised, Bryan describes how his low-income background drove him to design novel mathematical approximations to tackle complex science problems (jumping in frogs), as an alternative to using expensive equipment that may have been financially inaccessible. He also explains how his Latinx background prepared him to spot genuine mentors and allies, relationships that have now blossomed into solid friendships. Finally, Bryan touches on experiencing culture shock as a Latino in academia and how EEB departments can support their fellow Latinx academics. 

Stepfanie, a Texas-born but Arizona-raised evolutionary biologist, tells us how her immigrant roots influenced her work and school ethic. She discusses the importance of building community, especially with allies who understand DEI issues, and the type of mentor she strives to be. Lastly, Stepfanie offers three important recommendations for making academic cultures more inclusive for Latinx students. 

Fervent herpetologist Raul remembers reading books on amphibians and reptiles at the local public library and knowing from childhood that he was going to pursue a career in herpetology. He also explains how his parents paved the way for success and supported him, despite not understanding his research interests. Lastly, Raul argues that a diverse faculty team is the best way to attract Latinx students. 

Kiersten, a New York-born but Jersey-raised vertebrate paleontologist, passionately describes her research exploring land to sea [not sea to land!] evolutionary transformations. She tells us how she shakes off heightened self-awareness (as one of the few Latinx in academic spaces) and addresses barriers that keep Latinx from pursuing science careers. Despite these setbacks, Kiersten has also had many positive experiences and remains inspired to pursue a career as a tenure-track professor. 

Check out this special interview with Dr. Melissa Guzman!

To learn more about the featured scientists, reach out via their emails or websites.

To check out the full versions of all the Cuentos-Contos, follow this link

I hope you all can take away something from reading the cuentos/contos of so many brilliant scholars and people. Although I only shared the cuentos/contos of 13 Latino/a/x scholars, remember to share your own and tell your cuento/conto. You never know who is reading, and who will be the next Ecologist, Developmental Biologist, Evolutionary Biologist, Astrobiologist, Microbiologist, Marine Biologist, Paleontologist, etc., inspired by your cuento/conto.

Friday, December 16, 2022

Cuentos-Contos Week 2

In week 2 of the Cuentos/Contos, I am pleased to share the cuentos of Daisy Flores, Eduardo Tassoni Tsuchida, Alonso Delgado and Maya Yanez. 

Daisy Flores, a San Diego local and marine biologist, tells us how her Latinx identity has influenced the way she approaches education in the US and abroad. Additionally, Daisy emphasizes how a strong support system encourages her to persevere in academia, even during tough times. Lastly, she provides a few suggestions to increase inclusivity in universities and departments. 

Eduardo, a biologist studying cell response to stress, reflects on how his Brazilian background has shaped his grad school experience. He touches on the importance of therapy and keeping up with family during the COVID-19 pandemic. Eduardo believes he is doing his part to increase Latinx representation by mentoring Latinx students and working in various programs and committees to promote inclusivity and community. 

Alonso (originally hailing from the San Fernando Valley) shares his academic evolution: from pursuing an aviation administration degree in community college to obtaining a BSc to currently researching venom changes in off-sea anemones! He also discusses the hidden curriculum, mentorship in academia and why he started the organization "Latinx in Marine Sciences".

Geobiologist and Los Angeles local Maya Yanez recounts navigating academia as a first-generation scholar, including the terrifying moments when she found out her loans were denied and how the problem was resolved! She explains how acknowledging and embracing her identity as a Latina has shaped her academic career. Maya candidly addresses her plans for the future, the reasons why she is not considering a career in academia, and her suggestions for welcoming and retaining Latinx students. 

To learn more about the featured scientists, reach out via their emails or websites: Eduardo Tassoni Tsuchida - etassoni@stanford.edu, Maya Yanez - Mdyanez@usc.edu, Daisy Flores - dmflores@utexas.edu, and Alonso Delgado - delgado73.wixsite.com.

To check out the full versions of all the Cuentos-Contos, follow this link. Don't forget to view last week's cuentos/contos.

And for now, Hasta luego! 

Friday, December 9, 2022

Cuentos-Contos Week 1

In the Fall of 2021, I set out to highlight the stories of Latino/a/x researchers, generally in "Ecology and Evolution". With the support of Dr. Carly Kenkel, Dr. Oliver Rizk and Emily Aguirre, we were able to put on a series titled "Cuentos-Contos" (short stories in Spanish and Portuguese) that was shared with our Marine and Environmental Biology section at the University of Southern California. Now I am bringing these short stories to Ecoevoevoevo to share them with a wider research community, as I believe it is important to highlight researchers from similar backgrounds in academia because this space is often isolating. Throughout the next couple of weeks, I hope you feel inspiration, empathy and joy as every person details how their career, passion and aspirations intersect with their culture and identity. To start off, I highlight several talented scientists: Emily Aguirre, Ivan Moreno, Melody Aleman, and Dr. Suzana Leles.

Angeleno microbial ecologist, Emily, studies algal-bacterial symbiosis in the emerging cnidarian system, Aiptasia pallida, using genomic, culturing and microscopy techniques. In this cuento, she highlights her support system as a "non-traditional" student and discusses inefficient, outdated and harmful academic structures. Emily concludes by suggesting solutions for improving the academy, and transforming it into a space that truly supports talent, ingenuity and diversity. 

Long Beach-raised microbial ecologist, Ivan, studies microbes in extreme environments via genomics. He discusses the importance of a work-life balance as an underrepresented student (soccer and video games!) and how this keeps him grounded. Ultimately, Ivan believes that if he strives to be the best scientist and researcher now, he will be able to provide those same opportunities for others once he's an established academic.

Pennsylvanian marine microbial ecologist, Melody, breaks down the racial, social and class obstacles she faced as a Latina, on her way to grad school. She addresses the importance of utilizing the university's mental health resources, and how this has helped her cope with the global pandemic and anxiety. Lastly, Melody gives a shout out to her former/current mentors and encourages departments to support their Latinx students through more funding opportunities and access to genuine mentorship.

Brazilian oceanographer, Suzana, discusses her academic trajectory across the globe, and how she ended up building mathematical models querying microbial food webs in Los Angeles, California. She addresses the supporting factors (and discouraging aspects) that shaped her path to a Ph.D., and how she continued on an academic track despite enduring hurtful experiences. Finally, Suzana provides helpful tips on building welcoming spaces for non-native English speakers!

To learn more about the featured scientists reach out via their emails or websites. Ivan -  imoreno[at]ucsd.edu, Melody - maleman[at]usc.edu, Suzana -  suzanaleles[@]gmail.com, Emily - Emily Aguirre (weebly.com)

To check out the full versions of all the Cuentos-Contos, follow this link

And for now, Hasta luego! 

Sunday, December 4, 2022

Within, among, or between?

I was recently surprised to learn from my students (I really appreciate that they spoke up about it) that some phrases I had been using were confusing - most specifically among-population variation.* This led to a discussion of the meaning of within, among, and between and how these terms are used in ecology and evolution (a microcosm of how they are used more generally). As the confusion appears to be more common than I thought, perhaps it is worth explaining the situation here.

To make this explanation clear, first imagine that you are analyzing a number of separate populations (e.g., humans in different populations) and that you have measured a particular trait (let's say body size) in a number of individuals in each of those populations. Note: this is not a random example, it is precisely what we did in a paper some years ago - McKellar et al. (2009). Below is a figure from that paper providing a compilation of within-population variation (y-axis) and among-population variation (x-axis) measures (here the "coefficient of variation" - CV) within a large number of animal populations.


If you report a descriptive statistic for each population separately, those measures are within-population summary statistics. Thus, within-population variation is a measure of variation within each of those populations - as might be indexed in a variance or standard deviation or coefficient of variation of body size for each of those populations separately. You can then also calculate the average or variation (across populations) of those within-populations measurements. In this case, you use the various within-population measures you have calculated (e.g., the variance within each population) as data to calculate another set of descriptive statistics, such as the mean (across populations) of the within-population variance. 

I should note that, in some cases, one wishes to assume that these within-population measures are all estimating the same global (that is, shared across populations) within-population variance (or mean or whatever). In such cases, it can be assumed that populations with larger sample sizes (more individuals measured) are providing better estimates of that shared (common across populations) within-population parameter - and so the estimated average (across populations) of the within-population parameter is calculated by weighting the within-population estimates by their sample sizes. This is precisely what is done when one calculates a "pooled standard deviation." Of course, variation among the within-population estimates of the parameter is a measure of how much variation might exist among populations in those within-population parameters.
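As a rough illustration of that weighting, here is a minimal Python sketch. The population names and body-size values are made up for the example, and the function names are my own. One detail worth flagging: the conventional pooled variance weights each population by its degrees of freedom (n - 1) rather than by n itself, which is what the sketch does.

```python
import math
import statistics

# Hypothetical body-size measurements for three populations (made-up numbers).
populations = {
    "pop_A": [10.0, 12.0, 14.0],
    "pop_B": [20.0, 22.0, 24.0, 26.0],
    "pop_C": [30.0, 31.0],
}


def within_population_variances(pops):
    """Sample variance of the trait within each population, computed separately."""
    return {name: statistics.variance(values) for name, values in pops.items()}


def unweighted_mean_variance(pops):
    """Simple mean (across populations) of the within-population variances."""
    return statistics.mean(within_population_variances(pops).values())


def pooled_variance(pops):
    """Within-population variance pooled across populations.

    Each population's variance is weighted by its degrees of freedom (n - 1),
    so populations with more measured individuals contribute more.
    """
    weighted_sum = sum((len(v) - 1) * statistics.variance(v) for v in pops.values())
    total_df = sum(len(v) - 1 for v in pops.values())
    return weighted_sum / total_df


def pooled_sd(pops):
    """Pooled standard deviation: square root of the pooled variance."""
    return math.sqrt(pooled_variance(pops))


print(within_population_variances(populations))
print(unweighted_mean_variance(populations))
print(pooled_variance(populations), pooled_sd(populations))
```

With unequal sample sizes, the pooled value and the simple mean of the variances will generally differ, which is the whole point of the weighting.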

Between and Among

If you next report a descriptive statistic that examines trait variation across the populations, then you are in the world of between-population (if across only two of the populations) or among-population (if across three or more populations) estimates.** Typically, these estimates do NOT include variation within those populations. That is, you don't simply pool all of the individuals across all of your populations and calculate a single mean or variance - because this approach mixes within and among population variation.***

So, instead, the simplest approach is to take the within-population parameter estimates, such as the within-population means and variances of trait values for each of the populations you measured, and use them as data points to calculate a new mean and variance. The first of these (the mean of the means) was mentioned above as it is the mean of the within-population means - and thus the "best" estimate of the trait mean within populations (assuming they are the same - or near enough as to make no difference). The second of these (the variance of the means) is a measure of among-population variation - that is, it is the variance among population means. It is the among-population variance.
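To make the distinction concrete, here is a small Python sketch, again with made-up numbers and function names of my own choosing. It computes the mean of the population means and the variance of the population means, and, for contrast, the variance you would get by lumping every individual together regardless of population.

```python
import statistics

# Hypothetical body-size measurements for three populations (made-up numbers).
populations = {
    "pop_A": [10.0, 12.0, 14.0],
    "pop_B": [20.0, 22.0, 24.0, 26.0],
    "pop_C": [30.0, 31.0],
}


def population_means(pops):
    """Trait mean within each population (one data point per population)."""
    return [statistics.mean(values) for values in pops.values()]


def mean_of_means(pops):
    """Mean (across populations) of the within-population means."""
    return statistics.mean(population_means(pops))


def among_population_variance(pops):
    """Variance among the population means: the among-population variance."""
    return statistics.variance(population_means(pops))


def lumped_variance(pops):
    """Variance of all individuals lumped together, ignoring population identity.

    This mixes within- and among-population variation, so it is NOT an
    estimate of among-population variance.
    """
    everyone = [x for values in pops.values() for x in values]
    return statistics.variance(everyone)


print(mean_of_means(populations))
print(among_population_variance(populations))
print(lumped_variance(populations))
```

The last two numbers differ because the lumped calculation treats individuals, not populations, as the data points.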

Of course, the within- and among-population contributions to variation can be estimated together from an appropriate statistical model (e.g., nested analysis of variance) that appropriately partitions the variance among the different levels. Further, uncertainty associated with lower levels of the hierarchy (e.g., variance within populations) can be propagated in some models (e.g., Bayesian) up to higher levels of the analysis (e.g., variance among populations).

I noted above that estimates of among-population variance should not include within-population variance. However, some analyses are interested in scaling the among-population variation by the within-population variation. The simplest way to do this is to divide the among-population variance by the within-population variance - and versions of this are seen in the estimation of parameters such as FST, QST, and PST.
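A minimal Python sketch of this scaling follows. The function names are hypothetical. The first function is the plain ratio described above; the second uses the familiar QST-style form, among / (among + 2 x within), as one example of a "version" of the idea. Strictly, QST is defined on additive genetic variances, so applying the formula to phenotypic variances (as PST does) carries extra assumptions that this sketch ignores.

```python
def variance_ratio(among_var, within_var):
    """Among-population variance scaled by within-population variance."""
    if within_var <= 0:
        raise ValueError("within-population variance must be positive")
    return among_var / within_var


def qst_style_index(among_var, within_var):
    """Proportion-style index: among / (among + 2 * within).

    This is the standard form of QST when both inputs are additive genetic
    variances; with phenotypic variances it is only a PST-like approximation.
    """
    if among_var < 0 or within_var < 0:
        raise ValueError("variances must be non-negative")
    total = among_var + 2.0 * within_var
    if total == 0:
        raise ValueError("at least one variance must be positive")
    return among_var / total


# Made-up variance estimates, purely for illustration.
print(variance_ratio(6.0, 2.0))
print(qst_style_index(6.0, 2.0))
```

The ratio is unbounded above, whereas the proportion-style index runs from 0 (all variation within populations) to 1 (all variation among populations), which is one reason the latter form is popular.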


* When used as an adjective preceding the noun, you want to use a hyphen (e.g., within-population variation) but, in other situations, you don't want to use a hyphen (e.g., the variation within populations). 

** The terms "between" and "among" are also used more generally in writing when you are discussing analyses that are contrasting only two populations (between) or when you are contrasting more than two populations (among).

*** As an aside, this is one of the issues encountered when performing PCA on data from multiple populations simultaneously. That is, PCA (as opposed to DFA) ignores population identity and thus generates axes that combine within-population and among-population variation, which can generate considerable biases. Note: I am not saying PCA can't be used in such instances - but rather that it should be used with caution.

Thursday, December 1, 2022

SLiM 4: Multispecies eco-evolutionary modeling (a personal history)

Once upon a time, I did my PhD with Andrew Hendry at McGill.  My PhD involved writing individual-based evolutionary models of various sorts, to look at things like local adaptation, adaptive divergence between environments, and speciation.  Each model I wrote for my PhD was bespoke – a custom model, with custom C code to simulate what I wanted to look at for a given project.  (I did write a general-purpose modeling environment within which I implemented each of these bespoke models, which provided graphical visualization of the running models for me; but the models themselves were each coded by hand.)  Each model would have its own parameters, governing things like population sizes and migration rates; each would have its own implementation of some sort of genetic architecture; each would have its own approach to selection and fitness.

But now and then, Andrew would get a gleam in his eye, akin to the gleam in Gandalf's eye when he smoked his pipe and talked of strange lands and great heroes.

Gandalf's gleam in the eye

Of course Andrew would be sipping whiskey, not smoking a pipe!  And when Andrew was lost in these distant thoughts, he would sometimes speak of "One Model to rule them all".  One Model to rule them all, One Model to find them, One Model to bring them all, and in the cluster... simulate them.

The One Ring Model

What he meant, of course, was that it would be great not to need to write each new model from scratch; it would be great to have one "uber-model" which could do everything, and then each particular model that one wanted to explore would just be a particular parameterization of that uber-model.  Concepts like "migration", "selection", "population structure", and "genetic architecture" are – one could argue – general concepts that you would like to be able to code once and reuse, over and over.  Once the uber-model was written, you would never need to write a model again.

In its pure form, this idea is obviously a pipe dream; one could never write an uber-model that is so flexible, so general-purpose, so Platonic, that every other model one could imagine is just a shadow cast by the uber-model upon the cave wall.  It's an attractive vision, but there's no way it could ever be real.

And yet the idea stayed with me.  Perhaps not an uber-model, as such... but perhaps a modeling framework.  Perhaps one could write a modeling framework that would provide lots of tools and utilities, building blocks for model-building.  Writing any particular model could then just be a matter of glueing together the provided building blocks.

After my PhD, I started working with Philipp Messer at Cornell University.  Philipp had written a population-genetics simulator that he named SLiM, and he wanted somebody to improve it and maintain it.  Since 2015, I've been chugging away at improving SLiM, step by step.  It now provides a cornucopia of building blocks, for everything from genetics to spatial modeling; it provides a scripting language called Eidos with which you can glue those building blocks together in whatever way you wish; and it provides a graphical modeling environment in which you can write your Eidos scripts, run them, and see the resulting evolutionary dynamics visually as the model runs.  It's pretty widely used in population genetics (SLiM 3, SLiM 2, SLiM 1), and has enabled a lot of cool research.

SLiM's icon, with a tip of the hat to Piet Mondrian

But SLiM hasn't really been used as much as it could by people interested in evolutionary ecology, in eco-evolutionary dynamics, in predator–prey systems and host–parasite systems and things of that sort.  The reason is that SLiM didn't really have much ecology.  It started out as a population-genetics simulator and it stayed in that world for a long time.  You could simulate a biological system from the level of mutations, to genes, to chromosomes, to individuals, to subpopulations, to a whole species; but you couldn't really model more than one species, and the interactions between those species, and the coevolutionary feedbacks driven by those interactions.  So it remained a tool mostly for population genetics.

I am very pleased to announce that that era is over!  SLiM can now model evolutionary ecology: multiple species, interspecies interactions, coevolutionary dynamics, and eco-evolutionary dynamics.  It now spans the biological hierarchy from individual mutations up to not just a species, but a whole ecosystem or even a community.  I'm really, really excited to see what folks do with this; for me, this is the realization of more than a decade of dreams.

Support for multiple species was added to SLiM 4, which was released on 12 August 2022, so it has actually been available for a little while now.  I put off writing a blog post about it here until the corresponding paper was in the publication pipeline... and now it is, in the American Naturalist.  The title is "SLiM 4: Multispecies eco-evolutionary modeling".  At present you can download the paper in its "just accepted" form; it hasn't been typeset yet.  Here's the DOI: https://doi.org/10.1086/723601.

I'm not going to say anything more about SLiM 4 and multispecies modeling, because, well, that's what the paper is for.  Of course this is not the end of the journey.  I'm sure there are lots more building blocks that will need to be written to make multispecies modeling as flexible and general as we want it to be; and there are lots of other projects too, from improving and generalizing SLiM's genetics to making SLiM run faster by utilizing multiple processors.  But ever since I started working on SLiM, my primary end goal for it has been to turn it into an ecosystem simulator – really, to try to bridge the gap between population genetics and evolutionary ecology by making it possible to simulate both in the same model.

And if this obsession with the dream of the One Model has consumed my life a bit, and turned me into a troglodyte that flinches away from the sun, well... it has all been worth it for my preciousss.

The author, with a fish.

Sunday, November 27, 2022

Grammar tips/rules for scientific writing

In my roles as supervisor, collaborator, reviewer, and editor, I read many scientific papers in draft (pre-publication) form. When reading, my hope is always to concentrate on the science itself - and how well it is communicated. Sometimes, however, I get stuck on particular grammatical errors and find myself repeating again and again and again various grammar "rules." I provide a listing of them here in hopes that they are picked up, used, and propagated just a bit more than at present.

1. Avoid long, complicated compound sentences. These are often very difficult to follow.

2. Use “which” and “that” properly. “That” should be used for restrictive clauses (“This is the fish THAT Jack caught.”) whereas “which” should be used for nonrestrictive clauses (“This fish, WHICH Jack caught, is a salmon.”) Most people use “which” in many cases where “that” is more appropriate.

3. Avoid all use of “there is”, “there was”, “there are”, and “there were”, particularly at the start of sentences. Use of these terms can make the subject of the sentence unclear.

4. Avoid unnecessary amplification of text. For example, say “sneaky mating is successful” rather than “sneaky mating has been found to be successful”.

5. Avoid the use of “while”, except when the intended meaning is “during the time that.” In other contexts, “whereas” or “although” are usually better.

6. Write out all numbers less than 10 (e.g., one, two), unless the number is followed by a unit, such as m, mg, min, h, etc.

7. “Data” are plural. That is, you don't say: "the data is", you say "the data are." Datum would be the singular version.

8. “Between” is used in reference to two things. “Among” is used in reference to more than two things. That is, you study the differences between two populations, but the differences among three populations.

9. Never use “etc.”

10. Never use “unique” unless you truly mean “one of a kind.” People often say: “Our system represents a unique opportunity to test the theory that…” Instead, say: “Our system represents an excellent opportunity to test the theory that…” Similarly, never use “ideal” or “perfect” in this same context.

11. My Mom (a grammar expert of sorts) tells me that only God “creates” things (and she isn’t even religious). So, in short, don't use the term “create” unless you are invoking God.

12. Strive for parallelism between related sentences that appear close to each other. As a simple example, use “Low predation sites are characterized by few fish predators. High predation sites are characterized by many fish predators.”, instead of “Low predation sites are characterized by few fish predators. Many fish predators are found at high predation sites.”

13. Beware of misplaced modifiers. For example, “We measured body depth using calipers.” Body depth does not use calipers, as this sentence implies. Instead, use “We used calipers to measure body depth.” Sometimes it is difficult to avoid misplaced modifiers without otherwise destroying the sentence. In such cases, it is forgivable.

14. Use the active voice (“We measured body depth.”), rather than the passive voice (“Body depth was measured.”), whenever reasonable and when not explicitly disallowed by a journal. Be careful to not use it too much though. Six sentences in a row, all starting with “we”, are very awkward.

15. Although many would disagree with me, I believe in the power of punctuation. As one small example, I believe the second last phrase in a list of phrases should have a comma before the “and.” For example, “Speciation can occur by genetic drift, mutation, and natural selection.” rather than “Speciation can occur by genetic drift, mutation and natural selection.” Using the latter often introduces confusion when the phrases themselves are longer and contain “and” within them. The cartoon gives another example:

16. Always use a single space between sentences. All journals do this anyway, and it makes editing difficult if one person (me) uses single spaces and other people (you) use double spaces.

17. Try not to use “may” unless you are implying permission. Instead consider “might” or “can”.

Wednesday, September 7, 2022

NSF Postdoc Fellowships

The following is a guest post by Dr. Alli Cramer, at the University of Washington. @AlliNCramer

How do NSF postdoc proposals work, anyways? 

Since the Ocean Sciences Postdoctoral Research Fellowship (OCE-PRF) has just been announced, it seems like a good time for a quick discussion of how to apply, or how to begin thinking of applying for NSF postdoc fellowships! Many of these are due in early November so as of September prospective postdocs have about 10 weeks to refine their projects. This is a modification of a twitter thread I wrote a year or so ago, but it does have some extra information if you’ve already seen it. 

My experience applying for PRFs comes from applying for the “Postdoctoral Research Fellowship in Biology”, PRFB, in 2019 and 2020, and applying for the OCE-PRF in 2021. With that as my background, some of this advice will be program specific, but much of it covers the ‘unwritten rules’ of NSF, so hopefully it can be helpful for other fellowships as well. Ultimately, I was funded on my 3rd attempt at an NSF postdoc and getting to that point was quite a learning curve. In particular, I didn’t know what to expect regarding timeline or paperwork.

Proposal Preparation

First, and definitely the most important - connect with Program Officers (or Program Directors - seems to vary by division). Do this as early as you can, and feel free to check in with them multiple times: their names are listed on the NSF website for your specific proposal. As a graduate student it can be intimidating to reach out to Program Officers, but you should 100% email them and discuss your proposal idea. It is the job of Program Officers to help you make sure your proposal fits the brief of the solicitation before you submit it. They can also answer questions you have about formatting or paperwork. For proposals due in November, contact them now to start refining project ideas.

While drafting the proposal, attend the Q & A session(s). These are important to clarify solicitation language and answer questions you didn't know you had - you don't want a proposal rejected because of a formatting error! Make sure to go to the session or get detailed notes from someone who did. The Q & A session dates are listed on the NSF page for your proposal. Sometimes these are also listed on the solicitation (the hella long HTML page with all the specific language) but not always. They’re normally listed as important dates on the website that links you to the solicitation itself. As of this blog, some programs now offer office hours - these are great places to get questions answered and connect with the Program Officers (double whammy!).  

Like the Program Officers, the IT at NSF is an excellent resource for you. Proposals are submitted through an online portal (currently Fastlane, though that is changing). If you have questions or if something isn’t working, reach out to IT. I had computer issues uploading a proposal and they responded fast and fixed the problem.

Because of the jankiness of the upload portal, upload drafts of your proposal early. Like, two or three days early. Every time that I submitted proposals I tweaked files up until the deadline, but I made sure I had a good enough copy of each file uploaded a few days before. This was useful because it let me see what wasn't working (and led me to contact IT). 

When you’re writing your proposal, in addition to the description of your project and the budget etc., you will need to have letters of support from your potential mentors. Make drafts of support letters for mentors that they can work from. Mentors can use your draft as a springboard and rewrite it, but your draft will help them understand the role they have in your project more clearly. Writing it out for them not only saves time, but forces you to be explicit about your mentorship goals and needs. 

After submission 

After you submit your proposal the earliest you can expect to hear back is ~ 3 months. Whether your proposal is recommended or declined, you should hear around the same time. If you haven’t heard anything by then it doesn’t mean you weren’t funded, but it also doesn’t mean you were. There is always a batch of proposals that NSF would like to fund, but that are low on the priority list. If you can resist, avoid constantly refreshing the status page 😛 If you haven’t heard back & have deadlines looming (accepting job offers, etc.), reach out to the Program Officer. They are super helpful & responsive - they helped advise me when my proposal was in limbo, even when everything was a mess due to COVID shutdowns. They can’t tell you if your proposal will be funded, but they can give you insight into timelines, etc. 

If your proposal is selected, you will hear back over email - make sure to check those spam folders. You will need to send back paperwork to accept the award. In my experience, this has a tight deadline (less than 5 business days) so you will need to work fast.

The paperwork involves coordinating with your host institution and NSF. You might need to get a version of your proposal through the institution’s research grant office. Get in contact with your host institution’s department’s coordinator/grant manager/director because they are the experts. 

For my proposal I also needed to draft a letter “concurring with the transfer of the award to the host institution.” I couldn’t find any examples of those online, but I drafted one up using the standard business style letterhead. My letter went like this: 

Dr. Allison Cramer

[Home address]

[Phone number]

[Program Director]

[Fellowship title]

National Science Foundation

2415 Eisenhower Avenue, 

Alexandria VA 22314


To the [fellowship name] Program, 

This letter concurs with the transfer of Proposal ID ##### [proposal title] to the primary host organization, [institution name]. 


[signature block]

After the letter and all the other paperwork is sent back there is another batch of waiting. During this time your proposal status page might not change, and the only “proof” you have that you got funded is re-checking your email compulsively. After a few weeks the proposal status shifts to Recommended & a few days later you will receive emails that your proposal is being funded and the status changes to Awarded. Some of these emails are auto generated so have weird subject lines (so check spam folders).

Proposal Feedback 

Whether it was funded or not, after you hear back you will get feedback from proposal reviewers. This feedback includes a summary and individual reviewer thoughts about your proposal. The summary of proposal reviews is most important - it synthesizes individual feedback to highlight what matters. For example, one reviewer for my funded proposal found aspects of my proposal unclear in their written feedback; in the summary, however, this wasn’t mentioned at all. The other two reviewers understood that part of my proposal, so it was hashed out among the reviewers in the in-person discussions they had. In contrast, on one of my unfunded proposals two reviewers highlighted a gap, and that gap was again emphasized in the summary feedback. This let me know to focus on it for my next attempt (the successful one!).

Here I am going to plug Program Officers once again. You can contact them about these reviews and they can help you make sense of the feedback. They are ‘in the room’ when the discussions happen, so can help identify what to prioritize for revisions should you resubmit. In general, postdoc proposal or not, contacting program officers is good practice for any researcher looking for NSF funding. It is essential for connecting with NSF programs, and for parsing solicitations. NSF wants to fund good science. The Program Officers help researchers frame their questions and put their best proposals forward. 

All of the above info is my experience with NSF. If you have questions about being in this strange postdoc stage, feel free to connect with me on Twitter @AlliNCramer. You can DM me and I can point you in the right direction. Good luck to all of you writing those postdoc proposals!

Tuesday, August 23, 2022

Collaboration: a how-to guide


Dan Bolnick (University of Connecticut)

Stacy Krueger-Hadfield (University of Alabama at Birmingham)

Alli Cramer (University of Washington Friday Harbor Laboratories)

James Pringle (University of New Hampshire)

While on the Isles of Shoals at the Training and Integration Workshop for the Evolution in Changing Seas Research Coordination Network, we were asked to serve on a panel about the “Challenges of integration - importance of language and frameworks in interdisciplinary collaborations”. The audience (mostly students and postdocs) asked interesting questions that revealed some general concerns about how to collaborate. A summary of our answers might benefit a broader audience, hence this post. Here, we begin by describing some of our interdisciplinary collaborations as examples. Then we provide a general how-to guide, covering how to find collaborators, how to set up agreements to manage expectations, and how to avoid common pitfalls.

What do we mean by ‘interdisciplinary collaboration’?

It is possible to have endless debates on the true nature of interdisciplinarity. Perhaps it is best described as collaboration with people who can provide skills or perspectives that lie outside your core competency. It is collaboration with folks not just for their individual insights, but for their broad background. By way of introduction, the panelists discussed their own collaborative projects.

Dan: My PhD and postdoctoral training were very much focused on core topics in evolutionary ecology, touching on subjects like speciation, maintenance of genetic variation, selection arising from species interactions. I liked the idea of interdisciplinary collaboration, and I was able to observe some from a distance (my PhD mentor, Peter Wainwright, hired a postdoctoral researcher with a fluid dynamics engineering background to work on fish feeding). But I didn’t know where to begin: who to reach out to, and how to bridge fields. I had many collaborations, but most were with people in similar departments to my own. In the past decade I’ve tried to hire a more intellectually diverse set of postdocs, bringing together evolutionary biologists with geneticists, immunologists, and cell biologists, to try to generate some synergy. But recently I’ve established a few collaborations that have really stretched my boundaries. A couple years ago I received a Moore Foundation grant to collaborate with an engineer and an immunologist (the latter a former postdoc from my lab) on studying host-microbe interactions using microfluidic chip artificial guts. The visit to Dr. Rebecca Carrier’s lab in engineering was a genuine thrill, to see how engineers approached a problem. Then last year I received a grant with a computer scientist (Dr. Tina Eliassi-Rad) and statistician (Dr. Miaoyan Wang) to study the evolution of transcriptomic networks. These collaborations are fantastic because I really get to branch out and learn a little bit about entirely new fields, expanding my own horizons. And hopefully the projects will yield exciting insights that wouldn’t have been possible had I tried to tackle this on my own. 

Stacy: I was trained as an evolutionary ecologist. To a certain extent, working at the interface of ecology and evolution necessitates some level of interdisciplinarity and collaboration. I’ve been smitten with algae since a phycology course as a senior at Cal State Northridge. My two PhD supervisors - Myriam Valero and Juan Correa - taught me a lot about collaboration and a more holistic approach to a rather fundamental question in biology - how and why did sex evolve. While I was part of several projects that were interdisciplinary in nature, I really began to explore ‘interdisciplinary’ collaboration and working with colleagues with distinctly different skill sets as an Assistant Professor at the University of Alabama at Birmingham. For example, I have had the opportunity to be part of an ANR-funded group called CLONIX-2D. While the threads of partial clonality connect everyone in CLONIX, we’re combining researchers with very different skill sets and focal taxa. This approach has the opportunity to yield breakthroughs, but presents challenges with language. What does one group mean by such and such? How do we define the jargon that is inherent to different taxonomic groups? Are we separated by a common language? Funnily enough, the CLONIX group is mostly French, with the exception of myself and Maria Orive as the North American delegation. So, we are, to a certain extent, separated by language before we even delve into biological vernacular. As an extension of CLONIX, the coordinator, Solenn Stoeckel, and I started talking about better methods and descriptive statistics for haploid-diploid taxa. Our musings occurred at a small conference - in another lifetime before COVID - and in thinking back as I help write parts of this post, such impromptu, serendipitous chats can lead to brilliant collaborations. Nevertheless, Solenn’s theoretical population genetic doodles were and are nothing if not daunting. 
I had to first conquer that feeling of intimidation and figure out a way to communicate where existing tools fell short for my day-to-day data analyses. Not only did we have to cross a barrier between French and English, but also one of scientific terms and concepts (red algal life cycles are not for the faint of heart). Bridging our linguistic gaps isn’t something that we figured out overnight and is still a work in progress. I think a good collaboration - whether in your specific field or in a totally different discipline - will continue to grow with time. Solenn and I have produced two papers (here and here), with a few more in the pipeline - all started from idle musings over coffee. Clear, precise language is worth its weight in gold - from the outset of a collaboration to the point where you begin to see your work come to fruition (i.e., a paper).

Alli: My Masters and PhD training were interdisciplinary as a matter of course. As a PhD student in an inland environmental science department, incorporating management and societal concerns was the norm and, moreover, I was the only marine ecologist in my department - in nearly the entire university. This interdisciplinary environment worked well with my research focus: ecoinformatics and integrative ecology. My specialty involves synthesizing disparate data sources to ask new questions, or test old questions in new ways. While I was a PhD student, my peers were limnologists, hydrologists, data scientists and engineers. Discussions with them about data sets and the ubiquity (or not) of lakes shrinking led to an interdisciplinary project synthesizing data from over 1.4 million lakes into a publicly available data set - the Global Lake, Climate, and Population data set. Other interdisciplinary projects I pursued were founded through the EcoDAS Symposium, a workshop aimed at connecting early career researchers in the aquatic sciences. Through this workshop I worked on an integrative ecology paper testing Grunbaum’s (2012) scaling predictions and a social-ecological frameworks paper discussing marine no-take zone management. My current research for my postdoc and for the RCN is interdisciplinary in both cases - one project sits at the intersection of community ecology, geology, and hydrology and the other connects genetic and spatial aspects of population connectivity. In my experience, interdisciplinary collaborations are where new and exciting questions lie. To find them, seek out existing frameworks that connect researchers across disciplinary lines, such as EcoDAS or an RCN.

Jamie: In my case, I am trained as a physical oceanographer and have approached biologists to satisfy my curiosity about how species can persist at a place in the face of currents that sweep their offspring away. In turn, I have attracted the attention and collaboration of biologists and chemists who are curious how ocean circulation affects the things they study. Once you are known for broad interests and attain a reputation for putting effort into collaborations, it is easy to attract more collaborators. Before then, you will have to initiate collaborations and then follow through on what you have started. My best collaborations have been with other scientists who are genuinely interested in how the ocean affects their system, as opposed to those who want to figure out how to set up controls or systems which effectively eliminate the impacts of the ocean’s flow on the system they want to study. There is nothing wrong with the latter strategy – but it is less intellectually interesting for me. 

Benefits of collaboration

Some of the themes that emerge above (and from our group discussion) are that collaboration outside of your own area of expertise brings quite a few benefits. From a strictly intellectual standpoint, merging distinct perspectives and skills provides synergistic insights that can lead to new ideas, or to conclusions you might otherwise miss. Or, collaboration may provide a new combination of technical know-how to acquire or analyze data in new ways. Financially, collaboration can enable access to a more diverse set of funding opportunities by making it possible to apply for research grants from multiple agencies (e.g., NSF and NOAA) or different divisions within NSF. Within NSF for instance there are programs like the Rules of Life that specifically require interdisciplinary collaborations (defined as having co-PIs who are normally funded by entirely different directorates within NSF). Lastly, setting aside the utilitarian issues, collaboration can be great fun. It is a chance to make interesting new friends and learn from each other. If the collaborators are from far-flung places, you then may get to convene group meetings at interesting places. Collaborations that begin with curiosity and continue to produce joy are much more likely to lead to interesting results than those that are put together for purely utilitarian reasons.

Some risks

Collaboration is not always good, whether interdisciplinary or not. Luckily all four of us have been blessed with many excellent collaborators; this is more a response to an audience question than a commentary on our own direct collaboration experiences. Collaborators may turn out to be uncooperative, unresponsive, or unpleasant. Many of these risks depend on your career stage. As a student, big collaborations may not be ideal as this is the time to build your foundational area. Rushing to collaborate may stretch you too thin, when you need to be establishing your name as the go-to person on a particular subject. Moreover, large collaborative efforts often take time to produce products. These might not align with a student’s degree progression and milestones. Let’s say you start a collaboration that relies on another lab to generate some key data that you need to interpret your own results, or to build some piece of technology. If they don’t deliver quickly, you may be left waiting a long time for their product before you can proceed. That’s time that faculty can often spare, especially once tenured, but the wait can be awful for a time-constrained fifth-year graduate student. Some of these concerns likely also apply to postdocs fresh out of their PhDs, who need papers and may not be able to wait years for products. Similarly, for assistant professors, too much collaboration may be seen as not ‘independent enough’, posing problems for promotion and tenure review. As the P&T season begins anew, there are some interesting social media posts on tenure packets (e.g., Holly Bik’s thread). One piece of advice I (Stacy) received when putting together my dossier last year was to have a table where you and your lab members’ contributions to each paper were clearly outlined and described. A table like this makes such information abundantly clear, especially if you are not a first, last, and/or corresponding author. An added bonus when the table is completed is to see just how much you and your lab have accomplished in the last few years!

How to get started

The hardest part of starting a collaboration is finding committed collaborators. You can find potential collaborators at conferences: making a connection is certainly easier when we’re in the same venue and can chat at a poster session or over a meal. If you are proactively interested in starting a collaboration you might even consider going to a meeting outside your core discipline (e.g., Jamie, an oceanographer, coming to a meeting of biologists and geneticists). Other options include word of mouth, Google searches for key words, reading the literature in the area in which you wish to expand, or asking a colleague in that field. You might even use social media to put out a call for collaborators on a topic. Serendipity can play a role too: chance meetings outside of normal academic settings, or incidental connections between third parties. As a specific example, Dan’s collaboration with a computer scientist (Tina Eliassi-Rad) and statistician (Miaoyan Wang) began with a couple of these. A former student (Sam Scarpino) was new faculty at Northeastern University in the same Network Science institute as Tina, and in a Zoom call to brainstorm an NSF Rules of Life proposal Sam suggested Tina would be a great addition to the team (and she is!). Then, looking for a statistician, Dan used Google Scholar and Google searches to figure out who was doing cutting-edge, well-cited work on statistical analyses of networks, and found Miaoyan’s name. Serendipitously, a former postdoc had just met Miaoyan (both being recent hires at Wisconsin) and recommended her. A couple of emails later we had some Zoom meetings arranged, and six months later had the grant funded!

One key consideration is making sure your new collaborator(s) has the time and resources to hold up their part of the bargain. You are asking someone to step outside of their day-to-day role, and that takes effort and commitment, and may not be strategically in the best interest of their career. So don’t be offended if they don’t have time for you, but neither should you collaborate with someone who won’t be responsive, or if you will not have time to be responsive. (NOTE: we aren’t implying any of our collaborators are an issue in this regard; this is in response to a question from the panel audience.) A simple test is to be sure that this new collaborator puts some skin in the game. A good way to start is with a proposal of some sort: if they do not contribute to their own part of the proposal and also communicate to improve the other parts, they are unlikely to be good collaborators in the long run. If writing a grant proposal together, require real text contributions and editorial effort. If that doesn’t materialize, heed the warning that red flag is sending and maybe find someone else. If a new and unknown collaborator isn’t putting time in now, at the grant writing stage, they may not be truly committed. This is a hard pill to swallow, but it is a far tougher lesson to learn down the road, when there are more entanglements and it becomes harder to get out unscathed. Do not hesitate to say no to requests for collaboration if you do not have the bandwidth to contribute fully or if the other person is likely not to contribute fully. And if someone isn’t contributing, move on to find a new collaborator. Better to “dump” someone on the first date, so to speak, than to commit and change your mind once you are funded. A caveat here is that grant writing is a great litmus test for faculty, but not always an option for graduate students.

You're in the same r(z)oom ... now what? 

The contract: One strategy to ensure collaborations will be successful is to start, from the outset, with collaboration agreements outlining participants’ roles and expectations. Initial meetings of course will identify what each person is expected to do, but often these are done informally, leading to verbal agreements that can later be forgotten. Even if it seems overly formal, we cannot stress enough the value of written agreements. They keep everyone honest. In one multi-PI project Dan is involved in, which spans over a dozen PIs’ lab groups, each PI on the team contributes a Scope of Work document (a template can be found at the end of this post) that specifically outlines each person’s obligations - what data or tools or written products will be delivered, to whom, and when, with what funding. That way each person’s role is on paper, and if someone else tries to join in you can check their proposed scope of work against the many existing scope documents to make sure there’s no redundancy or conflict. It’s a great way to avoid the “I thought you were doing that” conversation later in the game. Note, each collaborator provides their own unique scope of work document detailing their particular contribution. This can be supplemented with a Collaborative Agreement, again a formal written text, that lays out more universal (rather than lab-specific) expectations. This is especially important for interdisciplinary collaborations where you are bringing together people from different fields who therefore have different cultural traditions about things like authorship, ‘good journals’, etc. Things to put in a collaborative agreement may include:

  • Rules for who does/does not get authorship and how authorship order is determined.

  • A list of expected papers to be written, with lead authors designated in advance, if possible. Collaborations are fluid and may change over time, so regular updating of expected papers is a good idea.

  • Target journals - your interdisciplinary collaborators may have never heard of your favorite journal, and may not gain career benefits from publishing there (e.g., will a physicist’s tenure committee say “The American Naturalist, what is that, some kind of naturist newsletter?”)

  • Expectations for data storage as it is generated - is it shared with everyone on a server as the data are produced, or only when complete?

  • Data archiving

  • What are the procedures for conflict resolution (whether intellectual or interpersonal)? Who adjudicates disagreements about authorship, interpretation of results, or even harassment?

  • What if someone does not deliver on their task in a timely manner - at what point (and how) are they to be replaced by a new collaborator who will deliver?

  • Who can use what data, once it is generated?

  • Who is writing what grants, and when?

  • Code of conduct, field safety, lab safety

  • When a manuscript is almost ready, does everyone need to sign off to approve submission (yes, good idea), and what happens if someone takes months to do so, or refuses to?

Language barriers: In any collaboration, it is important to clearly and precisely explain terminology. Language is the means of communicating information, and a lack of precision can lead to confusion. When you try to cross intellectual boundaries, whether they be across biomes or disciplines, language is critical. In a paper about the origin of the alternation of generations, David Haig elegantly described this challenge: “vocabulary … can be deceptively familiar: familiar because we use many of the same terms; deceptive because these terms are used with different connotations, arising from different conceptual and theoretical assumptions”. Striving for useful and precise definitions is essential not only for research, but also for successful collaborative projects. It is a feature, not a bug, of collaborative science. Often, the most novel and innovative aspects of a collaboration lie in the space where our assumptions differ.

As an example, the ‘Selection across the life cycle’ group from the 2019 Evolution in Changing Seas workshop encountered barriers to easy communication from the get-go. What did we even mean by a ‘complex life cycle’? A life cycle? A life history? Were life cycles and life histories merely synonyms? Were we tying ourselves up in knots over a semantic argument? We couldn’t make sufficient headway in our broader questions about selection across the life cycle if we didn’t all agree on what a life cycle was, let alone a complex one. We discussed what ultimately became Box 1 in Albecker et al. (2021) while on Shoals and in subsequent Zoom meetings. This certainly helped Stacy argue the point that life cycles and life history traits are not the same thing more broadly beyond the scope of our working group! And, these discussions were invigorating if not sometimes frustrating when you are limited by words that inadequately describe what you think makes sense in your own head.

Semantic confusion abounds. Even among evolutionary ecologists, which is how most attendees at the recent workshop on Shoals identify, one word may mean many things. Semantic stress becomes even more acute when you cross into entirely new disciplines. Biologists speak an entirely different language from computer scientists - so how do you learn to communicate? For instance, Dan’s Rules of Life award has him collaborating with Tina Eliassi-Rad at Northeastern. Tina hasn’t had a biology class since high school, and Dan’s never had a computer science class nor a class in network science. Starting the collaboration required Dan to learn a lot of basic network science terms, and in many cases biologists and computer scientists turned out to use different words for the same thing. Think about a gene expression database with individual animals as rows and different genes as columns. Dan might use terms like “individuals” or “replicates” for the rows, and “genes” or “dependent variables” for the columns, plus “independent variables” in the metadata, but a computer scientist might use “instances” and “features”. It has taken some practice getting comfortable with each other’s terminology. Meanwhile Tina asked for some basic introduction to genetics and RNA.
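To make the rows-and-columns example concrete, here is a minimal sketch of a toy expression table in Python. Everything in it is an assumption for illustration: the fish IDs, gene names, and values are made up, and `to_feature_matrix` is a hypothetical helper, not code from Dan and Tina’s project. The point is only that the same object can be read with either vocabulary.

```python
# Toy gene expression table. All IDs and values are made up for illustration.
# Keys (rows): "individuals" or "replicates" to a biologist,
#              "instances" to a computer scientist.
# Inner keys (columns): "genes" or "dependent variables" to a biologist,
#                       "features" to a computer scientist.
expression = {
    "fish_01": {"gene_A": 12.0, "gene_B": 0.5},
    "fish_02": {"gene_A": 9.5, "gene_B": 1.5},
    "fish_03": {"gene_A": 14.0, "gene_B": 0.0},
}


def to_feature_matrix(table):
    """Convert a nested dict into (instance_ids, feature_names, matrix).

    Hypothetical helper: rows are ordered by instance ID and columns by
    feature name, so the layout is reproducible.
    """
    instance_ids = sorted(table)
    feature_names = sorted({gene for row in table.values() for gene in row})
    matrix = [[table[i][f] for f in feature_names] for i in instance_ids]
    return instance_ids, feature_names, matrix


instance_ids, feature_names, matrix = to_feature_matrix(expression)
print("instances (individuals):", instance_ids)
print("features (genes):", feature_names)
print("matrix:", matrix)
```

Writing the two vocabularies side by side in a shared comment or glossary like this is one cheap way to keep a mixed team reading the same table the same way.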

Not only is common terminology essential for collaborative research, but so is a common framework. As another example from the Evolution in Changing Seas RCN, the ‘Connectivity’ working group Alli is a part of had to tackle different frameworks about how populations are connected. In this group, much of the language was the same. But even once terminology was reconciled, fundamental differences remained between how evolutionary, genetics-focused researchers saw populations and how spatial, ecology-focused researchers saw them. These differences required us to develop a new temporal framework to communicate effectively. By having a guiding structure for how our points of view connected with one another we were able to make progress on our larger goal of using data to test connectivity assumptions. In this case the confusion arose in the closely related fields of ecology and evolution. In interdisciplinary collaborations, where intellectual frameworks differ almost by definition, taking the time to develop a clear theoretical outline for hypotheses helps develop research questions and facilitates communication among group members.

Sometimes conflicts arise in collaborations. Often, not surprisingly, conflicts center around terminology while shared frameworks are being developed. It helps for all team members to expect some tension and to have agreed on how to deal with it. When developing collaboration agreements it helps to set expectations for a minimum level of participation. Making this clear from the get-go puts group members on a level playing field. When conflicts arise between group members, it helps to focus on the group as a whole, rather than the individual members. Some phrases to use might be things like “how does our group feel about [blank definition]?” or “Is this something the project can resolve, or is this a larger issue within your disciplines?” These kinds of questions can refocus group participants. These questions are also useful to ask yourself, if you ever find yourself planting your flag on a semantic hill. (Of course, once you have developed your common vocabulary, it is important to explain it in your publications – remember your confusions and conflicts, for they are what you must explain later and will often be some of your most valuable contributions.)

Off to the races... or off the rails 

Interdisciplinary collaboration comes with a series of logistical challenges and considerations. For example, money is ALWAYS an issue. We need money for travel, meetings, and salaries. While COVID might have necessitated comfort with virtual meetings, there isn’t really a replacement for being in the same place at the same time.


Like any relationship, collaborations take time and work to be successful. The biggest factor is that everyone involved meets regularly to stay engaged. The more time you commit, the more connected you are intellectually and the more you feel you need to succeed in generating a product. So, frequent meetings are crucial, weekly or monthly. These will usually occur via Zoom if you are in far-flung places. In the case of Dan’s Rules of Life collaboration, this hinges on a weekly reading club about network analysis of biological data, reading papers from biology, CS, and statistics. This Zoom meeting began with just the core team but grew to include more people as we found it helpful to bring more conversationalists to the table. We reserve one week a month for project-specific discussion.

The weekly meetings should be supplemented by periodic (once or twice a year) in-person meetings. These help develop personal connections and friendships that cement the motivation to help each other. They keep you focused for a few solid days just on the group’s goals and progress, which is often more efficient than the hour-per-week that is easily missed or forgotten. And there are huge benefits from the creative and more free-wheeling conversations that may happen after the official work day ends. These meetings are especially valuable to students and postdocs; they provide more of a network to rely on in the future and people to seek help from in their current work.

Pay people for work they do. Academics are notorious for unpaid labor because it is culturally expected and because it yields later benefits in terms of promotion. We often see scientific publications as a reward in and of themselves, because this is the currency that builds our reputation and career to advance to the next step of the academic ladder. But being able to devote that time is a position of privilege, and whenever feasible it is best to pay people for work done. This can be challenging, though, because the collaboration may have no funding, or barely enough to collect the data. Moreover, expectations for pay vary by field: computer scientists with extensive options in the private sector tend to be better paid than many other scientists (though ecology and evolution students are waking up to the fact that their statistical programming skills are marketable outside academia).

Fields may also differ in expectations for data sharing and archiving. Some fields require data to be archived in a public repository in order to publish; others view these archives as suspect, a way for ‘data parasites’ to pirate information and scoop authors of hard-won results (note, this is a very, very rare thing; far more common is the lab or agency with too much data who would be happy to help someone do more with their collected data, especially if given appropriate credit).

Exiting a bad collaboration

Sometimes collaborations fizzle out. Maybe a grant is declined or a research direction fails to yield results that keep members of the collaboration invigorated and enthusiastic. Other times, collaborations may sour. There is no easy way of extricating oneself from a bad collaboration. Walking away may not be a choice for an early career scientist, but do you stay or do you go? While hindsight is 20/20, having an agreement about data sharing, the scope of work, and the collaboration philosophy may prevent problems arising or provide a mechanism to deal with problems. 

So, what if a collaboration really does go deeply wrong and you realize this isn’t working, then what? How do you extricate yourself? If this is simply a matter of your losing interest, then we’d say you should bite the bullet and deliver on the things you promised to your collaborators or find someone to replace yourself who will be faster and more eager. But sometimes you start a collaboration and then discover that someone isn’t contributing their share. Or, they are too unpleasant. Or there’s sexual or racial harassment, or bullying. There’s no single answer here, as this depends on a delicate balance of the severity and nature of the problems, the benefits of persisting in the collaboration, and the career risks of pulling out (e.g., funding lost, people in power offended, etc.). But, if you find yourself in a troubled collaboration, first talk to someone you trust for advice. If you decide you’d rather walk away from the collaboration, the best thing to do is to (1) do it sooner rather than later, (2) be firm but polite, and (3) offer a solution to the folks left behind. For instance, find someone in your field with similar expertise who is willing and able to put up with whatever drove you out, so you don’t leave a gap in the team you exit. This helps reduce any resentment that might ricochet back to you later. Then you need to negotiate whether you have done enough to retain some authorship later, and whether you want it, and how to financially extricate yourself.

Who publishes this stuff anyway? 

One of the challenges with interdisciplinary collaborations is identifying useful products of any project. For example, even if all participants are academics, the journals in which it is meaningful and useful to publish may differ among group members. The publishing and peer review norms may be at odds - in mathematics, for example, proofs are often published on personal blogs, or in small notes. There is no formal peer review for these products and citing this work in journals can be difficult. Yet the longer timeline of evolution and ecology fields can be stifling for many collaborators who are used to faster turn-around. To a computer scientist used to writing a paper a month with new software, the timeline for field work and RNA extraction and transcriptome sequencing and data filtering and read calling may seem puzzlingly slow.

Selecting the correct journal should be discussed early within collaborations, and function as a dialogue as the project progresses. Since collaborations take time, professional goals may change and the “best” journal at the end may not be the initial journal selected. Of course, one hope is that by doing interdisciplinary work, one generates more innovative scientific results that are publishable in interdisciplinary journals that transcend any one field, which can be good for everyone (if successful).

If collaborators are outside of the academy, mutually beneficial collaborations may require creating more than one product. For many agency collaborators, products like publicly available data sets, or data papers, can be a good alternative to traditional academic papers. Many agencies have specific rules on journals and data repositories, so following these procedures for data papers and data sets gives agency collaborators more control over their output. In a recent project, KelpRes, in addition to traditional peer-reviewed papers, the team also made an infographic documenting subtidal kelp forest habitat in Ireland. The types of products at one’s disposal are nearly limitless, and collaborations that go beyond the academy may open your mind to other forms of science communication.


Scope of work template

This serves as a document to resolve any future disagreements. The more detail you fill in concerning methods and data collected, the better we can avoid redundant or competing work.

Newcomers to the project should fill in a SoW for their proposed contribution, which should be reconciled with prior SoWs.

If you are doing more than one distinct project, and they require different samples, file a separate SoW for each question/topic.

Replace red text below with your own information.

Some collaborative ethics comments: 

Biting off more than you can chew is uncool; other members of the group may be counting on you to complete your SoW to get data to interpret their own work. So please propose to do only that which you honestly think you can complete.

That said, it is okay to distinguish between what you definitely will do (a commitment to other team members), versus what you aspire to do. The latter may depend on funding and time availability. But, the aspirational part is just hypothetical and so your prior claim to that work is less strongly established than the core part of your SoW. If it becomes clear you are unable to follow through (e.g., a grant isn’t funded), it behooves you to tell collaborators in case someone else can do it or can help you with funding.

Scope of Work




1. Question / Hypothesis:


3. Specific Aim # 1: 

i) Rationale/Goals BRIEF SUMMARY; Distinguish between what you are committing to do, versus what you aspire to do.

ii) Methods

Fish Sampling summary

# fish per sample

Which populations?

Sample frequency (#/yr)

Sample duration (# years)

Lethal sampling?

Preservation method

Traits measured

iii) Analysis/Interpretation BRIEF

iv) Publication plan

  1. What/when

  2. Authorship:

  3. Additional data required from others:

  4. Must wait on other papers’ prior publication?

  5. Must be published before other papers?

v) Ballpark budget. Note: enumerating these can help identify redundancy/efficiency, and it also forces each of us to be more realistic about what we can (afford to) do.

Salaries to pay personnel


# months/yr

# years



Total salary costs:


# of trips per year

# of years

# people per trip


Cost per person

Total travel costs:

Field supplies



Cost each

Total field costs:

Equipment needed



Cost each

Total equipment costs:

Laboratory costs


# samples per year

# samples total

Cost per sample 

Cost per year

Cost total

Total lab costs:

vi) Existing funds:

vii) Plan to acquire funds
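
The arithmetic behind the ballpark budget in item v) can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the function names, the monthly salary rate, the item quantities, and every dollar figure are hypothetical placeholders, not values from any actual Scope of Work.

```python
# Hypothetical ballpark-budget helper; all names and numbers are placeholders.

def salary_cost(monthly_rate, months_per_year, years):
    """Salary total = monthly rate x months per year x years."""
    return monthly_rate * months_per_year * years

def travel_cost(trips_per_year, years, people_per_trip, cost_per_person):
    """Travel total = trips/yr x years x people per trip x cost per person."""
    return trips_per_year * years * people_per_trip * cost_per_person

def item_cost(quantity, cost_each):
    """Field supplies or equipment: quantity x cost each."""
    return quantity * cost_each

def lab_cost(samples_per_year, years, cost_per_sample):
    """Lab total = samples/yr x years x cost per sample."""
    return samples_per_year * years * cost_per_sample

budget = {
    "salary": salary_cost(monthly_rate=4000, months_per_year=3, years=2),
    "travel": travel_cost(trips_per_year=2, years=2, people_per_trip=3,
                          cost_per_person=500),
    "field": item_cost(quantity=10, cost_each=25),
    "equipment": item_cost(quantity=1, cost_each=1200),
    "lab": lab_cost(samples_per_year=100, years=2, cost_per_sample=15),
}
# The sum is computed before the "total" key is added, so it covers
# only the five categories above.
budget["total"] = sum(budget.values())

for category, amount in budget.items():
    print(f"{category:>10}: {amount:>8,.0f}")
```

Writing the categories out this way makes it easy to see which line dominates the total, and to compare budgets across SoWs when looking for redundancy.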


Publication and data sharing agreement (example)

1. Preface  

We are on this project together because we know and trust and like each other. That mutual respect is the core of the collaboration and is essential to the project’s success. In light of this mutual trust, a set of written agreements may feel awkward and overly formal. It may appear to imply a lack of trust. We think of these agreements more as a form of open and explicit communication, where we lay out our respective aims, aspirations, rights and responsibilities clearly to each other. In so doing we help avoid miscommunications and misunderstandings that can be the downfall of large collaborative projects.



2.1 Fundamental Commitment of participating researchers: Our project is an inclusive network of researchers who agree to these basic ground rules: (1) to collect data accurately and in a timely fashion, (2) to ensure the data are collected as laid out in the Project’s protocols and Scopes of Work, (3) to openly share data associated with the project, and (4) to publish high-quality collaborative papers.

2.2 Administration: The project is currently administered by a committee chaired by ____ with _____ (___ team representative), ____ (_____ team representatives), and _ (____ team representative), with input from all PIs with approved Scopes of Work.

3. Code of Conduct

3.1 Data collection: All participants agree to collect data accurately, following WGEE protocols to ensure that resulting data can be merged among labs. If you wish to collect additional data, that is fine; please add it to your Scope of Work (see below) along with a clear protocol. If you wish to suggest changes or improvements, please feel free to contact the project coordinators. Data should be collected in a timely manner so that other researchers relying on your data products can proceed with their own work.

3.2 Data storage: Data collected for the project will be archived as hard copy paper or as original files (e.g., photographs) to guard against file corruption. Data files will be stored in backed-up and version-controlled systems such as GitHub, and shared with group members. Data storage will follow best practices (e.g., https://library.stanford.edu/blogs/stanford-libraries-blog/2020/04/ten-tips-better-data-while-you-shelter-place).

3.3 Data use: Data will be available to registered project members once reviewed and uploaded to the archive by the project managers.  Data will be made publicly available upon publication. Requests for access to data prior to their public release will be reviewed by the coordinators and the relevant data collectors to ensure the proposed use does not conflict with ongoing analyses, publications, or proposals as laid out in Scopes of Work (below).

3.4 Data citation: Any participant is free to use WGEE data for publications, courses, presentations, etc., if they follow the guidelines for co-authorship listed below and if their publication does not undermine another participant’s publication goals as laid out in the Scopes of Work.

4.  Scopes of Work

4.1 The goal of the Scope of Work is to clearly define each person’s scientific and logistical role in the project. This is a document where you indicate what samples you will collect, what data you intend to generate, what paper(s) you intend to publish, and how these goals will intersect with other people’s. It is wise to include specific biological questions, and specific publication plans where feasible. This description, once reviewed by the rest of the group, is a time-stamped reflection of your plans, which represents both a set of rights (nobody else should publish the papers you plan on doing) and a set of responsibilities (you should, to the best of your ability, follow through on your SoW).

It is important that you keep this reasonably achievable; if any one person claims more than they are able to deliver then this creates a gap that undercuts the overall team. If you fail to deliver on a dataset you planned to generate, after a reasonable time (TBD) it is possible for the group as a whole to consider assigning this to another participant. This will only be done in exceptional circumstances, when the group determines that the data are necessary for the collective goal, and you have not made a good faith effort to make sufficient and timely progress towards generating the data. 

4.2 You may revise the SoW as project goals evolve, but please do so using track changes / comments at first, so relevant team members can review and comment on your updates to the SoW. When you make changes, please notify the overall group and save the original to an archive folder. Team members will be asked to update their scopes of work at least annually, or as changes arise.

4.3 New team members will be expected to submit SoWs of their own, which need to be approved by the group, with a particular eye to ensure we are not generating excessive overlap with existing SoW plans by someone else on the team. The exception is if someone with an existing aim is willing to remove that from their own SoW.

4.4 The SoWs are kept here: 

These include:

PI Topic with link to SoW

5. Data sharing and publication standards


Data Sharing and Publication Agreement: Many of our projects are going to yield papers that use multiple lines of data generated by different labs. The goal of a Data Sharing & Publication Agreement is to formally define expectations for who can publish what data, when, and who has a right to authorship. These are sometimes used by Editors to adjudicate publication disagreements, so having one worked out and digitally signed in advance, and archived, can prove valuable later, though we all hope this proves unnecessary.  Participation in WGEE and use of WGEE data implies a willingness to abide by the following norms.

5.1 Co-authorship of articles using unpublished data: If you contribute data to the WGEE Project, you will automatically be included as a co-author on any papers using your data until such time as the data are made publicly available. We encourage true collaboration: authors should actively seek engagement, from the start of the project, from those who collected the data they are using, and data collectors should provide feedback and insights. Final authorship assignment is the responsibility of the lead author. Of course, you may opt out of co-authorship at any time by informing the lead author.

Authorship order: The first author will typically be the researcher who has done the most to analyze the data and write the text of the manuscript. The last authors will typically be PIs who have been involved in conceiving and overseeing the overall project. Middle authors will be people who contributed minority shares of time to data collection, analysis, or writing. Within these three categories, the order will be determined by discussion in advance of submission. Disagreements will be adjudicated by the group of PIs. Where people have equal claim to authorship roles, the order may be determined by (1) randomization, (2) a staring contest, or (3) another fair procedure agreed upon in advance by all parties.
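
For option (1), randomization, one transparent approach is to shuffle the tied names with a seed that everyone agrees on beforehand, so that anyone can rerun the draw and check the result. The sketch below is only an illustration: the function name, the author labels, and the seed value are hypothetical, and the use of an agreed seed is our assumption rather than part of the agreement.

```python
import random

def randomized_order(authors, seed):
    """Return a shuffled copy of authors; a shared seed makes the draw repeatable."""
    rng = random.Random(seed)   # independent generator, does not touch global state
    shuffled = list(authors)    # copy, so the original list is left unchanged
    rng.shuffle(shuffled)
    return shuffled

# Hypothetical example: three authors with equal claim to the same position.
tied_authors = ["Author A", "Author B", "Author C"]
order = randomized_order(tied_authors, seed=20221227)
print(order)
```

Agreeing on the seed before looking at the outcome matters here; choosing it afterwards would let someone rerun the draw until they liked the order.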

5.2 Co-authorship of articles using publicly available data: We strongly encourage authors who re-use archived data sets to include as fully engaged collaborators the researchers who originally collected them. We believe this could greatly enhance the quality and impact of the resulting research because it draws on the insights of those immersed in a particular field.

5.3 Informing participants of intent to write an article: If you wish to write a paper using Gatekeeper data, a working title and abstract should be submitted to the Project Coordinators prior to drafting the manuscript. This will be shared with other project participants for review. Individuals interested in becoming contributing authors of the proposed paper must contact the lead author directly. As stated above, we strongly encourage active engagement of project participants.

5.5 The Scopes of Work define expected publications, and the PI responsible for a given publication topic has the right of first refusal to write the relevant paper. They may designate a student or other lab member to take on the data collection, analysis, and authorship tasks. They may waive that right so that a member of another lab can take on that topic, or a closely related one. If a manuscript is proposed and subsequently abandoned for more than 6 months, interested Gatekeeper participants are encouraged to discuss taking over the development of the manuscript with the lead author.

5.6 Each data file will have a Readme.txt file linked to it, with relevant metadata, including who collected and curated the data (“data owners”). The data owners have the right to be included as a co-author on any paper(s) using that data file. This right may be waived after the data file becomes public (e.g., after first posting on a public repository associated with a publication), but it will remain preferable to include the data owners as co-authors when the data they are responsible for represent a significant contribution to the results of a given manuscript.

5.7  Co-authors are expected to give constructive and thorough comments on analyses, interpretation, and writing.  This is especially pertinent after a dataset is first published (e.g., for later papers, generating already-public data is a weaker claim to authorship that should be bolstered by active intellectual participation in the writing).

5.8 Co-authors must be given reasonable time to read and comment on a paper before it is submitted for review, and it should not be submitted to a journal without their consent. If a co-author fails to respond with comments within a reasonable time frame (to be agreed upon during manuscript preparation), the remaining co-authors may (depending on the context) remove the non-contributing co-author from the authorship list. Or, if (for instance) they contributed extensively to data collection, the manuscript’s author contribution statement may indicate that the person in question only contributed to data collection.

5.9 Authors on papers should be able to explain the main findings and how they were arrived at, and vouch for the results insofar as they contributed to them via (a) conceiving of the research question(s), (b) sample acquisition, (c) data generation, (d) data analysis and interpretation, (e) writing. We will use Author Contribution Statements on articles to clearly identify participants’ roles without overstating: when an author is listed as contributing to a particular task in generating a paper, they are vouching for and responsible for the accuracy of that element of the paper.

5.10 Data will not be shared with outside parties without the full group’s consent, and especially not without consent of the data owner(s).

6. Disputes: Anyone who is not listed as a co-author but believes they have a right to be one may appeal to the group of PIs, or to the journal Editor in question. Participation in the project implies consent to abide by the above rules and to add co-authors as expected, but also to contribute to manuscripts to earn and retain authorship rights.

7. The entire collaborative team operates on a position of trust that (a) the other team members are respectful of their expertise, contributions, and career goals, and (b) data contributed by team members are accurate and correct.


By typing your name below, you indicate that you have read the above document and relevant Scopes of Work, and agree to abide by the terms of this document.

Name: Date read
