Tuesday, August 25, 2015

High enthusiasm and low r-squared.

I am copying my title from a review by David Houle in the journal Evolution that savaged the 1998 book Asymmetry, Developmental Stability, and Evolution. Among many criticisms in the review was the general point that a lot of attention was being directed toward research on asymmetry when, in reality, it explained relatively little of the variation in ecological and evolutionary processes. The goal of this post is to suggest some other research areas of high enthusiasm whose low explanatory power suggests they are perhaps over-represented and unduly-lauded by granting agencies, by weekly scientific magazines, by the general public, and by many students.

“Negative reviews often give a frisson of pleasure to the reader” was the line that headed the last paragraph of Houle’s review. Although this post might be interpreted as a negative indictment of several lines of research, I will try to be sufficiently polite that any frisson derived therefrom is only minor. The key point to remember is that I am not suggesting these research areas are not useful or interesting, merely that their exaggerated popularity might belie their ultimate utility. Of course, if you actually work on these topics, please do not think I am criticizing the value of your work.

1. The tyranny of the gene.

Much attention is directed these days toward finding the “gene for” this or that. Indeed, papers that identify particular genes and confirm their function tend to get a lot of attention at conferences, in the weekly publications, and (partly for these reasons) in granting panels. You can see the appeal – such work gets right down to the specific part of the genome that is having a particular effect on the phenotype. However, the amount of time and effort put into trying to find the “gene for” something is often not worth the insight gained, as has recently been pointed out by a number of authors, including Rockman (2011) and Rausher and Delph (2015). More importantly, such efforts will typically be futile given that nearly all adaptation is the result of many genes of modest-to-small effect. That is, the well-characterized and clearly-important genes (e.g., EDA, PITX1, MC1R, and so on) are actually exceptions, the focus on which can distract us from the vast majority of the variance that needs to be explained. I have previously summarized some of my main arguments on this topic (Hendry 2013) and I summarize them briefly here.

a) Current genomic methods are strongly biased toward the detection of large-effect genes. In particular, investigators tend to search around for some gene that is explaining a lot of something and then they focus subsequent efforts in that direction. The reader of published work on these genes often doesn’t realize that the incredibly strong ascertainment bias means that the elegant examples are really exceptions to the general rule that genes of large effect are very rare.  

b) Even large-effect genes usually explain only a small fraction of the variation in a trait. Sure, some of those genes (e.g., EDA) do explain a lot of the variation, but most other “large effect” genes explain much less – often less than 15%. That is, the majority of trait variation typically remains unexplained even by large-effect genes.

Just one example of how genes of large effect explain less than half the variance in traits. This figure is from Hendry (2013) based on data from Rogers et al. (2012).

c) Most studies focus on traits rather than adaptation per se. Yet adaptation to a given environment is the result of many traits, each of which is potentially influenced by many genes. Indeed, objective genome scans often recover hundreds to thousands of genes or genomic regions contributing to adaptation in different environments. As a result, the variation in ADAPTATION (fitness) explained by any given gene – even the large-effect ones noted above – is typically vanishingly small.

(Of course, if someone is willing to search for – and, better yet, find – some genes of massive effect in my study populations, I promise to sing a different song as we publish your findings in Nature/Science.)

2. Parallel evolution is not very parallel.

A very large number of papers claim to demonstrate parallel evolution at the phenotypic level – that is, similar phenotypes evolve in similar habitats. I am continually irked, however, by the fact that such inferences often stem simply from a statistically-significant main effect for the “habitat” term (lake vs. stream, dry vs. wet, high-elevation vs. low-elevation, high-predation vs. low-predation, cave vs. surface, etc.) in a statistical model. Yet when one looks closely at the population means, one almost invariably finds that some populations in a given habitat diverge in the opposite direction. This means that, in most cases, evolution is a mix of both parallel and non-parallel components. We (Krista Oke, Caroline Leblond, and I) have performed a meta-analysis of more than 100 papers focusing on fishes and found that the amount of variation explained by the habitat term ranges from very low to very high, which means that the parallelness of evolution is equally variable. Instead of testing for – and asserting – parallel evolution, authors should QUANTIFY precisely how parallel and non-parallel different aspects of evolution are. Parallel evolution isn’t an either-or situation; it is instead a continuum, and we need to know where a given system falls on that continuum. Moreover, I would argue that the non-parallel aspects of evolution are often more interesting than the parallel aspects, because we are poised to learn something new rather than just confirming something we already suspected.
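
One way to make "quantify, don't just test" concrete is to report the share of among-population variation that the habitat term accounts for. The sketch below is only an illustration of that idea, not the method used in the meta-analysis: the function name `habitat_eta_squared` and all of the population means are hypothetical, and it uses a simple one-way partition of sums of squares on population means.

```python
def habitat_eta_squared(means_by_habitat):
    """Share of among-population variation in trait means explained by habitat.

    means_by_habitat maps a habitat label to a list of population mean
    trait values (hypothetical data in this sketch).
    """
    all_means = [m for means in means_by_habitat.values() for m in means]
    grand_mean = sum(all_means) / len(all_means)
    ss_total = sum((m - grand_mean) ** 2 for m in all_means)
    ss_habitat = 0.0
    for means in means_by_habitat.values():
        habitat_mean = sum(means) / len(means)
        ss_habitat += len(means) * (habitat_mean - grand_mean) ** 2
    return ss_habitat / ss_total


# Hypothetical system in which every lake population differs from every stream population.
mostly_parallel = {"lake": [10.0, 11.0, 10.5], "stream": [14.0, 15.0, 14.5]}

# Hypothetical system in which some populations diverge in the opposite direction.
mixed = {"lake": [10.0, 14.0, 11.0], "stream": [13.0, 10.5, 15.0]}

print(round(habitat_eta_squared(mostly_parallel), 2))
print(round(habitat_eta_squared(mixed), 2))
```

Both invented systems could plausibly be written up as "parallel evolution" if only the direction of the habitat effect were reported, but the proportion of variation tied to habitat places them at very different points on the continuum.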

Studies of parallel evolution in fishes range from near-perfect parallelism (near 1) to near-perfect non-parallelism (near 0). These data are from an early version of the analysis conducted by Oke, Leblond, and Hendry. For details, contact A Hendry.

3. Animal personalities and behavioral syndromes.

Behavioral ecology has recently been seduced by the idea that individual animals have “personalities” manifest as correlated behavior across contexts/situations, such as being bold (as opposed to shy) in the presence of both predators and mates. A related idea is that behavioral traits are often correlated with each other, such that some individuals are bold/social whereas others are shy/non-social. These twin concepts have swept through the field and now seem to be foremost in the minds of many students. My reading of the literature, however, is that the variance explained by these phenomena is typically very low. For instance, investigators who find a significant repeatability of individual behavior through time (or across contexts) will often conclude that this variation indicates personality. However, these significant p values are often associated with very small effect sizes. In fact, in nearly all cases, data on the behavior of a set of individuals at one time point (or context) would allow you to predict less than half of the variation in behavior for those same individuals at another time point (or context). I am not saying that personalities and syndromes are not real or interesting, merely that their near-monopoly on modern behavioral ecology is not necessarily deserved given their limited explanatory power.
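
The "less than half" statement follows from squaring a correlation: if behavior at one time point correlates with behavior at another at r, then r squared is the share of variance in one that the other predicts. A minimal sketch of that arithmetic, assuming a simple linear relationship between the two measurements (the function name `variance_explained` is just for illustration):

```python
import math


def variance_explained(r):
    """Proportion of variance in one measurement predicted by the other, given correlation r."""
    return r ** 2


# Correlation needed before one measurement predicts at least half the variance in the other.
threshold = math.sqrt(0.5)

for r in (0.3, 0.5, 0.7, 0.9):
    print(f"r = {r:.1f} -> variance explained = {variance_explained(r):.2f}")
print(f"half the variance requires r >= {threshold:.3f}")
```

So a correlation that sounds respectable, say 0.5, leaves three quarters of the variance unpredicted.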

Any value less than 0.7 means that less than half of the variance in the behavior at one time can be explained by variance in behavior at another time. From Hendry (2015) based on data from Bell et al. (2009).

(Of course, I happen to be engaged in several exciting studies of animal personalities/syndromes and their importance in natural populations.)

4. Biodiversity and ecosystem function.

Hugely important and influential in ecology is the relationship between biodiversity (species diversity, phylogenetic diversity, functional diversity) and ecosystem function (e.g., productivity, nutrient cycling, decomposition, etc.). These relationships are important because they help to provide a justification for preserving biodiversity per se given that they suggest "more is better" independent of the specific species. (Otherwise we could simply assemble our ideal set of species that fill the roles we need and forget about the rest.) Partly due to the applied nature of this inference, the defense of biodiversity and its preservation has shifted in many circles to arguments surrounding its role in “ecosystem services.” And yet the variance in ecosystem function explained by biodiversity is often quite small, with a typical value being about 20%. This means that, while biodiversity does often correlate with ecosystem function, something else must be much more important. This realization means that arguments for the preservation of biodiversity can’t devolve simply to arguments about ecosystem function – they must instead continue to tout the many other benefits of biodiversity.

Some typically messy biodiversity-ecosystem function relationships (from Loreau et al. 2001).


In addition to all of the above, the effect size (e.g., r-squared) for a given term in a given statistical model will often be overestimated. First, even two variables with no true correlation will return a non-zero r-squared simply by chance – and the magnitude of this bias will increase with decreasing sample size. As a result, estimates of weak effects are inflated by error (for an illustration see Jiang et al. 2013). Second, and related to the first point, most analyses fail to account for error in the estimation of the individual data points that make up the regression (that is, they do not appropriately propagate the error). As a result, confidence in the estimated r-squared is greater than it should be (e.g., Morrissey and Hadfield 2012). Third, correlation does not equal causation and so the causal part of a given correlation might be much lower than the estimated effect size. See my earlier post about “Faith's Conjecture”, which states that “any correlation from which a causal relationship might be inferred (the thing on the x axis influences the thing on the y axis) can be inverted (the things on the x and y are switched) to lead to a new causal inference”. Fourth, many studies show that effect sizes decrease through time as studies are replicated, which is called – among other things – “regression toward the mean”.
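
The first point is easy to check by simulation. The sketch below is a minimal illustration under stated assumptions, not a reproduction of any cited analysis: it draws pairs of independent standard-normal variables (so the true correlation is zero), computes r-squared for each pair, and averages over many replicates. The function names, the number of replicates, and the seed are arbitrary choices. For independent normal data the expected r-squared is 1/(n - 1), which the printout shows alongside the simulated value.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def mean_null_r_squared(n, reps=2000, seed=1):
    """Average r-squared over `reps` pairs of independent normal samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        total += r_squared(x, y)
    return total / reps


if __name__ == "__main__":
    for n in (5, 10, 20, 50, 100):
        simulated = mean_null_r_squared(n)
        print(f"n = {n:3d}  mean r-squared = {simulated:.3f}  theory 1/(n-1) = {1 / (n - 1):.3f}")
```

With only a handful of data points, pure noise is expected to "explain" a share of the variance that is comparable to many published effect sizes, which is why small-sample estimates of weak effects deserve extra caution.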

On a positive note.

This post might be interpreted in the depressing sense that we can’t explain much in ecology or evolution and that the cherished relationships on which we focus so intensively and that we tout so loudly are really a waste of time. However, I would instead like to end with a more positive message.

In general, it seems that the variance explained by a given factor in ecology is often quite small – about 2.5-5.4% – as revealed by Moller and Jennions’ (2002) meta-analysis of meta-analyses. Peck et al. (2003) later argued that the more important question was “How much variation is NOT explained” and came up with an answer of “roughly half.” Together, these two analyses suggest that ecological and evolutionary variation is multifactorial and that, if we are to explain much of the variation, we need to look beyond single causation. Stated another way, we shouldn’t focus so much effort on single explanatory factors that explain relatively little of the total variance but should instead embrace multifactorial causality that can explain much more.

From Moller and Jennions (2002).

Reflecting on the above four research areas for which proponents show high enthusiasm but the data indicate low explanatory power, I am reminded of David Queller’s (1995) response to Gould and Lewontin’s anti-adaptationist rhetoric: “That’s well said, but let’s get back to our field work.”


  1. [We have a comment-length limitation, so I'm going to post this in paragraphs.] Hi Andrew. You write: "More importantly, such efforts will typically be futile given that nearly all adaptation is the result of many genes of modest-to-small effect." I find it difficult to really pin down exactly what this means. I have a genetic make-up that allows me to live on land, walk, and breathe air. That adaptation is governed by a great many specific alleles; indeed, it would probably be true to say that most or all of my genome is devoted to being well-adapted to that niche. Are those alleles of large or small effect? If I had alleles that adapted me to a life of swimming in the ocean instead – as my fishy ancestors once did – I would die if I were placed on dry land. Give me any sizeable subset of fishy traits, and I would also die – substitute gills in place of my lungs, while keeping everything else about me human, for example, and I still have a fitness of zero on dry land. Likewise if you keep me human but give me the eyes of a fish, or give me the tail of a fish instead of legs, or any of a huge number of other possible modifications. Even when you get down to the level of individual genes, probably many of them would be of "large effect" in that sense; substituting the ancestral allele for many of my genes would probably have a lethal, or at least highly detrimental, effect. Studies show that about 15% of gene knockouts in mice are developmentally lethal; many more have large detrimental effects. To the extent that fishy alleles are so different from human alleles that they would approximate knockout in many cases, the fishy substitutions would often be very negative (and if fishy alleles aren't extreme enough, I can just extend the thought experiment further, back to the first vertebrate, or the first animal, or the first eukaryote). So most of the genes in my genome appear to be of "large effect" in my adaptation to my environment, in this sense. 
Where is this argument wrong?

  2. Well, it seems like when people talk about genes being of "small" or "large" effect they really mean the effect size on a "trait". Fur color, for example, might be influenced by many genes, each of which changes the fur color only a little, or it might be influenced by one or a few genes, each of which changes the fur color a lot. But here, too, the terms of the argument seem too vague to really mean anything. What is a "trait"? Any phenotypic characteristic could be considered to be a "trait", and any given trait can be subdivided into sub-traits, until you get down to the level of the individual protein products of individual genes. At the level of individual protein products of genes, genes are obviously of "large effect", by definition; if you knock out a gene, its protein product is no longer produced. As you move up the level of biochemical complexity, individual genes will seem to have smaller and smaller effects. So whether "traits" are generally influenced by few genes of large effect or many genes of small effect would seem to depend mostly on what sort of "traits" you choose to look at and how close to the level of individual protein products you are. The choice of what "traits" are interesting is, it seems to me, entirely subjective; nothing in nature tells us whether the right "trait" to think about is overall height, or leg length, or femur length, or bone growth rate in a particular zone of the femur, or level of HGH in the blood, or any of the other levels of subdivision of related "traits" one could consider. Depending on which level you choose, you will find that genes have a larger or smaller effect size. Some "traits" that are very close to the level of the individual protein products of genes nevertheless happen to be very macroscopically visible – fur color is an example where people often say that it has been shown to be governed by few genes of large effect (at least in the systems where the genetics have been explored, such as mice).
But who says that "fur color" is the real "trait"? Perhaps the real trait is the production of a specific type of melanin in a specific type of cell; if so, the trait will be governed by even fewer genes of even larger effect, no? Or perhaps the real trait is "crypsis", of which fur color is only one small component; other components would be habitat choice, hiding behavior, nocturnality vs. diurnality, etc. If you consider crypsis to be the "trait" then suddenly that large-effect gene on fur color is a small-effect gene on crypsis. So what? Since the choice of what trait to focus on is subjective, the argument over small or large effect seems meaningless. The whole thing boils down to the semantics of what one means by the word "trait" and what level of biochemical complexity you focus on.

  3. This line of argument is often side-stepped by talking about the effect size of genes on overall fitness, rather than on individual "traits". But I don't think that actually helps very much, because then the "effect size" depends entirely on what environments you are discussing and what alleles you are discussing. Let's go back to my example above of me as a human versus me with single genes changed back to the fishy alleles of my ancestral lineage. Changing individual genes in that way would often appear to be of large effect, as I argued above. Why? For two reasons. (1) The difference between living on land versus living in the ocean is large. The larger the environmental contrast, the larger the effect size will appear to be for alleles that affect adaptation to one environment versus the other. (2) The difference between the human allele and the ancestral fishy allele will often be large. The larger the functional difference between the proteins coded by different alleles, the larger the effect size will appear to be – but that is an effect of the particular alleles you are comparing, not of the gene. For both of these reasons, the human-vs-fish comparison makes it seem that many genes have a very large effect; you could do the same comparison for human-vs-chimp and you would then think that the same genes were often of small effect. You're not really saying anything about the genes per se; you're saying something about the degree of environmental contrast that you're examining, and about the degree of functional difference between the allelic variants that you happen to be comparing.

  4. OK, this rant has gone on long enough, and I think I have still only scratched the surface of what I'm trying to say. This is really a topic for a whole evening of discussion over beer – or perhaps a whole symposium or conference or book. I nevertheless hope that the above is not complete gibberish. Suffice to say that I'd really like to see a discussion of the effect sizes of genes that did proper justice to the definitional and semantic aspects of the question. Without that, the two sides will continue to just talk past each other, I think, and both sides will continue to be both right and wrong, depending on one's point of view.

  5. Thanks for an interesting post.

    For point 4, you write that "... the variance in ecosystem function explained by biodiversity is often quite small, with a typical value being about 20%."

    This value might well be true, and it would be interesting to know where the 20% figure comes from? Is it from some formal meta-analyses, or is it more qualitative-conclusion-based?

    Best wishes,


  6. Thanks for these comments Ben - I look forward to replying more in the near future.

  7. Great post, Andrew. I agree with pretty much all of this. I have the great fortune to have published on only one of your examples: fluctuating asymmetry, and only once. I will claim (and of course you will believe me) that I dropped that because I cleverly realized the truth of the argument about it :-)

    I think what these things have in common is that they are such appealing ideas that we think (at least, I think) they just ought to be true. And they probably are; but empirically, that doesn't necessarily make them important.

