I am copying my title from a review by David Houle in the
journal Evolution that savaged the 1998
book Asymmetry, Developmental Stability,
and Evolution. Among many criticisms in the review was the general point that
a lot of attention was being directed toward research on asymmetry when, in
reality, it explained relatively little of the variation in
ecological and evolutionary processes. The goal of this post is to suggest some
other research areas of high enthusiasm whose low explanatory power suggests
they are perhaps over-represented and unduly-lauded by granting agencies, by weekly
scientific magazines, by the general public, and by many students.
“Negative reviews often give a frisson of pleasure to the
reader” was the line that headed the last paragraph of Houle’s review. Although
this post might be interpreted as a negative indictment of several lines of research, I will
try to be sufficiently polite that any frisson derived therefrom is only minor. The key
point to remember is that I am not suggesting these research areas are not
useful or interesting, merely that their exaggerated popularity might belie
their ultimate utility. Of course, if you actually work on these topics, please
do not think I am criticizing the value of your work.
1. The tyranny of the gene.
Much attention is directed these days toward finding the
“gene for” this or that. Indeed, papers that identify particular genes and
confirm their function tend to get a lot of attention at conferences, in the weekly publications, and (partly for these reasons) in granting panels. You can see the appeal – such work gets right down to the specific part
of the genome that is having a particular effect on the phenotype. However,
the amount of time and effort put into trying to find the “gene for” something
is often not worth the insight gained, as has recently been pointed out by a number
of authors, including Rockman (2011)
and Rausher and Delph (2015).
More importantly, such efforts will typically be futile given that nearly all
adaptation is the result of many genes of modest-to-small effect. That is, the well-characterized and clearly-important genes (e.g., EDA,
PITX1, MC1R, and so on) are actually exceptions, the focus on which can distract us from
the vast majority of the variance that needs to be explained. I have previously
summarized some of my main arguments on this topic (Hendry
2013) and I summarize them briefly here.
a) Current genomic methods are strongly biased toward the
detection of large-effect genes. In particular, investigators tend to search around for
some gene that is explaining a lot of something and then they focus subsequent
efforts in that direction. The reader of published work on these genes often doesn’t realize that
the incredibly strong ascertainment bias means that the elegant examples are
really exceptions to the general rule that genes of large effect are very rare.
b) Even
large-effect genes usually explain only a small fraction of the variation in a
trait. Sure some of those genes (e.g., EDA) do explain a lot of the variation but
most other “large effect” genes explain much less – often less than 15%. That
is, the majority of trait variation typically remains unexplained even by
large-effect genes.
Just one example of how genes of large effect explain less than half the variance in traits. This figure is from Hendry (2013) based on data from Rogers et al. (2012).
c) Most
studies focus on traits rather than adaptation per se. Yet adaptation to a given
environment is the result of many traits, each of which is potentially
influenced by many genes. Indeed, objective genome scans often recover hundreds
to thousands of genes or genomic regions contributing to adaptation in different environments. As a result, the variation in ADAPTATION (fitness)
explained by any given gene – even the large-effect ones noted above – is typically vanishingly small.
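To make the arithmetic behind point (c) concrete, here is a minimal sketch. It assumes the simplest possible causal chain (gene → trait → fitness, all linear, with no other path from gene to fitness), in which case the fractions of variance explained multiply. The function name and the numbers are hypothetical illustrations, not estimates from any study.

```python
def variance_explained_through_chain(r2_gene_trait, r2_trait_fitness):
    """Fraction of fitness variance attributable to one gene via one trait.

    Assumes a strictly linear chain gene -> trait -> fitness with no other
    path, so correlations multiply along the chain and so do their squares.
    """
    return r2_gene_trait * r2_trait_fitness


# Hypothetical values: a "large-effect" gene explaining 15% of a trait,
# and that trait explaining 10% of the variation in fitness.
gene_on_trait = 0.15
trait_on_fitness = 0.10
gene_on_fitness = variance_explained_through_chain(gene_on_trait, trait_on_fitness)
print(f"{gene_on_fitness:.3f}")  # 0.015
```

Under these assumed values the gene accounts for only about 1.5% of the variation in fitness, even though it would count as a large-effect gene for the trait.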
(Of course if someone
is willing to search for – and, better yet find – some genes of massive effect
in my study populations, I promise to sing a different song as we publish
your findings in Nature/Science.)
2. Parallel evolution is
not very parallel.
A very large number of papers claim to demonstrate parallel
evolution at the phenotypic level – that is, similar phenotypes evolve in
similar habitats. I am continually irked, however, by the fact that such
inferences often stem simply from a statistically-significant main effect for the “habitat” term (lake vs. stream, dry vs. wet,
high-elevation vs. low-elevation, high-predation vs. low-predation, cave vs.
surface, etc., etc.) in a statistical model. However, when one looks closely at the population means,
one almost invariably finds that some populations in a given habitat diverge in the
opposite direction. This means that, in most cases, evolution is a mix of both
parallel and non-parallel components. We (Krista Oke, Caroline Leblond, and
myself) have performed a meta-analysis of more than 100 papers focusing on
fishes and found that the amount of variation explained by the habitat term ranges
from very low to very high, which means that the parallelness of evolution is
equally variable. Instead of testing for – and asserting – parallel evolution,
authors should QUANTIFY precisely how parallel and non-parallel different aspects
of evolution are. Parallel evolution isn’t an either-or situation; it is
instead a continuum, and we need to know where a given system falls on that
continuum. Moreover, I would argue that the non-parallel aspects of evolution
are often more interesting than the parallel aspects, because we are poised to learn something new rather than just confirming something
we already suspected.
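One simple way to put a number on "how parallel" is the share of the variation among population means that the habitat term accounts for. The sketch below is only an illustration of that idea with made-up population means; it is not the method used in the meta-analysis mentioned above, and the function name and data are hypothetical.

```python
from statistics import mean


def habitat_variance_explained(pop_means_by_habitat):
    """Proportion of variation among population means explained by habitat.

    This is a one-way eta-squared: SS(habitat) / SS(total), computed on
    population means. Values near 1 indicate highly parallel divergence;
    values near 0 indicate mostly non-parallel divergence.
    """
    all_means = [m for means in pop_means_by_habitat.values() for m in means]
    grand = mean(all_means)
    ss_total = sum((m - grand) ** 2 for m in all_means)
    ss_habitat = sum(
        len(means) * (mean(means) - grand) ** 2
        for means in pop_means_by_habitat.values()
    )
    return ss_habitat / ss_total


# Hypothetical trait means for four lake and four stream populations.
strongly_parallel = {"lake": [10, 11, 12, 11], "stream": [15, 16, 14, 15]}
# Same idea, but one population per habitat diverges in the "wrong" direction.
mixed = {"lake": [10, 11, 12, 15], "stream": [14, 15, 16, 11]}

print(f"{habitat_variance_explained(strongly_parallel):.2f}")  # 0.89
print(f"{habitat_variance_explained(mixed):.2f}")  # 0.22
```

With only eight populations, either data set could plausibly return a "significant" habitat effect given enough individuals per population, yet the two describe very different positions on the continuum.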
3. Animal personalities
and behavioral syndromes.
Behavioral ecology has recently been seduced by the
idea that individual animals have “personalities” manifest as correlated
behavior across contexts/situations, such as being bold (as opposed to shy)
in the presence of both predators and mates. A related idea is that behavioral
traits are often correlated with each other, such that some
individuals are bold/social whereas others are shy/non-social. These twin concepts
have swept through the field and now seem to be foremost in the minds of many
students. My reading of the literature, however, is that the variance explained
by these phenomena is typically very low. For instance, investigators who find
a significant repeatability of individual behavior through time (or across
contexts) will often conclude that this variation indicates personality.
However, these significant p values are often associated with very small
effect sizes. In fact, in nearly all cases, data on the behavior of a set of
individuals at one time point (or context) would allow you to predict less than
half of the variation in behavior for those same individuals at another time
point (or context). I am not saying that personalities and syndromes are not
real or interesting, merely that their near-monopoly on modern behavioral ecology
is not necessarily deserved given their limited explanatory power.
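The gap between a significant p value and a useful effect size is easy to illustrate. The sketch below uses a plain Pearson correlation between two time points as a stand-in for repeatability (real studies often use an intraclass correlation instead), and the values of r and n are hypothetical.

```python
import math


def t_statistic_for_correlation(r, n):
    """t statistic for testing a Pearson correlation r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical study: 200 individuals measured twice, correlation of 0.2.
r, n = 0.2, 200
t = t_statistic_for_correlation(r, n)
print(f"t = {t:.2f}, r^2 = {r ** 2:.2f}")  # t = 2.87, r^2 = 0.04
```

A t of about 2.87 comfortably exceeds the two-sided 5% critical value for 198 degrees of freedom (about 1.97), so the result would be reported as significant, yet behavior at one time point predicts only about 4% of the variation at the other.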
(Of course, I happen
to be engaged in several exciting studies of animal personalities/syndromes and their importance
in natural populations.)
4. Biodiversity and
ecosystem function.
Hugely important and influential in ecology is the relationship between
biodiversity (species diversity, phylogenetic diversity, functional diversity)
and ecosystem function (e.g., productivity, nutrient cycling, decomposition,
etc.). These relationships are important because they help to provide a
justification for preserving biodiversity per se given that they suggest "more
is better" independent of the specific species. (Otherwise we could simply
assemble our ideal set of species that fill the roles we need and forget about
the rest.) Partly due to the applied nature of this inference, the defense of
biodiversity and its preservation has shifted in many circles to arguments
surrounding its role in “ecosystem services.” And yet the variance in ecosystem function explained by biodiversity is
often quite small, with a typical value being about 20%.
This means that, while biodiversity does often correlate with ecosystem function,
something else must be much more important. This realization means that arguments for
the preservation of biodiversity can’t devolve simply to arguments about
ecosystem function – they must instead continue to tout the many other benefits
of biodiversity.
Some typically messy biodiversity-ecosystem function relationships (from Loreau et al. 2001).
Moreover
In addition to all of the above, the effect size (e.g.,
r-squared) for a given term in a given statistical model will often be overestimated.
First, even two variables with no true correlation will return a non-zero r-squared simply by chance – and the magnitude of this bias will increase with decreasing sample size. As a result, estimates of weak effects are inflated by error (for
an illustration see Jiang et al. 2013). Second, and
related to the first point, most analyses fail to account for error in the estimation
of the individual data points that make up the regression (that is, they do not
appropriately propagate the error). As a result, confidence in the estimated
r-squared is greater than it should be (e.g., Morrissey and Hadfield 2012).
Third, correlation does not equal causation and so the causal part of a given
correlation might be much lower than the estimated effect size. See my earlier post about “Faith’s Conjecture”, which states that “any
correlation from which a causal relationship might be inferred (the thing on
the x axis influences the thing on the y axis) can be inverted (the things on
the x and y are switched) to lead to a new causal inference”.
Fourth, many studies show that effect sizes decrease through time as studies
are replicated, which is called – among other things – “regression toward
the mean”.
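The first of these points can be checked with a quick simulation. The sketch below draws pairs of completely independent variables and averages the resulting r-squared; it assumes normally distributed data, for which the expected r-squared under no true correlation is 1/(n − 1). The function names, trial count, and seed are arbitrary choices for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_null_r_squared(n, trials=2000, seed=1):
    """Average r-squared between two independent normal samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        total += pearson_r(x, y) ** 2
    return total / trials


for n in (5, 20, 50):
    # Compare the simulated average with the theoretical value 1 / (n - 1).
    print(n, round(mean_null_r_squared(n), 3), round(1 / (n - 1), 3))
```

With five data points, variables that are truly unrelated will "explain" about a quarter of each other's variance on average; with fifty, about 2%.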
On a positive
note.
This post might be interpreted in the depressing sense that
we can’t explain much in ecology or evolution and that the cherished
relationships on which we focus so intensively and that we tout so loudly are really a
waste of time. However, I would instead like to end with a more positive
message.
In general, it seems that the variance explained by a given factor in ecology is often quite small – about 2.5–5.4% – as revealed by Moller and Jennions’ (2002) meta-analysis of meta-analyses. Peck et al. (2003) later argued that the more important question was “How much variation is NOT explained” and came up with an answer of “roughly half.” Together, these two analyses suggest that ecological and evolutionary variation is multifactorial and that, if we are to explain much of the variation, we need to look beyond single causation. Stated another way, we shouldn’t focus so much effort on single explanatory factors that explain relatively little of the total variance but we should instead embrace multi-factorial causality that can explain much more.
From Moller and Jennions (2002).
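The two numbers above are easy to reconcile if many factors act at once. As a minimal sketch, assume a response that is the sum of several independent factors plus unexplained noise (an idealization; real factors are often correlated and interact). The function name and the variances below are hypothetical.

```python
def variance_shares(factor_variances, noise_variance):
    """Share of total variance explained by each factor, and by all together.

    Assumes the response is a simple sum of independent factors plus
    independent noise, so variances add.
    """
    total = sum(factor_variances) + noise_variance
    each = [v / total for v in factor_variances]
    return each, sum(each)


# Ten hypothetical factors of equal size, plus noise as large as all of them combined.
each, combined = variance_shares([1.0] * 10, 10.0)
print(f"each factor: {each[0]:.0%}, all ten together: {combined:.0%}")
# each factor: 5%, all ten together: 50%
```

In this idealized case every single factor looks unimpressive at 5%, yet the ten together explain half of the variation, leaving "roughly half" unexplained.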
Reflecting on the above four research areas for which proponents show high enthusiasm but the data indicate low explanatory power, I am reminded of David Queller’s (1995) response to Gould and Lewontin’s anti-adaptationist rhetoric: “That’s well said, but let’s get back to our field work.”
[We have a comment-length limitation, so I'm going to post this in paragraphs.] Hi Andrew. You write: "More importantly, such efforts will typically be futile given that nearly all adaptation is the result of many genes of modest-to-small effect." I find it difficult to really pin down exactly what this means. I have a genetic make-up that allows me to live on land, walk, and breathe air. That adaptation is governed by a great many specific alleles; indeed, it would probably be true to say that most or all of my genome is devoted to being well-adapted to that niche. Are those alleles of large or small effect? If I had alleles that adapted me to a life of swimming in the ocean instead – as my fishy ancestors once did – I would die if I were placed on dry land. Give me any sizeable subset of fishy traits, and I would also die – substitute gills in place of my lungs, while keeping everything else about me human, for example, and I still have a fitness of zero on dry land. Likewise if you keep me human but give me the eyes of a fish, or give me the tail of a fish instead of legs, or any of a huge number of other possible modifications. Even when you get down to the level of individual genes, probably many of them would be of "large effect" in that sense; substituting the ancestral allele for many of my genes would probably have a lethal, or at least highly detrimental, effect. Studies show that about 15% of gene knockouts in mice are developmentally lethal; many more have large detrimental effects. To the extent that fishy alleles are so different from human alleles that they would approximate knockout in many cases, the fishy substitutions would often be very negative (and if fishy alleles aren't extreme enough, I can just extend the thought experiment further, back to the first vertebrate, or the first animal, or the first eukaryote). So most of the genes in my genome appear to be of "large effect" in my adaptation to my environment, in this sense. 
Where is this argument wrong?
Well, it seems like when people talk about genes being of "small" or "large" effect they really mean the effect size on a "trait". Fur color, for example, might be influenced by many genes, each of which changes the fur color only a little, or it might be influenced by one or a few genes, each of which changes the fur color a lot. But here, too, the terms of the argument seem too vague to really mean anything. What is a "trait"? Any phenotypic characteristic could be considered to be a "trait", and any given trait can be subdivided into sub-traits, until you get down to the level of the individual protein products of individual genes. At the level of individual protein products of genes, genes are obviously of "large effect", by definition; if you knock out a gene, its protein product is no longer produced. As you move up the level of biochemical complexity, individual genes will seem to have smaller and smaller effects. So whether "traits" are generally influenced by few genes of large effect or many genes of small effect would seem to depend mostly on what sort of "traits" you choose to look at and how close to the level of individual protein products you are. The choice of what "traits" are interesting is, it seems to me, entirely subjective; nothing in nature tells us whether the right "trait" to think about is overall height, or leg length, or femur length, or bone growth rate in a particular zone of the femur, or level of HGH in the blood, or any of the other levels of subdivision of related "traits" one could consider. Depending on which level you choose, you will find that genes have a larger or smaller effect size.
Some "traits" that are very close to the level of the individual protein products of genes nevertheless happen to be very macroscopically visible – fur color is an example where people often say that it has been shown to be governed by few genes of large effect (at least in the systems where the genetics have been explored, such as mice). But who says that "fur color" is the real "trait"? Perhaps the real trait is the production of a specific type of melanin in a specific type of cell; if so, the trait will be governed by even fewer genes of even larger effect, no? Or perhaps the real trait is "crypsis", of which fur color is only one small component; other components would be habitat choice, hiding behavior, nocturnality vs. diurnality, etc. If you consider crypsis to be the "trait" then suddenly that large-effect gene on fur color is a small-effect gene on crypsis. So what? Since the choice of what trait to focus on is subjective, the argument over small or large effect seems meaningless. The whole thing boils down to the semantics of what one means by the word "trait" and what level of biochemical complexity you focus on.
This line of argument is often side-stepped by talking about the effect size of genes on overall fitness, rather than on individual "traits". But I don't think that actually helps very much, because then the "effect size" depends entirely on what environments you are discussing and what alleles you are discussing. Let's go back to my example above of me as a human versus me with single genes changed back to the fishy alleles of my ancestral lineage. Changing individual genes in that way would often appear to be of large effect, as I argued above. Why? For two reasons. (1) The difference between living on land versus living in the ocean is large. The larger the environmental contrast, the larger the effect size will appear to be for alleles that affect adaptation to one environment versus the other. (2) The difference between the human allele and the ancestral fishy allele will often be large. The larger the functional difference between the proteins coded by different alleles, the larger the effect size will appear to be – but that is an effect of the particular alleles you are comparing, not of the gene. For both of these reasons, the human-vs-fish comparison makes it seem that many genes have a very large effect; you could do the same comparison for human-vs-chimp and you would then think that the same genes were often of small effect. You're not really saying anything about the genes per se; you're saying something about the degree of environmental contrast that you're examining, and about the degree of functional difference between the allelic variants that you happen to be comparing.
OK, this rant has gone on long enough, and I think I have still only scratched the surface of what I'm trying to say. This is really a topic for a whole evening of discussion over beer – or perhaps a whole symposium or conference or book. I nevertheless hope that the above is not complete gibberish. Suffice to say that I'd really like to see a discussion of the effect sizes of genes that did proper justice to the definitional and semantic aspects of the question. Without that, the two sides will continue to just talk past each other, I think, and both sides will continue to be both right and wrong, depending on one's point of view.
Thanks for an interesting post.
For point 4, you write that "... the variance in ecosystem function explained by biodiversity is often quite small, with a typical value being about 20%."
This value might well be true, and it would be interesting to know where the 20% figure comes from? Is it from some formal meta-analyses, or is it more qualitative-conclusion-based?
Best wishes,
Lars
Thanks for these comments Ben - I look forward to replying more in the near future.
Great post, Andrew. I agree with pretty much all of this. I have the great fortune to have published on only one of your examples: fluctuating asymmetry, and only once. I will claim (and of course you will believe me) that I dropped that because I cleverly realized the truth of the argument about it :-)
I think what these things have in common is that they are such appealing ideas that we think (at least, I think) they just ought to be true. And they probably are; but empirically, that doesn't necessarily make them important.