Non-adaptive explanations as null hypotheses in Evolutionary Ecology
A response to "Dulling Occam's Razor - or the Perilous Principle of Parsimony"
As a general rule, the Graduate Student tries with great conviction to avoid public humiliation. The Graduate Student will spend long hours in the isolation of his laboratory, reading his Advisor's doctoral thesis with furrowed brows, diligently applying the appropriate statistical tests to partially-failed experiments, and avoiding the departmental seminar coordinator as if she had Zaire ebolavirus. Most of all, though, the Graduate Student is motivated to plod along by the terror and anticipation of the Week of Reckoning: the annual (and sometimes mandatory) sojourn to a scientific conference.
As far as the Graduate Student is concerned, all kinds of disastrous things could
happen at a conference. A good example of such an eventuality is when the
Graduate Student is invited to a conversation with a Well-Respected Member of his
Field, usually by his well-meaning Advisor. Typically, the Graduate Student is
caught off-guard by this invitation, such that the invitation results in an
over-stimulation of the Graduate Student’s adrenal glands. (This is maladaptive.)
During the subsequent conversation, the Graduate Student is
expected to accomplish two things. The first is to make one small - but
nonetheless meaningful - contribution to the discussion. The second expectation
is that the Graduate Student will not make an Ass of himself. Both expectations
are exceedingly difficult to meet.
When is a non-adaptive explanation reasonable as a null hypothesis?
I am well-versed in this area, as I have first-hand experience. I was at a conference a number of years ago, as a Graduate Student (I suppose I am still a Graduate Student, as I haven’t actually received my sheepskin in the mail yet). There I found myself in discussion with a Well-Respected Member of my Field. The discussion was going well, at first, but I eventually found myself in an error-catastrophe loop. I was trying, with great desperation, to explain the predicted relationship between temperature and body size, and I eventually asserted that the negative relationship that I expected to observe was based on a physiological constraint that maps body size to rearing temperature… at this point the Well-Respected Member of my Field cut me off, furrowed his brow, and replied:
“But everything is adaptive, if you go deep enough”.
A silence fell between us. I stared at him the way a duck
might stare at a mirror. Around us, the drone of scientific murmur persisted,
coffee cups clinked, and off in a corner, a fellow Graduate Student rifled
stale muffins into his conference handbag. The Well-Respected Member of my
Field looked at me expectantly. I became increasingly apprehensive. Then, not
knowing how to dabble my next quack, my inexorable devolution from Graduate Student,
to Duck, to Donkey occurred near instantaneously, as I blurted out,
“Yeah, you should
publish that one in Nature”.
This reply was of course ludicrous, and it was made in an
adrenaline-induced explosion of anxiety. At worst, it was unfair, because an
association between a penchant for adaptive explanations and publication
prowess was implied.
But I think we all have at least some idea of what “Everything is adaptive if you go deep enough”
was meant to get at. In retrospect, I think my colleague just wanted to
underline the fundamental role that natural selection has played in shaping the
biology of all organisms, from the physiological factors governing cellular
processes to the realization of these processes at higher phenotypic levels. I have
no problem accepting this general argument, nor, I think, would most
biologists question selection as an important evolutionary force in this sense.
But this ill-fated conversation got me thinking: how “deep” can we go before invoking
adaptation as an explanation for phenotypic variation is probably ill-conceived? Or,
said slightly differently, is there a level at which it is reasonable to
believe that the variation we observe is generally the result of constraint? In
my view, parsimony helps answer this question, such that there does come a point at
which non-adaptive explanations can become a null hypothesis to explain
phenotypic variation. Specifically, I argue that for functional traits, null
hypotheses for variation among species should be adaptive ones (such that the
non-adaptive hypothesis bears the onus of proof), whereas null hypotheses for
variation within populations should be non-adaptive ones (such that the
adaptive hypothesis bears the onus of proof); interestingly, for
phenotypic variation among populations of the same species, parsimony is of much less value.
I was reminded of my ill-fated conversation with the
Well-Respected Member of my Field shortly after my PhD defence, in which I was examined
by a different Well-Respected Member of my Field: Andrew Hendry. I was thrilled
that Andrew agreed to be my PhD examiner. I have the utmost respect for Andrew,
as his research on investment per offspring ("egg size") and reproductive allocation paved
the way for my PhD research. His work has been tremendously influential, both
in the field of life-history tradeoffs and in terms of how I’ve come to think
about evolution of investment per offspring. Andrew gave a terrific examination,
but he certainly didn’t like the conclusion of my thesis. Motivated partly by
this (increasingly infamous) chapter, Andrew has since started a discussion
on the extent to which parsimony is a useful tool for evolutionary ecologists,
and whether adaptive explanations are generally any more or less parsimonious
than explanations that invoke constraint. So, I’m digging in my heels herein,
as I attempt to give a reasonable response.
Electrofishing for juvenile Atlantic salmon during my thesis research
Before I recapitulate the conclusions I drew after being told “Everything is adaptive, if you go deep enough”, I should defend the use of parsimony in evolutionary ecology. My view is that parsimony is one of many useful tools that the evolutionary ecologist should keep in his toolbox. Certainly, a blanket application of parsimony in evolutionary ecology is unwise, as parsimony will be more useful in some cases, and less so in others. For instance, the use of “maximum parsimony” to generate phylogenies is reasonable because it assumes, rightly I think, that it is statistically less probable for characters to revert back to their ancestral state after undergoing a change (at least, this is my understanding).
But phylogenies are perhaps a special case in which parsimony
is extremely useful. When evaluating the adaptive significance of phenotypic
variation, on the other hand, to “express greatest confidence in the simplest hypothesis”
is not always a wise approach. This is because parsimony, at least as I see it,
is rooted both in probability statistics and scientific reasoning. That is, the
plausibility of an adaptive hypothesis (or a non-adaptive one) is a product of
both the number of assumptions and the viability or nature of those
assumptions. I realize that this is almost like saying that “parsimony” in this
context is the same thing as “what I think is reasonable”, which is an
objection already raised by Andrew. Notwithstanding, I think the argument I
develop below still has merit even if we adopt a more objective definition of
parsimony (i.e., relating strictly to the number of assumptions in a model).
I do not dismiss the value of developing a complex
explanation or a complicated model as an a
priori hypothesis in evolutionary ecology. Many models, no matter how complex,
will ultimately prove to be useful even if they are strictly incorrect. However,
I think it’s important to acknowledge that the assumptions in complex models are
often complementary and difficult to test in isolation, such that this
approach, while potentially useful, can also muddy the waters without much
empirical justification for doing so. This is a particular frustration of mine,
because I believe that this is precisely what has occurred in my field of
expertise: a wide range of rather complicated theoretical models (based on
largely untested assumptions) have generated all kinds of generalized adaptive hypotheses
that are extremely difficult to test. Parsimony will not be an exclusive means
of eliminating these complex adaptive hypotheses, but parsimony can and should
be used in conjunction with whatever empirical evidence is available to help
choose among competing models.
Atlantic salmon eggs vary greatly in size, and there are all kinds of adaptive and non-adaptive explanations as to why this is...
I also believe that there is some value in developing and testing simple explanations before moving on to more complex ones. In a reductionist framework, testing ideas that incorporate the fewest assumptions will, I think, lead more frequently to an unambiguous result. A clear result can then inform a linear and logical development of an idea, such that knowledge is acquired by the systematic elimination of the unlikely and the impossible. In this framework, the foundation upon which an idea is based is sound, such that any extension of the idea is well founded. Admittedly, some might want to label this notion as something like “misguided idealism”, but my point is that we should think very, very carefully about the number and nature of our assumptions as we develop hypotheses. This is the motivation behind Burnham and Anderson’s information-theoretic approach to model selection in hypothesis testing, and I see no good reason why parsimony should not also be a consideration while evaluating the merits of adaptive and non-adaptive hypotheses.
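To make the information-theoretic idea concrete, here is a minimal sketch in Python of how the Akaike Information Criterion (AIC = 2k - 2 ln L) trades goodness of fit against the number of estimated parameters. The two model labels, their log-likelihoods, and their parameter counts are hypothetical values chosen purely for illustration; they are not drawn from any data set discussed here. The helper names `aic` and `akaike_weights` are introduced for this sketch and do not come from any library, and the small-sample correction (AICc) is left out for brevity.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model in the candidate set (weights sum to 1)."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best one.
    relative = [math.exp(-0.5 * (value - best)) for value in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical fits: the adaptive model fits slightly better (higher log-likelihood)
# but needs three times as many parameters. These numbers are invented.
models = {
    "constraint model (2 parameters)": aic(log_likelihood=-120.0, n_params=2),
    "adaptive model (6 parameters)": aic(log_likelihood=-118.5, n_params=6),
}

weights = akaike_weights(list(models.values()))

for (name, value), weight in zip(models.items(), weights):
    print(f"{name}: AIC = {value:.1f}, Akaike weight = {weight:.2f}")
```

With these made-up numbers the simpler model receives more support despite its slightly lower likelihood, because the extra parameters of the complex model do not buy enough improvement in fit. Note that this formalizes only the "number of assumptions" side of parsimony; it says nothing about the nature or plausibility of those assumptions.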
So let us assume that parsimony is one of many useful tools
that may help us discriminate among competing hypotheses. Even if this is the case,
as Andrew points out, inferring the validity of adaptive and non-adaptive
explanations based on parsimony is difficult, because explanations can be
difficult to categorize as “parsimonious” or “less parsimonious”. The issue, as
I see it, is that parsimony is strongly tied to the degree to which we
understand adaptive and non-adaptive processes in the first place. The number (and
nature) of assumptions that need to be made, and hence the degree to which an
explanation is parsimonious, is directly linked to our understanding of the
mechanism we’re invoking. It’s a quagmire, to be sure.
Notwithstanding, I believe that the argument can be refined.
To recapitulate what I concluded after being told that “Everything is adaptive, if you go deep enough”, my view is that the
extent to which non-adaptive explanations are parsimonious is related to the
scale at which variation is identified, or how “deep” we delve into the levels of variation. My argument is a qualitative
one, and I don’t think that it will resolve the differences in opinion that
Andrew and I have. I do, however, believe that the issue of scale is important,
and it might at least provide some insight into the issue of adaptation,
constraint and parsimony.
If we consider the mean value of a trait in two different
but related species, I think that the majority of biologists might agree that,
in most cases, adaptation provides the simplest explanation for the difference (e.g., competition that results in niche
partitioning). I think most biologists would agree that non-adaptive processes,
such as drift, are in most cases not a simple or logical explanation for
interspecific differences in, say, age-at-maturity or fecundity.
Does the extent to which parsimony informs a null hypothesis depend on the scale at which the comparison is made?
However, as Andrew alludes to, there might be far less consensus on what type of explanation is most parsimonious when variation in a mean phenotype is observed among populations of the same species. I think Andrew is quite correct here. Personally, I might venture, as Andrew does, that adaptive explanations are often most parsimonious in these cases, especially for life-history traits. But I think it ultimately depends on our understanding of the adaptive and non-adaptive mechanisms invoked, in conjunction with the demographic history of the population, etc. Thus, even if we accept parsimony as a useful tool in evolutionary ecology, I agree that it is not clear that non-adaptive explanations are generally more or less parsimonious at this scale.
But how much “deeper”
can we go before invoking adaptive explanations for phenotypic variation becomes
markedly less parsimonious? What about phenotypic variation among individuals within
a population? If we’re discussing adaptation in a meaningful way, then I think
the argument is that the focal trait is optimal or near optimal, such that
variation among individuals represents different individuals converging on
different phenotypic optima that exist within a
population. This is, in fact, an argument that has been made time and time
again in my field. Specifically, there are a number of rather complicated
theoretical models which demonstrate that individuals can achieve the highest absolute
fitness if they produce a particular egg size in conjunction with the expression of a particular set of other phenotypes, and
that (therefore) variation in egg size among individuals may represent adaptive
variation. Yet, in the vast majority of cases (especially those in which
frequency-dependent selection and adaptive plasticity are unlikely), viewing
phenotypic variation at this level as adaptive is mind-numbingly complicated,
as the number of assumptions needed to make the model realistic is stupendous. The
nature of the underlying assumptions can also be questionable; for instance,
none of the theoretical models in my field incorporate the evolutionary
genetics of investment per offspring, nor do they consider whether the model is
valid under any given population-genetic scenario. Is it not both biologically
plausible and simple to expect that variation at this scale represents, say, genetic
covariances among traits that limit the extent to which any given functional trait
can achieve its univariate fitness peak? Isn’t life-history theory rooted in
the idea of trade-offs, such that there is an expectation that not all traits
can be simultaneously optimized in an individual? Again, without drawing on
specific examples, this argument is necessarily qualitative, but I nonetheless
believe it to be a compelling one.
So, if you think that parsimony is at least useful in
evolutionary ecology, as I do, then perhaps you also agree that null hypotheses for
variation among species should be adaptive ones (such that the non-adaptive
hypothesis bears the onus of proof), whereas null hypotheses for variation
within populations should be non-adaptive ones (such that the adaptive
hypothesis bears the onus of proof).
I’m glad that Andrew has blogged about this, and I realize
that my own response was a bit tangential and that I did not address all of the
issues that Andrew brought up. I think it’s well worth discussing this subject
in general, and I hope I am not the last to give my thoughts here on the blog.
Njal
Njal, very thought-provoking post! I think that I disagree with you, although you raise many good points.
You say "Isn’t life-history theory rooted in the idea of trade-offs, such that there is an expectation that not all traits can be simultaneously optimized in an individual?" I would certainly agree with that (although empirical evidence for such trade-offs is surprisingly thin, as far as I could find in a recent lit search). But if constraints and trade-offs are universal, then won't that necessarily mean that many differences between species are not, in fact, adaptive, but are merely the result of selection on other traits, plus ubiquitous pleiotropy and trade-offs?
At the other end of the spectrum, you argue that within-population variation is unlikely to be adaptive. But it seems to me that negative frequency-dependent selection is probably very common in nature (after all, it can be generated by processes ranging from competition to predation to parasitism to sexual selection, and more), and that it can often provide a compelling adaptive explanation for within-population variation. This hasn't been tested empirically very often, but when it has been (e.g. Schluter, Bolnick), positive results have been common. And the alternative position seems hard to argue: that some of the individuals in the population are substantially more fit than others, but that stabilizing selection is somehow unable to weed out the losers. That would make sense in high gene flow source-sink situations, but otherwise, mutation is just too weak to maintain the within-population diversity that we typically see, it seems to me.
Now, I make no claim that the above arguments are airtight. I only claim that they are plausible: it is plausible that differences among species are commonly non-adaptive, and that differences within populations are commonly adaptive. That can be reasonably argued. And this is the problem with parsimony-based arguments: reasonable people can disagree as to which explanation is the more parsimonious, except when parsimony is defined in some rigorous mathematical way, as in information theory.
I think rejecting theoretical models based on parsimony alone is deeply problematic for that reason. The role of theory is to show what is possible, not what is probable. To determine what is probable, empirical data is needed. When empirical data is hard to get, that can make it difficult to choose among competing theoretical models; but I don't think that ought to be taken as justification for weeding out models using ad hoc parsimony-based arguments. Instead, it should be taken as a challenge for the empiricists! :->
Very nice discussion! Thanks guys. Here are some thoughts related to an article I read some time ago on the topic: I find that part of the problem lies in the modus operandi of science. The first thing science does is to 'take apart things' (interestingly this is reflected in the syllable 'sci', as in 'scissors' or 'schizophrenia'). Then, after observing particular phenomena and variables in this very detailed way, science has to reverse the process and generalize these observations to build theories that might explain the bigger, broader picture.
I think the problems surrounding parsimony in scientific reasoning are somehow located in this transition process from the specific to the general. To me it seems that parsimony is a very good first heuristic to guide our approach to specific observations and our model explanations for them. However, if we zoom out of the details and try to generalize, parsimony should by no means be seen as a general principle. Parsimonious approaches in particular cases do not have anything in common and parsimony should not be seen as a shared property of different cases. It is in fact one out of many heuristics that are needed to gather information to build hypotheses and it has to be rejected (of course) once we figure out that more complex interactions are at work than originally thought. Simplicity always has to depend on the context and is therefore not general. An interesting and more in-depth discussion on this can be found in this article by Elliott Sober: Let's razor Ockham's razor.
Looking forward to reading more contributions. Gregor
Great comments guys, keep 'em coming. It's good to hear from other EvoEcos, and I'm interested to find out whether I'm in the minority on the parsimony issue.