Double-blind* review is widely seen as a positive step towards a fairer system of publication. We all intuitively expect it to reduce implicit bias on the part of reviewers, making it easier for previously underrepresented groups to publish. But is that true? Not surprisingly, there's a fair amount of research on the impact of double-blind review. Here are a few links:
"The Case For and Against Double-blind Reviews" is a new bioRxiv paper that I learned about on Twitter, which finds little benefit. But the study isn't well replicated at the level of journals, and it lacks information on submission sex ratios.
"The effects of double-blind versus single-blind reviewing" is a classic 1991 experimental study from economics, which found that double-blind led to more critical reviews across the board, equally for men and women, and lower acceptance rates. The strongest drop in acceptance rate was for people at top institutions.
In contrast, "Double blind review favors increased representation of female authors" followed the 2001 shift by Behavioral Ecology to double-blind review, finding an increase in female authorship. But again it's not clear whether this reflects an increased acceptance rate or an increased submission rate.
Then there's a meta-analysis of the issue that found fairly ambiguous evidence, though with some evidence of bias (especially in favor of authors from top-ranked universities).
In short, the literature on the literature isn't a slam dunk. Most people tend to agree that double-blind is a good thing. There are some administrative costs: the time editorial staff spend policing the system, and the time authors spend blinding their own work. But all in all it seems worth it. Indeed, some people won't review for journals that don't do double-blind (although some people refuse to review when it IS double-blind, a catch-22 for journals that just want to find good, qualified reviewers).
To wade into this, I wanted to offer a bit of data from The American Naturalist. The journal went double-blind in early 2015, one of the earlier journals in our field to do so, but not the very first to be sure. There's a blog post by Trish Morse in Nov 2015 summarizing the results in the first 10 months. I have the luxury of being able to reach into our historical database and see what double-blind review is doing. The journal is especially valuable in this regard because we have an opt-out double-blind review policy. Our default is to go double-blind, but authors may choose to ignore that, and some do. This gives us an imperfect but useful comparison between those who do and do not opt for double-blind. What are their acceptance rates, and how do these depend on the gender of the first author?
I'm lazy, doing this on a Sunday afternoon while my kids are watching a movie, so forgive me some imperfections here. This isn't a peer-review-ready study.
I looked back at the 500 most recent acceptances at AmNat. The actual number is a bit less than this because the database includes some things, such as editorials and notes from the ASN secretary, that I don't want to count. I also looked at the 500 most recent declines at AmNat (including Decline Without Prejudice decisions, which sometimes turn into subsequent acceptances, though I didn't have a simple way to trace this). I did my best to infer the gender of the first author based on their first name. I didn't count papers where I couldn't tell.
Note that because we decline more papers than we accept (20% acceptance rate), the ~500 acceptances cover a couple-year period, whereas the 500 declines were all from 2018. That's not ideal, I know, but see the first Methods paragraph above. That also means that the acceptance rates here are not exact values; it is their relative values (double-blind or not; male vs female) that are useful for us. Here's the data with marginal totals:
|Opt out of double blind||Accepted||84||40||124|
To digest this for you a bit, here are a few observations:
1) As we might expect, women were less likely to opt out than men.
This is significant (Chi-square test P = 0.017), though the effect size is modest (7%). This fits with the notion that double-blind is fixing a bias problem and should protect female authors whereas men are privileged to have the option to be identified without harm.
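For readers who want to run the same kind of check on their own journal's numbers, here is a minimal sketch of a chi-square test of independence on a 2x2 table of gender by opt-out choice. The counts are made-up placeholders, not the AmNat data, and the function name `chi_square_2x2` is just an illustrative choice. It uses the plain Pearson statistic with no continuity correction, which may differ slightly from what a given stats package reports by default.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    `table` is [[a, b], [c, d]] of observed counts.
    Returns (chi2, p_value) with 1 degree of freedom, no continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# HYPOTHETICAL counts for illustration only (not the journal's data).
# Rows: men, women. Columns: opted out, stayed double-blind.
counts = [[120, 380],
          [50, 250]]

chi2, p = chi_square_2x2(counts)
men_rate = counts[0][0] / sum(counts[0])
women_rate = counts[1][0] / sum(counts[1])
print(f"opt-out rate, men:   {men_rate:.3f}")
print(f"opt-out rate, women: {women_rate:.3f}")
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

The difference between the two printed opt-out rates is the effect size; the p-value only says whether a difference of that size is plausible under independence.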
2) Double-blind papers are less likely to be accepted than opt-out papers (P = 0.002). This is partly because things like the ASN Presidential Address articles and other invited papers by award winners are by necessity not anonymous, and have a higher acceptance rate. But note that it seems like women get a stronger benefit from NOT being double-blind than men do (though this is not significant). So there's clearly a cost to going double-blind: it seems to hurt authors' prospects overall. And the most widely cited benefit (reducing bias against women) does not seem to be visible for our journal in the time period covered here. That matches the experimental study from economics from 1991, linked to above, which also found that double-blind reduces acceptance rates. Here's the detail:
Remember, these acceptance rates aren't the overall acceptance rate of the journal, because I chose to examine 500 accepted and 500 declined papers. But their relative values are telling: opt-out is the way to go. It's tempting to suggest that might be especially true for women authors, but the gender by review-type interaction is not significant in a binomial GLM. So it doesn't seem to matter. There's no gender difference in acceptance rates in double-blind papers. There's no gender difference in acceptance rates in non-double-blind papers. But double-blind reduces everyone's chances by a bit. Not much, but...
3) And now the bad news: we still aren't at gender parity in submissions. Overall in the time period covered by this survey, 36.5% of our SUBMISSIONS were by women first authors. That's not good. I'm not sure what to do to fix this, because it concerns the pipeline leading up to our front door. So I'll be continuing my attempt to be proactive at encouraging people, especially students and women, to submit articles to us. The good news, I can tell them, is that we have strong evidence there's no gender bias between submission and decision.
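To make the interaction test concrete, here is a minimal sketch under stated assumptions. With two binary predictors (gender and review type), the interaction coefficient of a saturated binomial (logistic) GLM equals the log of the ratio of the two odds ratios, and its Wald standard error is the square root of the summed reciprocals of the eight cell counts. So the test can be sketched without a stats package. The counts below are made-up placeholders rather than the AmNat data, and the names (`interaction_wald_test`, the `"opt_out"`/`"blind"` labels) are illustrative choices.

```python
import math


def interaction_wald_test(cells):
    """Wald test for a gender-by-review-type interaction on acceptance.

    `cells` maps (gender, review_type) -> (accepted, declined), with
    gender in {"F", "M"} and review_type in {"opt_out", "blind"}.
    Returns (estimate, std_error, z, p_value), where `estimate` is the
    log ratio of odds ratios (women's opt-out effect minus men's).
    """
    def log_odds(key):
        accepted, declined = cells[key]
        return math.log(accepted / declined)

    # Log odds ratio of acceptance, opt-out vs double-blind, per gender.
    effect_women = log_odds(("F", "opt_out")) - log_odds(("F", "blind"))
    effect_men = log_odds(("M", "opt_out")) - log_odds(("M", "blind"))
    estimate = effect_women - effect_men

    # Wald standard error: sqrt of the sum of 1/count over all eight cells.
    std_error = math.sqrt(
        sum(1.0 / acc + 1.0 / dec for acc, dec in cells.values())
    )
    z = estimate / std_error
    # Two-sided normal p-value.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, std_error, z, p_value


# HYPOTHETICAL counts for illustration only (not the journal's data).
cells = {
    ("F", "opt_out"): (40, 30),
    ("F", "blind"): (140, 150),
    ("M", "opt_out"): (80, 70),
    ("M", "blind"): (240, 250),
}

estimate, std_error, z, p = interaction_wald_test(cells)
print(f"interaction (log ratio of odds ratios) = {estimate:.3f}")
print(f"standard error = {std_error:.3f}, z = {z:.3f}, p = {p:.3f}")
```

One reason to work with odds ratios here: they are unaffected by sampling a fixed number of accepted and declined papers, whereas the raw acceptance rates are not. That is why only the relative values are interpretable in a 500-versus-500 sample.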
So, why should double-blind reduce acceptance rates? That's odd at first glance. But as Editor I've received notes from reviewers. Some say they won't review something because it's not double-blind. But quite a few have told me they prefer to review non-blind. They note that if they are aware the author is a student, for example, they are more likely to go the extra mile to provide guidance on how to improve the paper. Now, I would hope that we would do that for everyone. All our reviews should clearly identify weaknesses, and should point towards paths to improvement. But the truth is we feel protective towards students, including other people's students. There's only so much time each of us can afford to put into reviewing (though a ton of thanks to the many AmNat reviewers who put their heart and soul into it). So we make decisions on how much time to invest in a given paper. Knowing someone is a student can make us a bit more generous with our time, which might in the long run help them get accepted. When that information is obscured, perhaps reviewers become equal-opportunity grinches.
Interestingly, a year or two after AmNat went double blind, our Managing Editor and lodestar Trish Morse looked at the results. There was a dip in acceptance rates then, as well. It wasn't quite statistically significant, and we wanted to accrue a few more years' worth of data. But it matches what I find in the more recent data.
An alternative hypothesis is that people opting out of double-blind are famous and influential, and more likely to get a pass. That fits a bit with the observation that men are more likely to opt out. But who are those people? I obviously won't name names. What I will say is that I was surprised. Sure, there were big names who opted out (some by necessity, such as authors of the ASN Presidential Address articles). But there were also many lesser-known authors opting out. And many big names who went ahead with double-blind. In fact, only a small minority of the opt-out crowd were celebrity authors. Many opt-outs were actually by authors from non-US and non-EU nations who might not be as aware of the double-blind cultural trend.
To summarize: Double-blind seems to slightly reduce everyone's acceptance rate regardless of gender.
That matches results from economics in the late 1980s (the 1991 study linked above). Not a strong endorsement of double-blind, which we tend to think of as fixing gender (and other) bias in publication. For the past 1000 decisions I don't see evidence of such bias. So did we adopt double-blind to fix a problem that itself has faded away (# see footnote below)?
Some important caveats:
1) I didn't look at last author or other author names. I didn't categorize by career stage, or university.
2) As noted above, I sampled 500 accept and 500 decline, not over exactly the same time period.
3) At AmNat, it is the reviewers who are blinded. The Editors (e.g., me, Judie Bronstein before me, Russell Bonduriansky, Alice Winn, Yannis Michalakis) and the Associate Editors can see author names. That's by necessity: someone needs to be able to choose reviewers without asking the authors or their advisors or closest colleagues to do the job. That requires knowledge of names.
4) This might be worth a more careful analysis and publication, but I don't have the time & energy to do that right now. And it's not ethical to give someone else access to all our data on accept/decline decisions and author identities.
* I have been told that the phrase double-blind is ableist, and upsetting to some visually impaired people. This is the phrase we have inherited. Last year we discussed switching to doubly-anonymous or something like that, but once a term becomes entrenched it is hard to change.
# There are clearly other barriers still present, generating the strongly significant male bias in the papers that come in our door. These need to be addressed proactively. So my comment that double-blind might be meant to fix a problem that has faded away refers only to the review and decision-making process as a statistical aggregate effect. I also recognize that if there were one or two gender-biased reviewers, affecting decisions on just a few papers, that would be undetectable in this statistical analysis yet would still constitute a problem worth fixing.
*Footnote: the Dude abides, also