Wednesday, May 12, 2021

17 months

By Dan Bolnick

This past month, The American Naturalist published what I hope is the final step in the Editorial Board's evaluation of work by Jonathan Pruitt, 17 months after concerns first came to my attention. This wraps up the journal's (and, hopefully, my) role in this process, after two rounds of institutional investigation. The first round, conducted by authors and a small group of the editorial board, was made publicly transparent in January-March 2020, after we had reached clear conclusions. Due to the chilling effect of legal threats we launched a second, more extensive institutional investigation in March, which largely confirmed the conclusions of the first but was conducted entirely behind closed doors. I therefore think it is important to convey some of the lessons learned during this entire process, about the intersections between science, rigor, transparency versus confidentiality, journalism, and legal risk. It is my hope that the following inside retrospective view of the process will (while respecting confidential aspects, such as some whistle-blowers' identities) prove instructive to people interested in evaluating future claims of misconduct or error.

Before I launch into my narrative, I want to emphasize several key points:

1) I have been involved in this saga from several perspectives. I am Editor In Chief of The American Naturalist, and so have been central to investigating six of his papers. I will seek to avoid saying anything here that would break reasonable expectations of Editorial confidentiality. But it is also essential that editorial processes be transparent enough to the community to engender trust in the fairness and rigor of the journal. I was also a co-author of Pruitt's on one paper in Proceedings B. Lastly, Jonathan Pruitt had been a personal friend of mine, and at no point did I wish him any ill or wish to harm his career. It pained me to see the collapse of work I had once so respected, and I tried hard to give him paths to come clean. 

2) For the purpose of this essay any opinions are my own, as a private individual with professional expertise in the biological and statistical fields in question, exercising my First Amendment rights to express my views. I will try throughout to flag clearly anything that is 'mere' opinion, but I will seek primarily to stick to factual statements about documented processes and events of which I have had a front-seat view as Editor or as a co-author. My experience as editor cannot be separated from my experience of this series of events as a whole, but this blog post is being posted on EcoEvoEvoEco and not on the American Naturalist journal's editor blog, because I am posting this from my personal perspective, not as an official journal statement.

3) Where I discuss flaws in data, below, I am going to stick solely to what I see as my reasonable professional judgement about what is biologically or statistically plausible. Where data contain implausible patterns, the scientific inferences arising from them are suspect and corrective action is needed by the relevant journal, which has been in part my job. How those flaws came to exist is a separate question about which I will not speculate. It is for McMaster University's investigative team to determine whether the biologically implausible data are the result of conscious fraudulent manipulation of data files, or whether they result from accidental mismanagement of data (e.g., errors in transcription of data from paper records onto spreadsheets). From the standpoint of scientific conclusions, which was my focus, the distinction is irrelevant because either way the biological inferences are not reliable.

I begin my retrospective with a rough overview of the time-line of events, as documented in a search through my email records (which I assiduously kept, and which have been the subject of a massive Freedom of Information request by Pruitt, presumably to seek evidence against me for legal action). I then move to a description of some of the major lessons that I've learned in the process. In a separate essay I ponder some questions about the ethics and philosophy of retraction.

1. The Chronological Narrative

This chronology is a greatly streamlined version, based on a 112-page document I wrote in early May 2020, in response to a letter from Jonathan Pruitt's lawyer to the University of Chicago Press, demanding that I be removed from making any decisions about the case (we'll get to that...). And of course with additions concerning the events between early May 2020 and today. 

1.1 Prehistory. Jonathan Pruitt and I share academic interests in behavioral variation within populations. In the late 2000's he applied for a postdoctoral position in my laboratory, and I was deeply impressed by his research (though he made the short list, I took on another candidate). Shortly thereafter we met when I was visiting the University of Tennessee, where he was a finishing graduate student, and we had an excellent conversation. I followed his career closely thereafter and was consistently impressed with the acuity of his research questions, the elegance of his experiments, and the clarity of their results. We met again when he hosted me as a visiting speaker at the University of Pittsburgh, and our interactions were very positive; I in turn hosted him as a seminar speaker at the University of Texas at Austin. I considered him a good friend; on the rare occasions we crossed paths at conferences we would always grab a beer and talk interesting science questions. We also collaborated on a publication (Pruitt et al 2016, Proceedings of the Royal Society of London Ser. B), in which he had invited me to comment on and join an in-progress paper he was working on with data he generated. On the basis of this research record I was happy to write him letters of recommendation for promotion and tenure, first at the University of Pittsburgh, then again at the University of California Santa Barbara. I nominated him for a prestigious 2-year Harrington Fellowship visiting research position at the University of Texas at Austin (where I was working at the time), which he received. However, he did not take the position because he was instead offered an even more prestigious Canada 150 Research Chair (for which I also wrote a letter of recommendation). I wrote another letter recommending him for a Waterman Award from the National Science Foundation in 2015, at the request of others organizing this nomination. And, in fall 2019 I was involved with a group of faculty who sought to nominate him for a Mercer Award from the Ecological Society of America. I have copies of these letters. I mention this history to establish that, far from being an enemy seeking to undermine Dr. Pruitt, I have been a tireless proponent of his career for a decade, going above and beyond the call of duty to advance his reputation and job prospects. 

1.2 Genesis. On November 17, 2019 I received an email from Niels Dingemanse, alerting me to concerns about Laskowski et al 2016 (AmNat). Kate Laskowski, the paper's first author, had already been notified and was cc'ed on the email. Two days later, Jonathan emailed me independently to say he had heard of these concerns. He provided an explanation that later proved to be (1) not in the originally published methods, and (2) unable to explain the concerns Niels raised. Specifically, he said spiders were timed in batches, and all responded simultaneously in a group, which is why certain time values occurred for many different individual spiders. I asked Jonathan, hopefully, if he had videos of his behavioral trials, and he said he didn't. A week later (November 26), Erik Postma provided detailed R code documenting their concerns (via Niels), showing that the excess of duplicated numbers was not restricted to the batches of spiders measured together, undermining Jonathan's claim. This was the point where I began to be genuinely alarmed, because the failed attempt to explain away the concerns struck me as indicative of a much deeper problem. At this point I sought the advice of current and former Editors of the journal. One of them suggested I recruit some external help evaluating the concerns. I did, but while waiting for their evaluation events moved ahead without me. Jonathan sent me a Correction, in which he averaged spiders with duplicated values to avoid the pseudoreplication that would have plagued his published analyses (if we accepted his explanation of timed groups). Soon after, Kate Laskowski emailed me to request retraction. I want to emphasize this point: the retraction request came from the authors, and the wording of Kate's emails made it clear that Jonathan agreed to the retraction. These emails were also the first time I learned that other papers, at other journals, were affected by similar concerns. The same day (December 17th) Jonathan emailed me directly, confirming in writing that "We're therefore too iffy now to continue forward with a request to merely correct the manuscript, and would favor retracting the article instead." I immediately forwarded their request to the University of Chicago Press, who replied that we needed a retraction statement written by the authors. At this point we no longer needed the outside opinion I had sought, so I cancelled that request. Note that all emails concerning the events described in this document are retained to prove my statements if needed. If there's one lesson I've learned from this mess, it is the value of keeping all emails.

As an aside - Pruitt has frequently been asked to provide original paper copies of data to validate the information in the digital Dryad repositories, and so far has not done so. I did get confirmation from some former undergraduates in his lab who worked on the affected papers, and who stated "we always collected data in the lab on paper data sheets". They also challenged his claim that spiders were tested in batches of 40 (which he used to explain duplicated numbers, because all spiders in a batch might respond simultaneously and be given the same time). They stated that "the standard practice was to assay between 5-8 individuals at a time, each with a dedicated stopwatch".

In a minor irony, while we were processing the retraction statement at the journal, Niels emailed me to express concern that, given my friendship with Jonathan, I might be too lenient towards him and should hand the case to another Editor.

In early January 2020 I was contacted (note the passive voice) by the editor of one of the other journals considering concerns about a paper for which Pruitt provided the data. We did not seek to affect each other's decision on the case, but simply discussed the processes we were separately using to reach a decision. Shortly after, we became aware of a third affected journal. At this point it was clear that there was a repeated pattern transcending journals, which (per guidelines from the Committee on Publication Ethics, CoPE) merits communication between Editors. We also decided it would be essential at this stage to notify the author's current and former institutions that three retractions were in the works. It is no particular secret, I believe, that I sent the emails to the Scientific Integrity Officers of McMaster, UC Santa Barbara, and the University of Pittsburgh, and to his current and former department chairs. I feel this was an obligation on me as Editor, aware of scientific integrity concerns about their current/former employee.

On January 13th, Jonathan emailed both Spencer Barrett (Editor of Proceedings of the Royal Society B) and me, to say (quoting just a part here): "Thanks again very much for working with us so swiftly to process these retractions." In that same email, Jonathan raised the topic of "revisiting data sets old and new to look for similar patterns" - something I had not yet thought to do systematically. Just the day before, one of Pruitt's co-authors had emailed me that an exchange with Jonathan "gave the impression they may not be accidental". 

The first retraction became public on January 17, 2020, for Laskowski et al 2016 American Naturalist.

1.3 Collateral Damage. In mid-January, I had a phone conversation with Dr. Laskowski, who was concerned about the damage that the pending retraction would have on her career. I sought to reassure her that it is possible to survive retractions, and that in my personal opinion the key was transparency and honesty, which the scientific community would appreciate. I had voluntarily retracted a paper of my own a few years previously, due to a mistake in R code made in 2008, when I was first using R for data analysis. At the time I had written a detailed blog post explaining the retraction, and was proactive about advertising the retraction on Twitter. The community responded very positively to that transparency, and I felt that no harm was done to my career as a result. I relayed that experience to Dr. Laskowski, as a possible strategy to use transparency to gain community support for the retraction process. Based on that conversation she began to consider a blog post or tweeting about the retractions. I want to be clear here that the goal wasn't to cast aspersions against Pruitt, but to clearly articulate the concerns about the data, and the reason for the retraction. For instance, I wrote: "I do think that there will be questions about WHY the paper is being retracted. In that case Kate's choice is either: 1) be entirely silent on why 2) say that the issue is being investigated and so she does not want to comment on the details 3) explain the problems in the data without openly saying that this constitutes evidence of any wrongdoing. I think (2) is the worst possible option for Jonathan, as it implies wrongdoing without explanation. So, as I think about it more I think a clearly explained summary of why the data were judged to be unreliable (option 3; maybe with screenshots from the dataset, especially the second sheet) would be the most open and transparent approach...I’ve come around to saying that being open about this is the best course of action at the moment, while carefully phrasing this to not make accusations". Kate did end up writing a blog post timed to come out with the second retraction (from Proc B). She did ask me for comments on it; I provided very brief feedback by email, but no major input on content or style. I retain a copy of that email and can prove that I provided no substantive guidance about what topics to include or omit, or what to say.

1.4 Evaluation. On January 18th I learned that another journal was beginning a systematic evaluation of papers by Pruitt. Up to that point I had not planned to do so for The American Naturalist, mostly because I was still focused on managing the first example and hadn't come up for air to consider the bigger picture. The same day, Associate Editor Jeremy Fox emailed me to ask about the retraction. It occurred to me that Jeremy would be a good person to ask to evaluate the data files for the other AmNat papers, because he didn't know Pruitt personally and wasn't a behavioral ecologist, and so would be entirely outside the realm of personal or professional conflicts. Jeremy got quickly to work and rapidly raised concerns about multiple other American Naturalist papers. On January 19th he raised concerns about Pinter-Wollman et al 2016 American Naturalist, representing the first clear indication that problems transcended a single paper at this journal. On January 21st, Jeremy indicated he found no evidence of problems with the data in Pruitt et al 2012 AmNat (with Stachowicz and Sih). This paper did end up receiving a Correction from the authors and an Editorial Expression of Concern, as I'll detail below. I point this out because it took about 14 months from Jeremy's first look at the data to my reaching a final decision: this was a particularly tricky case that one might reasonably argue should have ended in a retraction. Having finished evaluating AmNat datasets, Jeremy kept digging, out of a concern that papers at other journals might not be receiving the evaluation they needed. On January 21st he let me know about formulas embedded in a Dryad-posted Excel file for a Journal of Evolutionary Biology paper, in which Pruitt had calculated the *independent* variable as a function of the *dependent* variable in his analysis. Speaking personally here, it sure looked like formulas were being used to fabricate data, but Pruitt as usual emailed me an explanation that I cannot directly evaluate or reject. Because this case seemed especially egregious, I passed Jeremy's concerns about the paper on to Dr. Wolf Blanckenhorn on January 22nd. This was the only instance in which I conveyed initial concerns to the Editor of another journal. 

On February 4th, I received the second retraction request concerning one of the Pruitt papers in The American Naturalist, from Leticia Aviles. Co-author Chris Oufiero responded to agree. On February 11th I replied that I would like to receive a written retraction statement for publication (unanimous if possible, but not necessary). The same day the remaining author (and Pruitt's PhD advisor) replied, also confirming that she believed retraction was warranted (a position she reiterated on February 27th). The authors did not provide a retraction statement until late fall, for reasons that will become clear further in this narrative. On February 6th I received an email from the lead author of Lichtenstein et al (2018 AmNat) also indicating that he felt retraction was warranted, based on flaws identified by Florence Débarre. Again, this initial momentum was soon derailed, but at the time it seemed like the strategy of relying on co-authors to evaluate and decide whether to correct or retract (if either was needed) would be effective. Our institutional investigation (by Fox and myself) had found problems, co-authors agreed, and co-authors were deciding to retract. On February 10th I received an email from Noa Pinter-Wollman asking for a correction to Pinter-Wollman et al 2016 AmNat, with the agreement of co-authors. Yet again, my request that she submit a text Correction for us to publish was disrupted. If you can't stand the foreshadowing, jump down to the section on "Chilling Effect".

Starting on January 20th I began receiving whistleblower emails from numerous sources expressing concern about Pruitt papers at The American Naturalist, Nature, PNAS, Behavioral Ecology, and other journals. I did not pass these on to the journals in question, but encouraged the writers to do so. Shortly thereafter I started receiving emails from journal Editors (I did not initiate these contacts, contrary to claims by Pruitt's lawyer, which we will get to). Niels Dingemanse and others had begun emailing numerous Editors of various journals alerting them to concerns about papers in their journals, and the Editors (being aware of the AmNat retraction) checked with me to ask whether I considered the concerns legitimate, and how I was proceeding. I confirmed that they should examine the cases and come to their own conclusions, and gave them some advice about how we had proceeded. The most striking thing I noticed, to which I return later, is the divide between journals that had required data archiving for years (which could evaluate concerns) and those that hadn't adopted the policy (some still hadn't as of these events). I should also note that I argued for due process in each case, for instance indicating that a journal which had Pruitt on its editorial board shouldn't summarily dismiss him, but would be better off with a hiatus to wait on the results of McMaster University's investigation (which is still ongoing). I argued we shouldn't presuppose the outcome of their investigation, and should avoid a witch-hunt mentality. I continued to be cc'ed or addressed in whistleblower emails for several months, including being cc'ed on a complaint to Nature filed on January 30th 2020 (they posted an Editor's note in February 2021 indicating that an evaluation was in progress for Pruitt and Goodnight 2015).

On January 29th, the Proceedings B retraction became public. Where before there was one solitary retraction, now there was a pattern of repeated flaws. Kate Laskowski tried to get ahead of this by publishing a blog post documenting her reasoning in depth (which she asked me to proofread; I provided only very light typo corrections). A theme has emerged over the past year: many retraction statements are brief and ambiguous as to the scientific details. This is changing, and more recent retractions and Expressions of Concern have been more forthcoming. But at the time the PRSB retraction was vague, and Kate's blog post served to explain the reasoning in depth. Ambika Kamath and several other authors also posted a blog that same day, which I was not aware of in advance. Also on this date, I asked a second Associate Editor (Alex Jordan) if he would be willing to take a second look at Jeremy Fox's findings, to see if Jeremy was being fair and thorough. A day later, Current Biology notified me of a pending retraction (later paused due to lawyer involvement), which I had not been aware of or involved in. A couple of days later I also asked Flo Débarre to look at the data files, because she (1) has no association with the intellectual subject matter, and (2) is very effective at theory and coding in R. Like Jeremy Fox, she quickly found numerous flaws and felt compelled to document them thoroughly. Within a week Flo had emailed me a detailed evaluation of 17 papers with Pruitt as an author, including the five remaining American Naturalist papers, identifying serious concerns affecting many of these, including four of the five AmNat papers. A typical example is provided here:



The two yellow blocks are supposedly independent replicates with exactly duplicated sequences of numbers. The same is true for the blue blocks, and for the rose-colored blocks.
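As a concrete illustration, here is a minimal R sketch (my own illustration, not the code the investigators actually used, and with hypothetical column names) of how one can flag runs of values that recur verbatim across supposedly independent replicates:

find_shared_runs <- function(dat, group_col, value_col, k = 5) {
  # build every k-long run of consecutive values within each replicate group
  runs <- do.call(rbind, lapply(split(dat, dat[[group_col]]), function(g) {
    x <- g[[value_col]]
    if (length(x) < k) return(NULL)
    starts <- seq_len(length(x) - k + 1)
    data.frame(group = g[[group_col]][1],
               run = sapply(starts, function(i) paste(x[i:(i + k - 1)], collapse = ",")),
               stringsAsFactors = FALSE)
  }))
  # keep only runs that appear in more than one group
  counts <- aggregate(group ~ run, data = runs, FUN = function(g) length(unique(g)))
  merge(runs, counts[counts$group > 1, "run", drop = FALSE], by = "run")
}
# e.g., find_shared_runs(spider_data, "colony", "latency", k = 5)

For independently measured continuous behavioral traits, the chance of two replicates sharing a five-value sequence to full recorded precision is vanishingly small, which is why patterns like those in the image above are so alarming.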


1.5 Suggesting a mea culpa. During the evaluation process within the journal by Jeremy Fox, another paper (from 2012) seemed to be problematic. Pruitt reported size-matched spider and cricket body masses that seemed implausibly precise, measured to a precision of 0.00001 grams (see image below). In my email exchange with Jonathan over this, asking for an explanation, I raised the question of whether he should own up to which data sets were flawed, to save the rest of us time. I wrote: "the behavioral ecology community as a whole is expending enormous energy the past week to dig into data files.  People are, understandably, grumpy about the extent of errors, and more seriously about the suspicion of deception, but most of all there is frustration over the impact this has on colleagues and the time that is being robbed of them even now to sort through the mess.... If, and I emphasize “if”, there is any truth at all to suspicions of data fabrication, I think you would best come clean about it as soon as possible, to save your colleagues’ time right now sorting the wheat from the chaff." 



Screen shot from a data file from a 2012 paper, in which spider masses were paired with cricket masses (columns M and P), and simply multiplying column M by 0.3 could precisely reproduce the measured cricket mass to a precision of 0.00001 grams (compare calculated column O against observed column P).
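For readers who want to run this kind of check themselves, here is a minimal sketch in R; the file and column names are hypothetical stand-ins for columns M and P of the actual spreadsheet:

masses <- read.csv("dryad_masses.csv")   # hypothetical file name
summary(masses$cricket_mass - 0.3 * masses$spider_mass)
# Genuinely weighed prey should scatter around any target ratio;
# a column of exact zeros down to 0.00001 g indicates the cricket masses
# were calculated from the spider masses rather than measured.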

For a time, Jonathan expressed interest in using a platform like this blog to address the community. Andrew Hendry and I offered to post whatever he chose to say. Ironically, it was Andrew who pointed out that Jonathan might be in legal jeopardy (for instance if any of his flawed data were used to obtain federal research grants) and so he might want to talk to a lawyer before writing anything public. Yeah, well, you'll see for yourself how that worked out, if you keep reading.

1.6 Public Coordination. On January 29th, a behavioral ecologist contacted me to suggest that I create a database to track papers that have been cleared of errors. Their motive is worth quoting: "One idea I have is to set up a website or google doc that lists which papers have been retracted, which have been vetted and are now ok to cite, which are still in the process of being checked, etc.  I'm hesitant to cite any papers that may be unreliable, but I also don't want to deprive any legitimate papers of well-deserved citations, so I think this resource would be helpful" (from an email addressed to me by a colleague). The following day, I created a Google Forms document to help track evaluations of papers. My motive was to identify papers that were considered in the clear, and to reduce redundancy in data investigations and thereby minimize wasted effort. I did so as a member of the scientific community. I posted no content and provided no information that was not otherwise public, and allowed others to populate the table. Note that all of the above retractions, pending retractions, and whistleblower reports preceded the online database. Because I did not curate the table, and did not personally check every claim added to it, this database later became Pruitt's primary line of criticism against me. Although this was clearly an exercise in free speech, and I posted nothing that was false or misleading (e.g., not libel or defamation), I could not vouch for other people's entries in the table (even though I am not responsible for content other people add), so I later took the table out of public access and have since refused to share it. This was, in my view, genuinely unfortunate for Pruitt's co-authors (and indeed for Pruitt himself) because most people seem to use guilt-by-association to judge all his papers, even when the data were generated by his colleagues or students. Thus, citations to his work have been greatly reduced by the retractions, even citations to unretracted work. The core motive was to highlight papers that had been checked and found to have no flaws, especially those whose data were collected by other people, and thus encourage continued citations to that work. By removing the document in response to legal threats (again, even though I see those threats as groundless), I fear I removed a crucial tool for mitigating collateral damage to others.

The retractions, blog posts, and online spreadsheet attracted attention and on February 2nd I received requests for interviews by reporters for Science and Nature. The published articles did not always represent my statements accurately, a complaint also raised by Niels Dingemanse and others.

In the subsequent days I regularly received numerous emails each day from people identifying flaws in existing data repositories, or from Editors asking for advice. Additional concerns were raised about American Naturalist papers, prompting me to email the co-authors of all his American Naturalist articles, asking them to examine their data and let me know if they had concerns. I specifically stated that guilt-by-association was not our approach. Here is the core text of these emails:

If you collectively conclude that your paper reports results which are fundamentally not reliable, and can document the specific reasons for this concern, then you should submit a retraction statement to the journal, which we will then check. If the Editors and the University of Chicago Press concur, then we will copy edit, typeset, and publish the retraction statement.

 If you believe that some of the results are not reliable for specific documented reasons, but core elements of the paper remain intact in your view, then we would be happy to consider publishing a correction.

 If you lack confidence in the data simply because Pruitt provided them, this is not in my view sufficient grounds for a retraction without specific evidence of wrongdoing or error. I would be willing to consider publishing a brief statement, under the guise of a Correction (which would be appended to the online & pdf paper), making a statement about your concern on behalf of some or all authors without specific evidence undercutting this particular paper’s conclusions.

 If you retain confidence in the paper in all regards, I recognize that readers may not reach the same conclusion. I would be willing to publish a brief Comment allowing you to effectively confirm validity. This is an unprecedented thing to do, but I think is warranted in this unprecedented situation.

 You may of course choose to do none of the above. Whichever path you think is best, I’d encourage you to document your thinking fully, take time to judge, seek feedback from co-authors or others, and not rush into a final decision that you may not be confident about.

My preference at this point was for the authors to judge their own papers and request retractions, corrections, or statements clearing the work, as appropriate. One of the six papers was already retracted, and we had received email requests for retraction of two more papers and for a correction to another (but the authors had not yet supplied the retraction statements).

The public coordination had another benefit: it generated the potential for Editors to consult with each other about best practices in handling the situation, which was new for all of us. In particular, Proceedings B notified me on Feb 5 of their procedure, in which they appointed a committee to generate an internal report, allow Pruitt to respond, allow co-authors to comment on the report and response, and finally for the committee to re-evaluate and make a recommendation. I had begun an informal version of this with Jeremy Fox first, then adding Alex Jordan and Florence Débarre. I made this a more formal committee on March 15 2020. I reproduce the entirety of my email here because I think it is a useful template for others in this situation.

Dear Flo, Jeremy, Steve, Emma, Jay, and Alex.

 I am writing to ask whether you would be willing to submit a report, to me and the American Naturalist journal office, evaluating and explaining concerns about the papers that Jonathan Pruitt has published in The American Naturalist (excluding the one that was already retracted, though your report can certainly comment on it if you feel that is warranted).

 I am asking the four of you because (1) Flo and Jeremy have both already expended significant energy analyzing Pruitt’s papers and datasets for this journal, and I’d like to see a document summarizing this evaluation. (2) Flo and Jeremy and Alex and Jay are Associate Editors invested in the journal’s success and scientific reputation, which stands to be harmed should scientifically flawed papers go uncorrected or unretracted. (3) Flo and Jeremy are both distant from the behavioral ecology field and do not know Pruitt personally, and so have no formal association with any intellectual disputes nor any reason to harbor personal biases. (4) Alex and Steve and Emma are very close to Pruitt’s intellectual field and so are well placed to contextualize the concerns in terms of their intellectual value and to evaluate technical aspects of conducting behavioral ecology experiments in practice. (5) Jay and Alex and Steve both do know Pruitt personally, and to my knowledge have no personal reason to hold biases against him (please correct me if that is incorrect), and (6) Steve and Emma are not AEs for AmNat, and so I am hoping they can serve as an outside observer to confirm that there is no biased process and we are evaluating Pruitt’s work fairly and in a scientifically sound and rigorous manner. Lastly, Jay is both an AE, and a co-author, and former mentor of Pruitt’s who therefore could be expected to be a fair advocate for Jonathan but also a rigorous critic where criticism is needed.

 I am hoping a written report to me, as a single document, will:

1) identify and document any concerns for each of the remaining papers with Pruitt as an author. Flo has done a great job of this already with some online documents, so much of this is done.  Conversely, when you find no grounds for concern please do indicate this, and explain what you did to reach that conclusion.

 2) Treat each paper independently, in the sense that evidence of flawed data for one paper should not lead us to presuppose the other papers must be flawed as well

 3) Present a list of questions that we would need Jonathan to answer to clarify the nature of the problems (if any) identified in (1). He would be given two weeks to respond to those questions, then you would be shown his answers and given a chance to comment.

 4) If you identify concerns about a particular paper, please comment on your recommendation for a course of action. In particular, our options appear to be:

 i) there are no errors

 ii) any errors are minor and do not need any public comment

 iii) the dataset can be fixed (e.g., by excluding all duplicated values) and re-analyzed to reach scientifically reliable inferences that could be published as a correction.

 iv) certain parts of a paper no longer are reliable and we require a correction to indicate what elements of the paper should be retroactively considered redacted, but other aspects of the paper remain valid and useful scientific contributions.  Note that in my opinion, a novel idea or question is not sufficient to be published in the journal, that idea must be backed by an effective model or data. Therefore, a paper might contain an innovative hypothesis or viewpoint but if the data to demonstrate this point is flawed, then the paper should be retracted as opposed to simply issuing a correction that eliminates the empirical evidence.

 v) a retraction. Typically these should be submitted by the authors. They should succinctly explain the rationale for the scientific decision, without suggesting any cause for irregularities or leveling accusations about motive.

 vi)  An Editorial Expression of Concern.  As Editor, I have the right to publish an explanation, based on your report (you would be a coauthor unless you opted to be anonymous, which is your right), of concerns that lead us to question the reliability of a previously published paper. This is confirmed by the court case Saad vs the American Diabetes Association. For this, we do not require approval by any author(s), though obviously we’d prefer their agreement if we went this route.

If you are willing to do this, in the current troubled times of many COVID distractions, please let me know. If you cannot, I understand fully; these are remarkably challenging times to stay focused on work.

I would share your report with the editorial office (including University of Chicago Press lawyers, for our own collective peace of mind), then with Pruitt to request answers to questions you pose. Once we get his answers, you have a chance to respond to them. Then I will make a decision (subject to your recommendations) about options i - vi above, for each paper, and if necessary write Expressions of Concern or invite co-authors to write retractions or corrections. If your report judges retractions or corrections to be scientifically necessary, but the authors do not write retraction or correction statements (perhaps due to the chilling effect of Pruitt’s lawyer’s threats of legal action), I would opt for an Expression of Concern.

 Thank you for the help on this matter, so we can reach a transparent and fair and scientifically rigorous final decision on this matter.

I especially want to draw attention to the second paragraph where I outlined my logic in choosing these people - some because they are experts in behavioral ecology. Some because they are statistically savvy and far enough outside the field that they have no personal or professional bone to pick with Jonathan. Jay Stachowicz precisely because I might expect him to be sympathetic to Jonathan (a former postdoc of Jay's), for instance. I wanted to stack the deck in Jonathan's favor to make the committee's fairness unimpeachable (* hold that thought).

A month later (April 19, 2020) I received the committee's report and forwarded it to Jonathan Pruitt, and to all his co-authors, inviting them to respond. All co-authors confirmed receipt of the email. No co-author contested the critiques of the data, and most confirmed they agreed with the critiques.  All co-authors who responded affirmed that they agreed the committee membership was fair and exhibited no cause for concern about bias.

Jonathan Pruitt rapidly responded asking that the Laskowski et al paper, which kicked off the whole process, also be subjected to evaluation. I declined, noting that we had already completed that retraction for what I judged to be valid reasons, at the request of all coauthors including himself. More importantly, Jonathan criticized the choice of all members of the committee, claiming that all of them were biased and inappropriate to judge his data (Flo and Jeremy for instance because they had already done a lot of work judging his data and already posted findings). I repeatedly offered to add other arbiters that he might suggest (hoping he would commit to names that he would then be unable to criticize), but he never offered such names. In my personal interpretation, had he offered any names he would have then been unable to sustain the ad hominem attack strategy against the jury, and so he ignored the request.

The other main subject of discussion at this stage (April 30, 2020) was whether Jonathan could simply delete the duplicated blocks of data, re-run his analyses, and publish corrections. Jonathan repeatedly (at many journals) pushed this solution. The reason for our denying this is nicely summed up by one committee member, who wrote: "In my opinion a confirmed case of duplicated data calls into question the validity of the entire dataset. Simply excising the cases we’ve been able to identify does not give me confidence that the rest of the data are OK, and if this were a case in normal editorial process where I had a reviewer point out anomalies of this type in the data I would be incredibly uncomfortable letting the study see the light of day, no matter which data were censored. While I know we must reserve judgement about how these anomalies crept in, the simple fact they are present at all suggests the entire datasets in which these sequences appear are suspect." This view was unanimously supported. Moreover, we noted that the duplicated blocks of data, if a copy-and-paste error, must have overwritten previous data (otherwise they would have greatly inflated his sample size and been noticed as a mismatch between experimental design and dataset size). To make this really clear, if we have a string of numbers (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) there are only two ways to get duplicated blocks: 

a) (1, 2, 3, 4, 5, 1, 2, 3, 4, 5) - which overwrites data, so it would be inappropriate to just delete one block (which one? what numbers were overwritten?); or,

b) (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5) - which inflates the sample size. As Pruitt did not note any sample size inflation, we must infer that (a) was the issue, in which case re-analysis with the duplications omitted would be inappropriate.

Without the original paper data sheets to determine the correct values that were pasted over, simply deleting the duplicates is not enough, because there are data that were obscured as well and would need to be recovered. No paper data sheets have been provided at any point, despite undergraduate assistants' assertions that using paper data sheets was standard practice.
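To make the logic concrete, here is a minimal R sketch (again my own illustration, not the committee's code) that finds repeated blocks within a single column; the toy vector mirrors case (a) above:

find_repeated_blocks <- function(x, k = 5) {
  # every k-long block of consecutive values, written out as a string
  starts <- seq_len(length(x) - k + 1)
  blocks <- sapply(starts, function(i) paste(x[i:(i + k - 1)], collapse = ","))
  unique(blocks[duplicated(blocks)])
}
find_repeated_blocks(c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5), k = 5)
# returns "1,2,3,4,5"

Finding the duplicate is easy; restoring the five observations it presumably overwrote is impossible without the original data sheets, which is why deletion-and-reanalysis was judged insufficient.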


1.7 Intimidation. And now we get to the step where things became even more stressful for co-authors and editors. On February 12th I was alerted that other journals, which were actively pursuing corrections or retractions or Editorial expressions of concern, had received threatening letters from an attorney, Mr. McCann, who insisted that all journal evaluations be paused until McMaster University concluded its investigation. Note that CoPE guidelines do not, in fact, require that journals wait for University investigations - they say "institution", and the journal is an institution. Moreover, the investigation is only necessary if the case for retraction is unclear. 

I later learned that co-authors who were seeking retractions began to also receive such letters. The thinly veiled threat of legal action was enough to have a massive chilling effect, as evidenced by the three different sets of authors who had specifically told me they wanted a retraction or correction at AmNat, but who then received such letters and were not comfortable continuing the process.

A couple of weeks after this (February 29th) I received a lengthy email from Jonathan Pruitt demanding answers to a series of questions that sounded to me as if they were written with a lawyer's input. The questions mostly focused on the public database I had generated and allowed others to populate with information. His letter claimed that the database contained incorrect information and had been modified by individuals with conflicts of interest, and asked whether I accepted responsibility for its contents. This was the first allegation I had heard of any inaccuracies in the data sheet. Namely, some people had posted that retractions were in progress even though they had not been finalized by the journals in question. The delay in approving retractions was due in large part to the chilling effect of his lawyer's letters. In short, his legal actions had created the situation in which the spreadsheet was not quite accurate. The contents of the spreadsheet were the collective work of many people reporting what they genuinely understood to be true, so it is clear (based on my own consultation with multiple lawyers) that what we did was defensible, based on the First Amendment protections for free speech in the US, the Speech Act, and definitions of libel and defamation. Nevertheless, I felt extremely threatened and immediately removed public access to the spreadsheet, which has remained closed since (all requests to view it have been denied). Someone unknown to me did create and publish a copy without my consent, and someone else created a table of retractions on Jonathan's Wikipedia page.

On April 30, while exchanging emails with Jonathan and the committee about his responses to their concerns, the Proceedings of the Royal Society B published five Editorial Expressions of Concern, noting that data concerns existed and evaluation was ongoing. Realizing that Jonathan might take time to respond to the AmNat committee's concerns, plus time for co-authors and the committee to re-evaluate, and maybe another cycle of comments, I recognized we were looking at a lengthy process. It would be appropriate, per CoPE guidelines, to publish Expressions of Concern noting that the papers were under consideration: basically a place-holder to alert the community pending a final decision. This is a common use of EoCs, approved by the Committee on Publication Ethics, and court decisions in the US have established the precedent that academic journal Editors have the right to publish such Expressions of Concern. And yes, I was reading a lot of law and court decisions in February through April 2020. So, on May 1 2020 I emailed the University of Chicago Press, which publishes and owns The American Naturalist, with a copy of the report and a request to publish Editorial Expressions of Concern. The publisher examined the report and my proposed text, and approved this on May 8. Out of courtesy I notified Jonathan of our intent to publish the EoCs. The same day, Jonathan replied indicating that he thought EoCs made sense and were understandable, and thanking me for alerting him. A few hours later, his lawyer sent a letter to the University of Chicago Press, critiquing my conduct, my credibility, and my choice of committee members to evaluate the work, and demanding I recuse myself from any further involvement. The truth is, my heart leapt at the idea of handing the whole unpleasant, labor-intensive mess off to someone else, and I eagerly offered to recuse myself as requested. The Press asked me to sit tight while they thought this over.

A copy of his lawyer's letter is appended at the very end of this blog post. It claims to be confidential, but I have asked five different lawyers, all of whom agree that there is no basis for such a demand. I post the letter despite its false characterization of my actions and motives. I post it without comment (though I gave a detailed rebuttal to the University of Chicago Press), because it does not merit a detailed point-by-point public rebuttal.

So, to recap: on April 30th we got this letter demanding I be removed from the case, and the Press asked me to pause the process. It wasn't until August 4th that the Press confirmed that I was within my rights to proceed with the evaluation process as I had planned. They recommended that I not recuse myself (which I grudgingly accepted), because in the Press's view I had full right as Editor to decide upon the process and outcome.

I received no further communications from Jonathan's lawyer, and only minimal direct communications from Jonathan himself after this (he and I had emailed extensively from November 2019 through April 30, often many times per day, many days in a row). The only other element I'll note is that when our process resumed, a number of co-authors, Associate Editors, and Editors (myself included) were subject to Freedom of Information Act queries asking for all emails pertaining to "Pruitt". 

If nothing else, the letters and FOI requests had the chilling effect of delaying the evaluation process. I use the term 'chilling effect' deliberately here, as it is a key legal criterion for when threats and intimidation become suppression of free speech and scientific publishing, in contradiction of the Speech Act and the US First Amendment. Co-authors who had stated in writing their intent to submit retraction text did not do so. Journals that had approved retractions put them on pause (and in so doing, rendered the Google sheet document temporarily inaccurate). But eventually chills wear off and the thaw follows.

1.8 The thaw. In August, Pruitt provided rebuttals and explanations in response to the committee's report. These were sent by his lawyer to the University of Chicago Press, who sent them to me. The co-authors commented on those rebuttals (indicating skepticism). The committee made recommendations to me based on their original evaluation, the rebuttal, and co-author comments. In all cases I followed the committee's recommendations. One was a minor alteration to a data file on Dryad, which we requested. One was a correction noting suspect data in a supplement, which was immaterial to the main point of the paper (a theoretical model neither parameterized by the data nor tested with the data). Two were additional retractions. And the most recent was a paired Correction (at the request of all three authors) that the editorial committee and Editors found unconvincing, so an Expression of Concern (coauthored by the whole committee) was published alongside the Correction.

The process of these closing steps was notable in several ways. 

First, Pruitt did not acknowledge or reply to offers to sign on to any of the later retraction notices, though he did sign onto the Correction to Pruitt, Sih, and Stachowicz 2012 (and onto the first retraction made public in January 2020). In every case, all authors other than Pruitt signed on to the retractions.

Second, we were on the verge of accepting the most recent Correction (to the paper on starfish and snails) when the journal received an anonymous submission (via Editorial Manager) of a critique of this same 2012 paper. Our investigation had not identified the same kinds of large blocks of repeated data that were the hallmarks of multiple other retractions. There were blocks of duplicates, but only marginally more common than null expectations, so not strong evidence of a problem. There were more minor errors, and some weird inconsistencies in rounding (far too many x.5's, not enough x.4's) that could be attributed to a careless undergraduate (as Pruitt implied), but nothing that called into question the validity of the data. But this Comment raised some new issues we had missed, with a detailed statistical analysis showing greater similarity in snail shells between replicate blocks than could be explained by random assignment. In his Correction, Pruitt replied that snails were not actually assigned randomly to blocks (contradicting the Methods text originally published), but provided no statistical or simulation evidence that his non-random process could generate the odd data overlaps. Conversely, the anonymous commenter then showed that Pruitt's explanation is unlikely to be valid or to explain the problem. The details are provided in the recent Expression of Concern. What I want to note clearly here is this: the snail size data have patterns of repeated numbers, much like previous retractions, though not in blocks. So it would seem reasonable in this case to retract. Why didn't we? The logic is this. First, this is the only paper where the co-authors both supported Correction rather than Retraction. Second, the patterns in the data identified by the anonymous individual were less shockingly egregious than in other cases. Third, the Committee on Publication Ethics (rightly or wrongly) suggests that retractions are warranted when core conclusions are affected, and in this case the snail size data were ancillary to the main point (snail behavior interacts with predator behavior; snail size was not in fact under selection, nor was that selection contingent on starfish behavior). Together these three points left me balanced between retraction and the Correction + Expression of Concern approach. I opted for the latter because it allowed Pruitt to have his own voice in a public response while letting us clearly and publicly evaluate the claims he makes. Personally, I do feel that retraction would be warranted for this paper, but the Correction and EoC approach had its advantages as well, allowing the authors to make their case while still permitting editorial rebuttal.

The final point is an essential one. One of the papers was a mathematical model inspired by some data hidden away in a supplement. The data were not used to choose parameter values, or anything formal. But the data exhibited many of the same kinds of problems we've seen already. So the authors (Pinter-Wollman et al) wished to note their mistrust of the empirical data, but their continued support for the core focus, goals, and findings of the paper. This is a great example of where the flaws are secondary to the focus of the paper, to the point where a Correction seemed like a reasonable route, in keeping with CoPE recommendations. However, the day that the Correction was published, we were notified that the empirical data invoked in this paper (ostensibly about a species of spider, Stegodyphus dumicola, collected in 2014 in the Kalahari) were in large part identical to data from a Behavioral Ecology paper (Keiser et al) that described the same numbers as coming from two other species of spiders (Theridion murarium and Larinioides cornutus) in Tennessee in 2010. It thus is plain that data were duplicated across studies and repurposed for different biological settings. Whether this was intentional or a result of carelessness, I again cannot say. But in my own personal view this is clearly malpractice with data, whether intentional or careless. The question then is whether we retract a valid mathematical model, out of guilt-by-association with tainted data, in order to punish (since it is not just a question of correcting an error: the mathematical model is itself self-consistent and valid). In my view it is not the role of editors to punish, but to act to ensure the high quality of published work. Punishment is the purview of the employer of the scientist responsible for malpractice.

In parallel to all this, I was proceeding with a process as a co-author of a Proceedings of the Royal Society B paper. Our initial investigation into the paper in question (on 'behavioral hypervolumes') didn't reveal any evidence of serious flaws, and we were close to signing off on a minor Correction. But a series of observations raised new concerns. Namely, for a set of observations in the study, it appeared likely that the numbers had been typed in a non-random way. On a laptop keyboard, the numbers are arranged from 1 on the left to 9 on the right in a single row. When typing in numbers "at random", people readily type adjacent numbers, or certain ending numbers, more often than expected. In this dataset I observed that certain combinations were vastly over-represented. For example, numbers ending in 78 (adjacent keys) were far more common than numbers ending in 77 or 79. The same is true for 67 (relative to 66 or 68), and for almost all adjacent pairs of numbers. I can think of no biological basis why times on a stopwatch should fall into those clusters, and so the co-authors and I (except Jonathan) asked to be removed from the paper when the journal decided to request a Correction from Jonathan.
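For the curious, here is a minimal sketch of that terminal-digit check in R; the vector times_txt is hypothetical, standing in for the raw column read as character, exactly as it appears in the file:

digits <- gsub("[^0-9]", "", times_txt)           # strip decimal points
last_two <- substring(digits, nchar(digits) - 1)  # final two recorded digits
sort(table(last_two), decreasing = TRUE)
# Honest stopwatch times should have roughly uniform endings; a large excess
# of adjacent-key pairs (78, 67, 89) relative to their neighbors (77, 79, ...)
# is the keyboard signature described above.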


2. Major lessons learned:

First, one lesson is that this was an immensely long process generating vast numbers of emails, R code files, and images of data. And it feels very cathartic to get the experience written down here. So thanks for reading. But the real lessons as I see them are:

2.1. The central role of good data sharing. The journals that required data archiving were vastly better able to evaluate the data and suspicions than journals that didn't require archiving. All journals should require this. We also found that quite a few data archives were incomplete, highlighting the need for better enforcement of compliance: good meta-data, with all relevant variables included.

2.2. Even with data sharing, we can't detect all forms of fraud or error. Although there were some recurrent themes (e.g., blocks of repeated data), this isn't something we normally check for when a colleague emails us data to publish. People had to build new software in R to detect the problems that were first noticed (by Kate Laskowski) by eye. Sometimes it was terminal digit analysis (like the 78 repeats I just noted), sometimes it was excessive overlap of numbers between mesocosms. There are an infinite number of ways to introduce errors into data, by accident or intent, and we just can't catch them all.

2.3. The importance of coordination between journals. The journals' Editors were super careful not to bias each other's evaluations of papers. But discussions were essential to learn best practices from each other, such as the suitable use of Expressions of Concern, or how to set up committees to evaluate concerns. This was a new experience for almost all of us, and so having a community of peers to discuss due process was valuable. But even more crucially, each of us might not have known what to even look for, without some indications to each other about what we had found. This is particularly evident from the more recently emerging evidence that some data sets are duplicated across papers in different journals, ostensibly about different species of spiders on different continents. This recycling of data is blatant (though from an abundance of caution I'll say again I don't know if it was intentional), and can only be detected by coordination among journals and comparisons across papers. Thus, a collaborative process between journals is not only helpful, it is crucial. Note added later: journals use iThenticate to cross-check prose for plagiarism from other papers. Can we do the same for data? Of course some data recycling is entirely appropriate when asking different questions and acknowledging the intersection between papers. But some is clearly done to mislead.
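As a sketch of what an 'iThenticate for data' might look like, one could simply ask what fraction of the values archived for one paper reappear, to full recorded precision, in the archive for another; the file and column names below are hypothetical:

a <- read.csv("paper1_dryad.csv")
b <- read.csv("paper2_dryad.csv")
shared <- intersect(a$boldness, b$boldness)          # values identical to full precision
length(shared) / length(unique(a$boldness))          # fraction of paper 1's values recycled
# For continuous traits measured on different species, in different years and on
# different continents, this fraction should be essentially zero.

Such a check is crude, and as noted some overlap can be legitimate, but it would at least flag cases like the one above for a human to examine.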

2.4. Should we be Bayesians about misconduct? Throughout, we sought to treat each paper in isolation. But many colleagues object, saying that we should be updating our priors, so to speak: with each additional example of errors in data we should grow more skeptical about the validity of as-yet-untarnished datasets by the same author. That's a defensible position, in my personal opinion, but I went against my own conscience in trying not to be Bayesian here, to make the process as objective as possible. The simplest reason is that a fair number of his papers were based on data generated by others. We absolutely should not leap to the conclusion that everyone he collaborated with was equally problematic in their data management practices. Having said this, it is absolutely pertinent that there is a repeated pattern across many papers and journals. If there are duplicated blocks of data in one, and only one, dataset, I can readily ascribe that to a transcription or copy-and-paste error. If most datasets have such errors, accident seems highly improbable and the case for systematic intentional fraud becomes ever stronger. But even if the systematic errors are unconscious (e.g., difficulty copying data from paper into a spreadsheet due to a cognitive disability), as a community we cannot trust the work done by someone who is systematically generating flawed data.

2.5 Why are manuscripts guilty until proven innocent when we review them before publication, but innocent until proven guilty when it comes to flaws and retraction after publication? The simple answer is that the impact on individuals' lives is asymmetric. Reject a manuscript, and it gets revised and published elsewhere. Retraction has massive negative effects on someone's psyche, career, and reputation. Because the personal and professional impacts are asymmetric, the standards of evidence for the two decisions are similarly asymmetric. Now, there's another approach that might be better. If we destigmatize retraction (while retaining the stigma for fraud and misconduct), we make it easier for people to retract or correct honest mistakes. The result is an improved scientific literature, with retraction becoming an encouraged norm when warranted. Again, see my recent blog post about the philosophy of retraction for more detail.

2.6 Minimizing collateral damage. During a process such as this, co-authors' work comes under scrutiny as well, because any paper with the central individual as a co-author is questioned. This is especially true in this case, where Pruitt had an established and self-acknowledged habit of providing data to others, for them to analyze and write up. The online database served first and foremost to 'rescue' the reputation of papers that (i) were based on data collected, analyzed, and written up by other individuals, or (ii) were theory or reviews that did not entail data, or (iii) were cleared of errors through the investigation process. The primary hope of everyone involved was to find as many papers as possible that could be placed into these categories, to retain the reputation of as many papers and authors as possible and minimize collateral damage (and, at first, damage to Pruitt as well). This is why co-authors eagerly contributed to the database, and added retractions as they requested them (not realizing that a requested retraction might then be delayed or denied by the journal due to the chilling effect of lawyers' letters). But on balance the value of the database was primarily to encourage continued citation of papers that were untouched by the data problems. The removal of the database from the public eye, at Pruitt's demand, exacerbated the collateral damage to his coauthors. I regularly received emails asking for access to the database, which I denied out of fear. Often those emails involved a request for help in judging whether a paper could be safely cited, and I felt that the spurious and unfounded legal threats against me obligated me to be unhelpful. So, I would reply that the researchers needed to come to their own conclusions about citing the paper in question. I deeply regret that I wasn't more proactively helpful, for a period of time, in supporting citations to papers that remain sound science. Even to this day, I think there is no resource where researchers can go to check whether a paper is judged to be okay to cite; they can only find the negatives (the retractions or Corrections). The public and finalized retractions are listed on his Wikipedia page. Given that some journals are still conducting evaluations, this one-sided information only serves to harm his co-authors.

2.7. Think about mental health of authors. Retraction is stressful, and might induce depression or worse. Conversely, we can't let authors hold publication decisions (including retraction) hostage with threats of self-harm. This is a tough tension to resolve.  

2.8 Editors will sleep better at night if they buy liability insurance. The letters from Pruitt's lawyers were remarkably effective at generating stress among many editors, slowing or stopping actions by journals and by co-authors. As noted above, I had received confirmation from three sets of co-authors that they wished to request retraction on the basis of concerns about data that they identified, and/or that were identified by Associate Editors of AmNat or by third parties. After receiving lawyers' letters, none of those authors felt safe to actually write the retraction statements, and we received none until the journal had completed its investigation process in the fall of 2020. Even within the journal, the lawyer's letter (provided in full below) caused a pause on all deliberations from early April through early August. This is what is known as a "chilling effect", and it is a topic on which there are lengthy legal opinions protecting Editors' and scientists' decisions and actions in the face of legal threats. But, as most of us scientists are not legal experts, it is extraordinarily stressful to be looking down the barrel of a potentially costly lawsuit, even when one is fully confident that the scientific facts are on one's side. I talked to lawyers in private, at the University of Chicago Press, and at the University of Connecticut, and all were confident that the threats had no teeth, but it still kept me up at night. When it did, I only had to crack open some of the Dryad files and examine the patterns in the data to reassure myself that the evidence of scientific error and biological implausibility was clear and incontrovertible, and thus that the actions and statements I made were correct.

2.9 Public statements. A retraction or correction that is done quietly has no impact on people's beliefs about published results. It is essential that when a prominent paper is retracted or corrected, this action be publicized widely enough for the community to be aware. This publicity is essential because it serves to make people aware of changes in what we understand to be scientifically valid, changes in our understanding of biology (e.g., removing a case study from the buffet of examples of a concept). The purpose of the publicity is not to harm the author involved. Far from it: in my experience, when authors are proactive about publicizing their own corrections or retractions, they receive adulation and respect from the community for their transparency and honesty (e.g., Kate Laskowski). A public process of disseminating information about corrections or retractions only becomes harmful when it is clear that the changes stem from gross scientific error that should have been readily avoidable, or from fraud or other misconduct. Or when it is clear that the author fights retraction tooth and nail to create a chilling effect. In those cases, it is the authors' own actions that are the source of the harm, and the dissemination of information about retractions is a service to the scientific community to correct erroneous knowledge arising from the authors' improper actions.

2.10  Be patient. When you submit a complaint to a journal, there are many steps we go through to ensure a fair and correct outcome. We screen the initial complaint, and if it seems valid we assemble a committee to evaluate it. We obtain a reply from the authors. Sometimes we do so separately if the authors don't see eye to eye, sometimes as a group. If the authors disagree with the critique, we send the critique and the rebuttal to review by experts who know the statistics, or biology, well enough to give a detailed evaluation. We then synthesize the reviews and critique and rebuttal to formulate a decision. Some journals did many rounds of back-and-forth with the author in question. Note also that when an author is facing criticism on many fronts (dozens of papers at multiple journals), they aren't going to be fast about any one paper. This is where Editorial Expressions of Concern (which I sought to publish, but was blocked by legal threats spooking my publisher) come into play - they can alert the community that an evaluation is underway, early on, giving breathing room to do a thorough and fair evaluation. PubPeer also serves the role of early notification to the community. But in the particular case of Pruitt's papers, some PubPeer posts were later found to be incorrect. Leveling incorrect accusations in a non-peer-reviewed venue troubles me, which is why I prefer the slower but more thorough review process inside a journal.

Above all else, I believe that science requires open discussion of data, clear documentation of due process, and dissemination of findings. We have followed due process, and reached findings that resulted in author-requested retractions for three papers (with full agreement of the entire Editorial board of 3 editors, the journal, the 6-person committee of Associate Editors, and all but one author, in each case). Two other papers have received Corrections, and one of these also has an Expression of Concern. Now that the back-room deliberations are complete, in the spirit of scientific openness about process, I felt it was time to clearly and publicly explain the logic and process of my involvement in this series of events. As a community we can only learn to (1) prevent and detect such cases, (2) adjust our understanding of biology, and (3) improve procedures for future cases, when the details of the events are clearly known.

Coda

This may not be the end of this story, though I sincerely hope that it is the end for me. Investigations are ongoing at other journals, and at institutions. But on balance, my task is done, as both Editor and co-author. The threat of legal action still hovers, and I worry that posting this blog will stir up that hornet's nest. But with each new retraction at another journal, arrived at independently through processes outside my control and without my influence, the evidence grows that there was a deep and pervasive problem. Should this ever wind up in court, it is easy to point to the data and make it clear that there was a strong biological and statistical rationale to doubt the data in these papers. We've bent over backwards to pursue a fair and equitable process, treating each paper separately, and bringing on advisers who were, if anything, likely to be on Jonathan's side or neutral arbiters. We have coordinated between journals, because that's essential to learn from each other and detect problems that cross journals (e.g., data reused for multiple papers ostensibly about different species). In short, I've learned a great deal about effective ways of processing these kinds of problems. And I've seen journals that performed admirably, and journals that didn't (yet).


Acknowledgements

This post is dedicated to the committee members who assisted The American Naturalist with its investigation - Jeremy Fox, Florence Débarre, Jay Stachowicz, Alex Jordan, Steve Phelps, Emma Dietrich, and to the many co-authors who assiduously worked to evaluate concerns about data in the face of intimidation.


Supplement

As a supplement to this document, I am providing a copy of Pruitt's lawyer's letter. I am providing it without comments, though nearly every paragraph contains statements that are demonstrably false, or misrepresentations, which I can prove with emails as needed. Just to pick a couple of examples: at no point had I "contacted the editors of more than 20 academic journals to ask them to investigate Dr. Pruitt" - they received whistleblower complaints from someone else, and I had no role at all in that. Many of them then emailed me to ask what my procedures had been for responding. I also was not involved in "guiding [Laskowski] through the analysis that led her to conclude that the paper should be retracted" - she did that on her own, after concerns were raised by Erik Postma and Niels Dingemanse, with zero input from me about the analysis. The letter is riddled with such errors, and it casts aspersions on me, my motives, the committee that served the journal to evaluate his work, and many others. So, please read the following with an appropriate level of skepticism as to its contents. Also, I should state up front that multiple lawyers confirmed for me that all of my actions were appropriate, ethical, and protected under the US First Amendment's free speech clause and the SPEECH Act, and that the request for confidentiality at the top of the letter has no legal basis.

Monday, May 10, 2021

Wish to register a complaint

The following is a guest post by Ken Thompson.

Recently I took the unusual step of publicly posting a technical comment about my own first-authored published work. Why not just publish a correction or a retraction? This warrants an explanation and I hope to provide that with this post.

 

The study’s premise was simple: identify species in the same plots using either (a) trained botanists identifying plants by morphology or (b) untrained workers who collect tissue to be sent back to the lab for molecular identification using DNA barcodes. My co-author, Dr. Steven Newmaster, provided me with the data, which consisted of (a) species identifications for both survey methods and (b) environmental data about each plot (e.g., aspect, canopy cover). I never saw any of the raw data or the DNA sequences, and the work was done when I was an undergraduate.

 

I have had concerns for a while but was too afraid to say anything. Related recent events motivated me to get to the bottom of it. I reached out to folks who are familiar with COPE (Committee on Publication Ethics) guidelines (who I won’t name, but you can name yourselves if you wish) and they advised me to ask the University of Guelph to investigate.

 

I did just that in February of 2020, and after an inquiry --- which took approximately eight months when it should have taken less than one --- the University of Guelph declined to investigate further. Even as I transmitted additional information, they continued to see no reason to proceed. They did not provide any justification, nor did they speak to me at any point.

 

I then went to the journal (Biodiversity and Conservation) and asked them to investigate. After several months, the Springer Nature Research Integrity Group decided that it was not their responsibility since the University of Guelph had already drawn a conclusion.

 

Having come up short with both the University of Guelph and the journal, I feel it is now prudent to share the details publicly. I submitted a technical Comment to bioRxiv as a pre-print, to publicly discuss in detail my issues with the data, and at the same time submitted the Comment to the journal for formal review. BioRxiv declined (as a matter of policy) to post a Comment on another paper, so I have posted it on my own platform.

 

 My technical comment makes four points which I will briefly summarise here. First, although we claim the study was done at a particular site in Timmins, ON, evidence I gathered suggests that the data are from sites that are over 500 km away. Second, we did not archive the molecular data as we claimed in the paper.

 

Before listing points three and four there is one additional important piece of information. After being alerted that I was looking into the data, my co-author began uploading thousands of DNA sequences to GenBank that are associated with the paper and my name. These data are key to points three and four.

 

The third issue I outline in my technical comment is that, using the data recently uploaded to GenBank, I cannot reproduce some of the key results of the paper (i.e., distinguishing congeneric species). Finally, I show that the recently uploaded GenBank data are unexpectedly similar to data uploaded for an independent study at the University of Guelph.

 

I have submitted the technical comment to Biodiversity & Conservation for consideration too, hopefully to force them to respond substantively to my critiques. I have also formally requested an authorship-removal retraction because I can no longer stand by the results of the study.

 

All the data are now public, linked via the pre-print. I invite and encourage anybody who is interested to have a look for themselves.

 

So, why go public? Dealing with this alone behind the scenes has been incredibly isolating. I don’t want to deal with it alone anymore, and I hope that by sharing an evidence-based critique of our paper some people will choose to support me here. Ultimately, I want to arrive at the truth. I am convinced that going public is now the best way to do this. Let me also take this opportunity to say that I am not accusing anybody of anything untoward.


I do have another reason for going public. I truly feel that the University of Guelph and Springer Nature have failed to uphold their standards of research integrity. I believe that the evidence presented above is sufficiently serious that any responsible body would find it prudent to investigate. 

 

Ken A. Thompson 

Sunday, May 9, 2021

On retraction

Prologue: The subject of scientific retraction has been very much on my mind in the past year, as an editor, a co-author, and a member of the scientific community watching retractions (or demands for retraction) play out at other journals. Most recently, I had a stimulating Oxford-style debate with Dr. Elisabeth Bik, for the annual meeting of the Council of Science Editors, which had me thinking at length about the procedures we follow and decisions we make. This culminated over the weekend in a lively conversation about the science, law, and ethics of retraction while on a hike with a friend who is a lawyer. Inspired by that conversation, by the past year and a half, and by my own experience retracting a paper, I feel driven to share some of my personal perspective about scientific retraction. I want to note that in this essay I am expressing my personal opinions. These opinions do not necessarily reflect my behavior or decisions as Editor: in that role I am bound by traditions and procedures that may differ from my personal preferences. I also want to emphasize that I will be speaking in broad generalities of ideas in this essay, and not about particular cases. The following text is meant to be thought-provoking and to stimulate debate; it is not a definitive statement of editorial policy.
    - Dan Bolnick

Let's begin by considering what retraction is for. They say science is 'self-correcting', meaning new findings get published that may contradict and eventually eclipse old misconceptions. So why do we issue retractions at all, when we could just let new papers lead us forward? I see two distinct motives (there may be more, but I think these are the two major principal component axes, so to speak).

1.  Removing information from the public record that we now know to be false. Sure, new papers may later get published that correct our view. Self-correcting science and all that. But, like an old land mine from a forgotten war, the incorrect paper is still there for some unwitting reader to stumble across and, not knowing about the newer work, cite and believe.




2. Punishment for misdeeds. The retraction itself may be viewed as a form of public censure, a slap on the wrist.


I want to do a bit of a deep dive into each of these, because each seems simple at first glance but has complexities and caveats once we delve into them. To start, it helps to recognize that we can have concerns about papers for many distinct reasons. Sure, there are obvious and egregious cases of fraud where the data and conclusions are fundamentally false - duplicated or altered images that are central to the conclusions of the paper, for instance. But it can be more subtle than that, so let's do something analogous to the classic table we all learn about for type I and type II errors in statistics (false positives, false negatives). The rows of the table concern the validity of the conclusions: a paper can report results that are factually correct, debatable, or incorrect. The columns are about the evidence, which can be rigorous, flawed/low quality, unethical in part, or fraudulent at heart.

Conclusions are:      Rigorous evidence     Flawed evidence           Unethical in part     Fraudulent at heart

Valid                 Great paper!          Right, but for the        Fruit of the          Falsifying to reach
                                            wrong reasons             poisoned tree         a true inference

Dubious               Over-reaching         Debatable & unclear       Bad scientist!        Why bother?

False                 False positives       In good faith. Honest     Really bad            Falsifying to reach
                      happen.               mistake. Luckily,         scientist!            a false inference
                                            science is
                                            'self-correcting'


I won't take the time to consider every single permutation, but let's start with a case study. On my hike yesterday, my lawyer friend asked whether Charles Darwin's Origin of Species was correct on all points. Clearly not, I responded. In particular, Darwin had a deeply incorrect understanding of genetics, that led him astray on a few points. We know better now. Science being self-correcting and all that. So, my friend pressed, why don't we 'retract' Origin of Species (nevermind that it wasn't published in a journal)? It contains errors. We know that unambiguously. But it is correct in the main, and of historical value.

Let's proceed to a trickier one - Wilhelm Johannsen published a paper in The American Naturalist in 1911 expounding the genotype concept of heredity, a revolutionary paper that defined 'genotype' and (to quote a recent historical paper: "the theory must be recognized as a creation that provided the theoretical foundations or the framework for the upcoming reductionist material science of genetics, quite beyond the perception of its instigator"). But, towards the end of that foundational paper, Johannsen makes fun (rather harshly) of the notion that genes have anything to do with chromosomes.




So, now we have a classic paper in a journal (hence amenable to retraction, unlike Origin of Species) that makes a false claim. Should we retract it? Well, my first instinct is that it is old and part of the historical record. We make mistakes in science, and we advance beyond those mistakes, and it is valuable to historians of science to leave our tracks in the sand as we stumble towards (the mirage of?) some truth. There's a legal idea of a statute of limitations - a time period beyond which someone cannot be charged with an old crime. The time frame varies with the severity of the crime (in the eyes of the law), and some crimes have no such limit. So is there a statute of limitations on being wrong? If so, how old must a paper be to be retraction-proof when proved wrong? Nested Clade Analysis was wildly popular among phylogeographers when I was a beginning graduate student, but we now know the emperor had no clothes, and nobody uses it anymore (I hope). Is that old enough to escape retraction? How old is too old to bother? In law, the answer depends on the severity of the crime and on the culture and nation. It also matters a great deal whether the action was illegal at the time it was committed: today the standard is to judge past actions by the rules of their own era.

Or is it not about age, but intent? Do we tolerate good-faith error? Johannsen had his reasons, and reached a reasonable and honest conclusion by the standards of the day with the evidence available to him at the time, and we should not judge him harshly for not knowing what we know now. Templeton was sincere in his desire to reach phylogeographic inferences, and at the time we lacked the toolkit of approximate Bayesian computation (ABC) methods that can test complex biogeographic genetic models. He meant well, he did the best he could with the tools at the time, and if it didn't work as well as he thought, it was honest and well-meaning, and so no retraction is needed. As an even more obvious case, I might set out to test a null hypothesis with a statistical significance threshold of alpha = 0.05. Like, maybe I want to know whether drinking carbonated water cures COVID (hint - it won't, though luckily it's harmless); the null is that carbonated water has no effect on COVID. And let's say the null is genuinely true (carbonated water won't fix COVID). I do my experiment, and by sheer bad luck I am one of the 5% of such studies that find P < 0.05, and I publish a paper rejecting the null. I'm wrong, but I followed standard procedures. Should that fictional example result in a retraction once subsequent studies reach the opposite conclusion? Do we generally retract all significant results when subsequent, more powerful studies reach the contrary conclusion? Or vice versa? Currently, I think the standard is very much that errors made in good faith don't warrant penalties. If they are recent, then a correction is certainly called for - maybe even a retraction if the core conclusion is wrong.
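
That 5% figure is not hand-waving; it falls directly out of the definition of alpha. A minimal simulation (purely illustrative, with arbitrary sample sizes) makes the point:

    # Simulate many studies of a true null: two groups drawn from the same distribution,
    # each tested at alpha = 0.05. About 5% reject the null purely by chance.
    set.seed(42)
    p_values <- replicate(10000, t.test(rnorm(30), rnorm(30))$p.value)
    mean(p_values < 0.05)    # approximately 0.05

Every one of those 'significant' studies is wrong, yet each followed standard procedures in good faith.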

I did a retraction for this reason myself - a reader noted he couldn't reproduce a result with my data. I looked back at R code I had written in 2007, while first learning R, and found a one-line error based on a misunderstanding of how tails of probability distributions were being reported. It affected the core result. So even though it was in good faith, I retracted the paper (this was a few years ago). Knowing what I know now, I might have urged my past self towards a less extreme, knee-jerk reaction. I was driven by emotional horror. What I ought to have done was publish a Correction. The data are still factually valid. The question is still interesting. It's just that a positive result (morphological differences between individuals affect diet overlap between them) is now a negative result (no, they don't). That's still a useful result that I would cite, just for different reasons. This falls into the second column, bottom row - I had published flawed evidence for something that was false, and I self-reported the flaw.
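
For readers who haven't been bitten by this particular bug: the snippet below is not my 2007 code, just an illustration of how a single line about distribution tails can flip a conclusion. The statistic of 2.1 and the standard normal null are invented for the example.

    # Suppose we want the probability of observing a statistic at least as large as 2.1
    # under a standard normal null, i.e., the upper tail.
    pnorm(2.1)                       # wrong: the default is the lower tail, P(X <= 2.1), ~0.98
    pnorm(2.1, lower.tail = FALSE)   # right: the upper tail, P(X > 2.1), ~0.018
    # Confusing the two can turn an apparently significant result into a non-significant
    # one (or vice versa) - the kind of one-line, good-faith error described above.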

There's a related tricky issue particular to theory papers. I'm not aware of many cases (any, actually) of fabricated algebraic solutions or model results. But there are plenty of cases where authors made honest mistakes in their math, stating somewhere that x = y when in truth x != y. These can be simple typos made during writing (not analysis). Or they can be part of the calculations and, when corrected, might have little to no impact on the paper's conclusions, or could radically alter them. The former clearly merits a correction to the equation; the latter teeters between correction and retraction, depending on the severity of the error. But again, intent matters here. Math can be demonstrably wrong, and its errors provable, in a way that data analyses and experimental designs - which can be critiqued, but are more a matter of judgement - cannot. Yet we don't often retract for math errors, because corrections suffice and errors made in good faith are typically treated more leniently.

So on the whole, I lean towards the personal opinion that good-faith errors, whether a coding problem or a random false positive or false negative (they happen!), are best corrected as science builds on past results, rather than penalized. Positive results might be corrected into Negative results. And as a field we want that to be made public freely and openly. 

We want to encourage self-correction. When Corrections (or even Retractions) have a powerful negative connotation, scientists who find (or are told of) an error in their own work will naturally hesitate to Correct or Retract. I sure stressed about it. I shouldn't have. When I retracted and publicly explained why, I received incredible positive support from the community. We want science to be self-correcting, to more rapidly reach true inferences. If we penalize self-correction, if we put an onus on it, it will happen less. People respond to incentives. If you want something to be done, reward those who do it, not punish them.

Now let's delve into a grey area: sometimes scientists collect data and reach a conclusion that is actually valid in real life. But, the evidence that they use to support their conclusion is flawed. I don't have a specific case in mind, but hypothetically imagine a researcher sequences a population of fish to look for the presence of a particular allele. It turns out their sample was contaminated, or swapped for another sample, so they aren't sequencing what they think they are.  But by happy (?) chance, the contamination or wrong sample actually has the allele, as does the population they think they are sequencing. So they correctly conclude that the allele is present, but they do so for a wrong reason. This is maybe especially likely in ancient DNA research, where cross-contamination (and absence of DNA in focal samples) is a common problem. I like to think of this hypothetical case because it highlights the role of truth / falsehood in our judgements. As an editor or author, I would be very tempted to retract such a paper, and at a minimum a correction is needed. Their published report cannot in fact prove what they claim, because the evidence marshaled for it is fundamentally irrelevant or incorrect. Yet, this is done entirely in good faith, and the outcome is factually true.  This is the "Right but for the wrong reasons" cell in the grid above. So retraction isn't simply about being correct or incorrect.  Nor is it simply about malicious fraud / conscious misleading.

Just below this, in the "Flawed evidence" column, I put "debatable". This is for cases where scientists are tackling a difficult subject and the data are simply not definitive. It may be hard to collect good data, and the results may be ambiguous. The end result might look promising to one person (who favors the idea being tested) but sloppy and unconvincing to a skeptic. This is the stuff that scientific debates are made of. My favorite these days is the unnecessarily vitriolic debate over the neutral (or nearly neutral) theory of evolution. There are strongly held differences of opinion, and each side genuinely believes the other is deeply wrong. This is a scientific work in progress that may be resolved with further data or theory. It is definitely not the journal's job to retract something that is in this debatable category, even if it may eventually prove to be wrong as agreed upon by all parties in the future. We are a platform for vigorous informed debate, not partisans in it. And when the debate is eventually settled, we keep the record of that process, rather than deleting the losing side via sweeping retractions.

So to recap mid-way, I started by saying one reason to retract is "Removing information from the public record that we now know to be false." That's appealing and obvious. Except that we don't usually retract things that we now know to be false, if they are old, or if the error was arrived at in good faith. And sometimes we might retract things that are actually true, if they were arrived at for the wrong reasons. 

Let's explore this latter point some more, by considering true conclusions reached in bad faith. What if a published paper reports a true conclusion, with rigorous valid data and large sample sizes (sounds great, right?), but the data were obtained through unethical means? Maybe they didn't have a collecting permit to sample the focal species, or to do the experiment in the field. Maybe they didn't have IACUC (animal care) approval, or IRB (human subjects) permission? They smuggled their collections out of the country? We are moving into the "Unethical" column of the table above. Scientists, sadly, are sometimes overeager to do their work and adopt unacceptable methods. What do we do in this case? As an Editor (and in general as a member of the scientific community), I think a paper should not be accepted for publication if it didn't follow proper legal and ethical procedures. Let's go with a problematic hypothetical example. A researcher discovers a miracle cure for cancer. In their eagerness to test it they skip IRB approval, they skip informed consent, they administer it to unknowing, non-consenting individuals... and it cures their cancer. Should that paper be published? It is tempting to say, hell yes, we want a cure for cancer. But we got there through procedures we cannot condone. What to do? One might be tempted to keep the paper and result, but levy some penalty on the researcher. But ultimately, it is the duty of a journal to retract papers that are arrived at via unethical means, regardless of the validity of the outcome. That miracle cure for cancer? It'll have to wait for someone else to replicate the initial study via ethical and permissible means. That means a delay in treating people with cancer, and a loss of life, which feels deeply wrong.

(Note afterwards - one twitter reader noted that if the work was done unethically and in secret do we really trust the results to be true? A fair question, but for the purpose of exploring this ethical conundrum please suspend disbelief for a second and accept that in this fictional situation the actual scientific conclusion is documented rigorously enough that we can trust its veracity).

(Note afterwards - thanks to @SocialImpurity for pointing out this real life case that is relevant to this ethical dilemma)

There's a fascinating analogy here. In US courts, there is a legal metaphor known as "the fruit of the poisoned tree". (Note: this doctrine developed in American case law, and it is not a universal doctrine globally.) Say there is a burglary. A cop catches an accomplice and, in order to find the ringleader of the gang, tortures the accomplice (that's not OK) into revealing the location of the getaway car. Finding the getaway car, which matches the security videos, and has fingerprints all over it and the money hidden under the seat, the police identify, capture, and prosecute the ringleader. The good evidence of guilt (the fingerprints on the car with the money) is the product of a bad process (torture, the poisoned tree). And the fruit of the poisoned tree is inadmissible in court. Here's the bit that makes many people uncomfortable: the thieves go free. Not because they were innocent. They absolutely were guilty. But because the evidence of their guilt is the result of a flawed process, we throw it all out. Much like we would throw out all scientific results (the fruit being the miracle cure) that arise from the poisoned tree (the IRB human research violations). The motive in both cases is to deter bad behavior (by scientists, or police) in pursuit of a good outcome. The consequence is that good outcomes (the cancer cure, convicting the guilty) are set aside in the full knowledge that they are good outcomes. Now, I think many of us get uncomfortable at this result, and would argue that the guilty criminals should be convicted and the police punished. But that's not the American legal tradition. I should note that the fruit of the poisoned tree metaphor is strictly about police behavior, and has no legal teeth that extend into other areas like scientific publishing.

Note that the fruit of the poisoned tree doctrine will probably make you squirm. It makes me squirm. I'd love to see that miracle cure made available. But what ethical boundaries are you willing to waive? Experiments on people without consent? That's bad. But do the ends justify the means for you? That's the logic (I presume) that drove the Tuskegee syphilis experiments on Black men, or Dr. Mengele's horrific experiments on Jewish concentration camp prisoners. Are you willing to use the results of those experiments, willing to publish them, or let them remain on the books? I'm not.

But these are extreme cases. Let's turn the dial of moral outrage down, and likewise adjust the slider bar away from "cancer miracle cure" towards something more realistic. Would you publish (or retract, if already published) a paper that used stolen archaeological artifacts? Genome sequence data taken from non-consenting individuals unaware of what their drawn blood was to be used for? Currently the expectation is that Editors like myself would reject papers that used ill-gotten samples to reach a conclusion, regardless of the validity of the conclusion. And that standard makes no reference to how important the conclusions are: an interesting intellectual advance and the miracle cure are equally subject to it.

So, hopefully we are in agreement that a scientific result (whether true or not) arrived at through unethical means will get rejected (and maybe the experiment redone properly and ethically, if perhaps more slowly). But let's unpack this some more and ask what the statute of limitations is here. How far back does the principle apply? Institutional Review Boards for human subjects research began in 1974 (see this article for a useful history). That means no research before 1974 (the year I was born, incidentally) was IRB approved. Today we require IRB approval, so does that mean all journals should retract human subjects research from before 1974? Clearly we don't do that. The key difference is that the pre-1974 papers were adhering to the ethical standards of their day.

There are other kinds of unethical behavior that get caught up in scientific publishing. What constitutes retraction-worthy unethical behavior? What if the communicating author(s) left someone off the author list who contributed significant work that would normally be sufficient to warrant co-authorship? That's not cool, it's not ethical, but is it retraction-worthy? What if they added an author who really did nothing at all, simply to gain credibility by association, or as a bribe for some quid pro quo? Unethical, in both cases. But retraction-worthy? Probably not, partly because it can be hard for editors to effectively adjudicate who did or did not earn the right to be listed as an author. And partly because there are other, simpler remedies - adding an author, or publishing a statement removing an author.

To push this line of questioning into an even more uncomfortable realm, what if the lead or senior author on a manuscript, or a published paper, was guilty of sexual assault? Does unethical behavior that is entirely decoupled from the paper in question impact the editorial board's view of a scientific text and conclusion? Should we decline to publish, or retract, papers by individuals whose behavior the Editor finds abhorrent? Publishing their paper promotes their name and role in the scientific community, and we might not want to do that. But the technical conclusions of the paper are (in this hypothetical case) entirely unaffected by the author's behavior. If we open the door to editorial decisions based on abhorrent behavior, who gets to determine that criterion? Something that is abhorrent and unethical to one person might be celebrated by another (not sexual assault, mind you, but other forms of sexual behavior, or political or religious belief perhaps). How much evidence must be given to the journal's Editor to reach an informed decision about whether or not the abhorrent behavior took place? This particular issue ties me in knots, and I confess I don't know what I would do in such a situation; I am simply glad that it hasn't come up in my role as Editor handling new paper submissions, to my knowledge (I wouldn't be at all surprised if some current or past authors have been guilty of such offenses). What would you do?

Speaking of abhorrent, what about the racist and eugenicist research of the past? The American Naturalist has plenty of articles by Charles and Gertrude Davenport, leaders of the eugenics program at the Cold Spring Harbor laboratory. Should we retract their papers from the 1900s through 1940s? Or consider that one of the first women to publish in The American Naturalist wrote a horribly racist rant, offensive enough that when I quoted from it in a lecture (to drive home the point, forcefully and openly, that AmNat published awfully racist material), an audience member subsequently admonished me for showing it. Should I go back and retract that paper, which I suppose I have the power to do as Editor? I wouldn't, simply because I think that would be whitewashing an ugly past that we are better off acknowledging and confronting openly, rather than pretending it never existed.

We are almost done, I swear. But I'm having fun posing these questions.

The last column in my table, above, concerns downright fraud. Image manipulation. Data alteration or fabrication. Changes to data, or misreporting of results, with the intent to mislead. That last part - intent - can be hard to prove without mind-reading, so often we content ourselves with evaluating whether the data or an image are biologically plausible. If not, the results are false and the data are not trustworthy, regardless of intent. In effect, we say that (not knowing an author's intentions or actions) a paper could be either in the right-hand column (fraud) or the bottom row (falsehood), and either one could be retracted. In recent cases, decisions to retract reflected patterns in data that appear to be fraudulent, but we cannot with certainty discern whether there was bad intent, or simply large-scale and recurrent accidents in data management. The latter seems deeply unlikely, but from the standpoint of journal decisions (retract or not) the distinction is irrelevant.

So to recap this second part of my musings, I started by saying the second reason to retract is "Punishment for misdeeds." Certainly the "Fruit of the Poisoned Tree" doctrine I described represents a punishment, because the scientific conclusions may be really true and valuable yet we'd still retract - as a penalty, and as a deterrent to future scientists considering the same misbehavior to get data. As editors and a community, we would walk away from true knowledge if it was ill-gained. But I expressed discomfort at the thought of using retractions as a stick to punish some other forms of unethical behavior. And for actual fraud, we might argue that the retraction is for biological falsehood (or at least for a lack of trustworthy evidence), rather than punishment per se.

Note that I'm absolutely not saying that misbehavior should go unpunished. Sexual assault, for instance, should be prosecuted in court. Sexual harassment as well, or at the very least pursued seriously and in good faith by the university or institution where the harassment took place. What I am saying though is that punishment may not be the journal's primary role. The journal serves to communicate information, and (to the best of our abilities) to check that the information is good. We can decide to stop communicating something that we no longer trust. But our capacity for punishment is very limited. At most, I could retract papers from an author who committed systematic fraud. But I'd check each paper separately rather than assume the fraud was universal. Perhaps I could refuse to consider any further submissions from that author. But what if they learned their lesson and now pursued an honest and careful path to new science? Should we forgive and forget (and verify)? Ultimately, journals only have the ability to punish using the limited tools at their disposal - the papers they publish. It is the employer (the university, institute) that has the power to fire, or reassign duties away from research to teaching, or put on unpaid leave, etc. The employer provides money, and money has a leverage the rest of us lack.

There's another reason why I am generally wary of using retraction as a form of punishment. I noted earlier that we want to encourage people to self-correct - that helps clean up our literature faster. When retractions are systematically equated with punishment, authors who find errors will be more hesitant to self-retract. One solution would be a linguistic one: adopting two different terms for two different kinds of retraction. Honorable, self-correcting retraction, and dishonorable retraction for fraud or unethical behavior. Call them subtraction and detraction, respectively (I'm open to alternatives). But I think it is crucial that we punish misconduct, and crucial also that we reward honest self-correction. These need to be kept separate in the sociology of science.
   
So, back to my original premise: when is it appropriate to retract?
Are we removing information from the public record that we now know to be false? Are we punishing? The answer is subtle, because we don't want to conflate punishment with retraction. Because journals have the power to retract but limited power to punish, which is better done by employers (who journals can and should notify). Because we sometimes retract things that are true (fruit of the poisoned tree) but often don't retract things that are false (in good faith, or old). 

This subtlety is made even more challenging by a whole other set of questions: is the false or fraudulent information central, or peripheral, to the conclusions of the paper in question? Obviously, if the primary point of a paper turns out to be fundamentally incorrect (as in my own retraction for a coding error), retraction is a fair outcome. But what if it is a minor aside? Let's say Matt Damon describes a detailed experiment on how the proportion of feces in dirt affects plant growth on Mars. And he says in the paper that he was wearing a blue space suit when in fact it was green. That's a falsehood, but irrelevant to the biology. It should be corrected, but has no bearing on the conclusions. Or, he falsely reports that feces helped grow potatoes on Mars, then builds and obtains analytical solutions for a set of ordinary differential equations that correctly establish that nutrient addition would help. The math is true. Do you retract it? Doing so would be leveling accusations at an innocent equation by assuming guilt by association.

The last thing I want to comment on is the role of the journal, and those who volunteer time for it, in publicizing retractions. If the point of a retraction is to correct something we now know is false, we must reach out to people who previously read the paper and notify them of the retraction. Otherwise, they will continue to hold a false belief based on their original reading of the paper before retraction. Furtive retraction is no retraction at all. And if the point of a retraction is punishment (which, again, is not my core belief), again the punishment has the most impact if it is public. Either way, retractions need to be clearly conveyed with the original paper on the journal website, with corresponding changes to JSTOR, PubMed, Google Scholar, etc. Using social media to disseminate the fact of a retraction is entirely reasonable as a means to connect with readers who use that channel. The drawback of social media, of course, is the ease with which it tips into the personal rather than the professional. But any procedure that disseminates the announcement of one (or more) retractions will invite gossip, speculation, extrapolation, and personal judgements, especially when misconduct (or the strong suspicion thereof) is involved.

The other interesting side of social media dissemination of Correction / Retraction decisions is that it invites armchair editors to inveigh. I've been called a "coward" and "lazy" on social media for a decision to issue a Correction rather than a Retraction, by someone who didn't know the paper, hadn't read it, and didn't understand the negligible role that the data played or the nature of the model being reported (not parameterized with, or testing, data). If there's one thing I hope the essay above has taught, it is that retraction and correction span a diverse set of complex considerations. There are considerations of scientific fact or error. These may be central or peripheral to the paper. They may be reached by ethical or unethical means. They may be old errors or new ones, satisfactory by the standards of their day but not today. And each of these dimensions is a continuum ranging from totally-fine to absolutely-bad, with grey areas in between. Sometimes, questions of fraud or retraction are unambiguous. Editors should engage fully with these and do the right thing. But challenges to papers can also fall into grey areas, or be good by one criterion and bad by another. The Editor's challenging job (supported by the editorial board and reviewers, in consultation with authors) is to reach a final judgement. These are, in my experience, sometimes easy and sometimes hard. So, if an Editor simply ignores critiques, that's bad. But if they don't act exactly the way you would like, consider that they are weighing many themes and may have a justifiable (not necessarily cowardly or lazy) rationale for their decision.


Thanks to J. Shukla for inspiring conversation.


PS, added after: For some reason this site isn't allowing me to reply to comments. My apologies. So a couple of responses here.
1. I remain horrified that Google Scholar and Web of Science don't pair retractions with their targets, so that when a paper comes up in a search you immediately see it is retracted.

2. I haven't ever received a request to retract a paper because the author was engaged in personal misconduct (e.g., sexual harassment), so the scenario mentioned there is hypothetical, meant to be thought-provoking and to ask: what are the boundaries between misconduct that warrants retraction and misconduct that does not? The Committee on Publication Ethics standard is that the misconduct should be pertinent to the conclusions of the paper, but that has fuzzy boundaries.
