Thursday, January 30, 2020

The Pruitt retraction storm Part 1: The current state

At the suggestion of a colleague, this blog post is meant to document the status of Dr. Jonathan Pruitt's publications, two of which have formally been retracted.

Which papers are retracted?
Which have been checked and confirmed to be sound (and, by whom), so we may continue to confidently cite them?
Which papers are currently being discussed but are not yet to the point of being retracted or cleared?

HERE IS THE LIST  (**** A WORK IN PROGRESS)

The key motivation here is to highlight papers that remain reliable, for instance when data were generated by students or postdocs other than Pruitt.  At present, pending the findings of a formal inquiry, it appears that Pruitt-generated data are the common theme in the known retractions and concerns. It is crucial that we not discard sound science produced by junior scientists working with Pruitt, merely by association. For this reason, I personally would discourage people from dismissing all Pruitt-co-authored papers out of hand.

I am generating this list based on email and social media communications; I welcome new information by email or otherwise and will keep this page up to date (daniel.bolnick@uconn.edu). I am also following the PubPeer website, which seems to be a spot where ongoing concerns are being discussed, but keep in mind these do not represent retractions, just conversations.

Another reason to maintain this page: many people may be simultaneously checking past data files for the kinds of flaws that led to two recent (and one pending) retractions. The pattern so far seems to be that Pruitt's co-authors are taking a great deal of time re-examining their past publications: checking for patterns in data files, redoing analyses on the existing data. This is a massive drain on their time, when what they probably need most is to focus on new research to help them recover. At the same time I am aware of some non-authors who are delving into the data files as well. This redundancy is perhaps good to a point (in at least one case a co-author did not find problems in the data that were then identified by someone more familiar with the other cases), but it is also a drain on the field's collective time. Therefore, if a paper is actively being re-examined, and IF the researchers involved wish to be named, I can include their information on this page as well as contacts. However, so that concerns do not linger indefinitely, please let me know when papers are cleared of concern, or when retractions are public.

Please note that for legal reasons the journals typically will not list in-progress retractions until they are official, as retractions are usually vetted by the publisher and lawyers before being made public.

The Google Spreadsheet documenting the current status of Pruitt papers can be found here

If you have additions or amendments or corrections please contact me at daniel.bolnick@uconn.edu

-Dan Bolnick


The Pruitt retraction storm Part 2: An Editor's narrative

This blog post documents my experience concerning retractions of some of Dr. Jonathan Pruitt's papers. I am writing this from three perspectives. First, as Editor of a journal affected by the series of recent retractions by Dr. Jonathan Pruitt and colleagues. Second, as a one-time co-author with Pruitt. Third, as a friend of Jonathan's from long discussions about science at conferences and pubs over the past decade.

Disclaimer: this post represents my personal experience with minimal opinion. It is not the opinion of The American Naturalist, nor the University of Connecticut. This is not intended to cast aspersions or attack anyone.

A companion post will provide a summary of the current state of retractions and validations of his papers [please email me updates at daniel.bolnick@uconn.edu].

A third companion post will contain reflections on what this all means for science broadly and behavioral ecology specifically.

Before diving in, I want to emphasize that parts 1 & 2 of this series are meant to be a strictly factual record of the sequence of events and communications that do not imply any judgement about guilt or innocence for Dr. Pruitt. For transparency, I should also reiterate that although I do not know Jonathan well, we have been academic friends for quite a few years.

1. A narrative of events

On November 19, 2019, a colleague (Niels Dingemanse) emailed me a specific and credible critique of the data underlying a paper by Dr. Jonathan Pruitt and colleagues that was published in The American Naturalist, for which I am Editor In Chief. The critique (from an analysis by Erik Postma) identified biologically implausible patterns in the data file used in the paper (Laskowski et al. 2016). Specifically, certain numbers appeared far more often than one would expect from any plausible probability distribution. For specifics, see the recent blog post explanation by Kate Laskowski. The complaint did not make specific accusations about how the suspect data might have come to be.

The lead author on this paper, Dr. Kate Laskowski, had received data from the last author, Jonathan Pruitt. Pruitt does not contest this point. She had analyzed and written the paper, trusting that the data were accurate.  On hearing about the odd patterns in the data, she did exactly the right thing: she examined the data files herself very carefully. She found additional odd patterns that have no obvious biological cause. She asked Jonathan about these patterns, as did I.  His initial explanation to me did not satisfy us. He said the duplicated numbers were because up to 40 spiders were measured simultaneously and often responded simultaneously, and were recorded with a single stopwatch. The methods and analyses did not reflect this pseudoreplication, so Jonathan offered to redo the analyses. The new analyses did not recover the same results as the original paper. Moreover, the duplicated numbers were in fact spread across spiders from different time points, webs, etc, so his initial rationale did not explain the data. At this point, Dr. Laskowski decided to retract the American Naturalist paper because she could not determine how the data were generated. She obtained consent from her co-authors including Dr. Pruitt, on wording that included the acknowledgement that the data were not reliable (without specifying how), and that Pruitt had provided the data to the other authors. This retraction was then run past the University of Chicago Press publisher and lawyers, then copyedited and typeset, and published online in mid-January.

Dr. Laskowski also examined two other Pruitt-provided datasets, one for a paper she also lead-authored with Pruitt in Proceedings of the Royal Society B (PRSB), and one for a paper she co-authored. The former paper is now officially retracted. Her analysis and request for retraction of this PRSB paper was concurrent with the American Naturalist one, and PRSB Editor Spencer Barrett and I were in close contact through this process. Problems with the third paper were brought to our attention on January 13, by Dr. Laskowski. The retraction of the third paper is being processed by the journal and we were asked to not publicize specifics until the journal posts the retraction statement.

My involvement in this process was to field the initial complaint, exchange a series of email queries with Jonathan seeking explanation, and to accept the retraction of the American Naturalist article without passing judgement on the cause of the problems with the data. However, once the authors requested the retraction (of both the AmNat paper and the PRSB paper), I consulted the Committee on Publication Ethics guidelines in depth. Four points emerged.

First, it is clear that when oddly flawed data lead to a retraction, the Editor is supposed to report this to the author's Academic Integrity Officer (or equivalent).  I contacted the relevant personnel at Pruitt's current and former institutions to notify them of concerns. Pruitt's current employer is best positioned to conduct an inquiry. It is not my job, nor is it even my right, to render judgement about whether data were handled carelessly to accidentally introduce errors, or whether the data were fabricated, or whether there is a real biological explanation for the repeated patterns. So I encourage community members to not engage in summary judgement and await the (likely slow) process of official inquiry.

Second, Editors of multiple affected journals are encouraged to communicate with each other (which I have done with Spencer Barrett of PRSB and other Editors elsewhere) to identify recurrent patterns that might not be clear for our own journals' smaller sample of papers.

Third, it seemed wise to investigate the data underlying other articles that Pruitt published in The American Naturalist, for which I am responsible as the journal's Editor. I asked an Associate Editor with strong analytical skills (which could be any of them) who is not tied up in behavioral ecology debates (i.e., a neutral arbiter) to examine the original concern and then to examine the data files for other papers. The AE put in an impressive effort to do so, and reported to me that at least one paper appeared to have legitimate data and results (on Pisaster sea stars), but other papers had flaws that to varying degrees resembled the problems that drove retraction. The Pisaster paper apparently involved data collected entirely by Pruitt, so make of that what you will, but we found no evidence of unrealistic patterns. Analysis and discussion concerning the other papers is ongoing; we have not yet rendered a judgement. It is the author's prerogative to request a retraction, and in a desire to approach this fairly we are giving authors time to examine their data closely and exchange concerns with Pruitt (who is in the field with limited connectivity) before reaching a final decision on retraction. The Associate Editor also examined a few files for Pruitt articles at other journals, and found some problems which we conveyed to the relevant co-authors and journal Editors.

Fourth, it seems clear at this point that the data underlying a number of Pruitt papers are not reliable. Whether the problem is data handling error, or intentional manipulation, the outcome will be both a series of retractions (the two public ones are just the beginning I fear), and mistrust of unretracted papers. This is harmful to the field, and harmful especially to the authors and co-authors on those papers. Many of them (myself included) were involved in Pruitt-authored papers on the basis of lively conversations generating ideas that he turned into exciting articles. Or, by giving feedback on ideas and papers he already had in progress (Charles Goodnight, for example, is second of two authors on a Nature paper with Jonathan, having been invited on after giving feedback on the manuscript). Or, often they were first authors who analyzed data provided by Pruitt and wrote up the results. These people have seen their CVs get shorter, and tarnished by the fact of retraction. They have experienced emotional stress, and concern for how this impacts their careers. I want to emphasize that regardless of the root cause of the data problems (error or intent), these people are victims who have been harmed by trusting data that they themselves did not generate. Having spent days sifting through these data files I can also attest to the fact that the suspect patterns are often non-obvious, so we should not be blaming these victims for failing to see something that requires significant effort to uncover by examining the data in ways that are not standard for any of us. So to be clear, the co-authors have in every instance I know of reacted admirably and honorably to a difficult and stressful situation. They should in no way be penalized for being the victims of either carelessness or fraud by another whom they had reason to trust.

As the realization dawned on me that (1) many people were going to be affected, and (2) they are victims, I felt that a proactive approach was necessary to help them. Dr. Laskowski, for example, was seeing some of her favorite articles retracted, while she is junior faculty at a top-notch institution. For some of Pruitt's more recent students, the majority of their publication list may be at risk. With this in mind, I agreed with Dr. Laskowski that public acknowledgement of the retractions was the best strategy (via Twitter and her blog). I was deeply relieved to see the intense outpouring of support, sympathy, and respect that she and her fellow victims deserve.

Fundamentally I believe that if we stigmatize retractions, we will see fewer of them and the scientific record will retain its errors longer than we'd like. When mistakes are found, transparency helps science progress and move on more quickly. I experienced this myself when I had to retract a paper because of an R code error (it was the first paper I published using R for the analyses), and received very positive support for the actions (blog about that retraction is here). So I encourage you to continue to support the affected co-authors.

Because the first retraction came out in The American Naturalist, and because of Dr. Laskowski's tweets tagging me, I inadvertently became a go-to participant in the process. I have received numerous emails every day this January about data concerns, retraction requests, and related communications. The process has often engulfed half to all of my day several days per week. Most of these I responded to as I could, or forwarded to the relevant people (Editors, Academic Integrity Officers, etc), redacting details when the initial sender requested anonymity.  Analyses and discussion of some of the emerging concerns can be found here.

The Associate Editor I mentioned above went as far back as digging into some of Pruitt's PhD work, when he was a student with Susan Riechert at the University of Tennessee Knoxville. Similar problems were identified in those data, including formulas in Excel spreadsheets where logic and biology would suggest no formula belongs. Seeking an explanation, I had the dubious role of emailing and then calling his PhD mentor, Susan Riechert, to discuss the biology of the spiders, his data collection habits, and his integrity. She was shocked, and disturbed, and surprised. That someone who knew him so well for many years could be unaware of this problem (and its extent) highlights for me how reasonable it is that the rest of us could be caught unaware.

Meanwhile, I have delved into the one dataset underlying my co-authorship with Pruitt (a PRSB paper on behavioral hypervolumes). The analytical concept remains interesting and relevant, so not all of that paper is problematic. But the analytical approach presented there was test-run on social spider behavior data (DRYAD data) that does turn out to have several apparent problems: an unexpected distribution of the data (not as overdispersed as we'd think it should be for behavior data); some runs of increasingly large numbers that do not make sense; the mean of the raw data file of 1800 individuals is basically exactly 100.0; there are many duplicate raw values; and an excess of certain last digits that data forensics suggests can be a red flag of data manipulation BUT IS NOT CONCLUSIVE. None of these problems is a smoking gun, and none is as clear as those of other articles. We have requested a response from Pruitt, who is traveling doing field work in remote locations at the moment, and are holding off on deciding to retract the paper until we see a response.
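For readers who want a sense of what two of these checks look like in practice, here is a minimal sketch in Python. It is not the analysis that I or anyone else actually ran. The function names are hypothetical, the numbers are made up for illustration, and it assumes integer-valued measurements whose final digits would be roughly uniform if recorded honestly, which is not true of every kind of measurement.

```python
from collections import Counter


def last_digit_counts(values):
    """Count how often each final digit (0-9) occurs among integer-valued measurements."""
    counts = Counter(abs(int(v)) % 10 for v in values)
    return [counts.get(d, 0) for d in range(10)]


def chi_square_uniform(counts):
    """Pearson chi-square statistic comparing observed counts to equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def duplicate_fraction(values):
    """Fraction of observations whose exact value occurs more than once in the data."""
    counts = Counter(values)
    repeated = sum(n for n in counts.values() if n > 1)
    return repeated / len(values)


# Made-up numbers for illustration only; these are NOT from any real data file.
example = [101, 97, 103, 100, 97, 112, 97, 88, 107, 97]

digit_counts = last_digit_counts(example)
stat = chi_square_uniform(digit_counts)

# Standard chi-square critical value for 9 degrees of freedom at alpha = 0.05.
CRITICAL_9DF_05 = 16.919

print("last-digit counts:", digit_counts)
print("chi-square statistic:", round(stat, 2), "exceeds critical value:", stat > CRITICAL_9DF_05)
print("fraction of duplicated values:", duplicate_fraction(example))
```

With only ten made-up values the chi-square approximation is not trustworthy; a real check needs far more observations. Even then, a large statistic only flags a file for closer inspection, since rounding habits or instrument resolution can also skew last digits.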



The last thing I want to say is that I am increasingly intrigued and troubled by the lack of first-hand witnesses who actually did the raw data collection, and the lack of raw data sheets. If anyone was an undergrad with Pruitt who can attest to how these data were collected, their perspective would be very welcome. ****(I have since been contacted by two undergraduates who did collect data for Pruitt; they confirm that data were recorded on paper, and that the experiments they were involved with were actually done).

-Dan Bolnick
January 30, 2020


Some follow up thoughts added later:
1. These investigations take a great deal of time per paper (the AmNat retraction took 2 months of back and forth and data examination and wrangling over retraction wording), and there are many papers. Be patient and do not assume every paper is unreliable, please.

2. It is true that the co-authors did not catch the flaws in the datasets. But, having been deeply involved in examining these data, I can say that the red flags that have cropped up were all revealed by the kinds of analyses (digging through original data looking for duplicated runs of numbers) that are not habitual, automatic things to do to raw data. Not having had reason to mistrust the data, what they did (proceeding to analyze the data) was quite natural.
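To illustrate why looking for duplicated runs of numbers is not something one stumbles into, here is a rough Python sketch of such a search. It is a simplified illustration, not the procedure anyone used on these files; the function name, the run length of 3, and the example column are all hypothetical.

```python
from collections import defaultdict


def repeated_runs(values, run_length=3):
    """Map each run of `run_length` consecutive values that occurs more than once
    to the list of positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(values) - run_length + 1):
        positions[tuple(values[i:i + run_length])].append(i)
    return {run: starts for run, starts in positions.items() if len(starts) > 1}


# Made-up column for illustration only; NOT from any real data file.
column = [12, 45, 7, 33, 90, 12, 45, 7, 61]
print(repeated_runs(column))
```

Here the sequence 12, 45, 7 appears twice, starting at positions 0 and 5. Note that a long stretch of one constant value will also be reported, because its overlapping windows are identical. Whether any repeat is suspicious depends on the biology and on how the measurements were recorded, so a hit is a reason to look more closely and nothing more.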


Friday, January 24, 2020

Writing Retreats


This last fall, my students organized a writing retreat at the McGill University Gault Nature Reserve on Mt. St. Hilaire. We all had such an amazingly positive and productive experience that I wanted to tell all the PIs out there about it – and all the students too – so that they can lobby their Profs for something similar.


In the hopes of creating a shared experience, we decided that everyone would, then and there, start and (ideally) finish the draft of the introduction to a new paper that they needed/wanted to write based on data they had or were collecting. To further generate a shared narrative, I started by presenting my baby – werewolf – silver bullet metaphor for writing papers: detailed here. We challenged ourselves to – within 15-20 min – each come up with a single sentence for the baby (what people care about with respect to the overall topic of your paper), a single sentence for the werewolf (something that is not well understood about, or is a problem with, that topic), and a single sentence for the silver bullet (how a study can fill that gap in understanding and therefore kill the werewolf and save the baby).

Each student then quickly presented their baby-werewolf-silver bullet sequence to the group for rapid feedback. Then it was off to the races. Each student worked on expanding their ideas into a true introduction while I circled around the room from one person to another to provide help and advice and to quickly read over what was being written. Babies came and went to be replaced with other, more adorable, babies. Werewolves were found to be not very scary – or unkillable – and so were replaced with other werewolves. Silver bullets were polished and refined. Introductions took shape.
 
Then it was time for an awesome chilli dinner and then trivia (biodiversity related) and scientific karaoke (each student randomly presented the research of another student based on 1-3 slides provided by that student). Then we told war stories from the field until late in the evening (early in the next morning). Ticks. Bears. Snakes. Cliffs. Bear attacks. Deer attacks. The next morning we continued our work, had a good walk and headed back to the real world.


We all really liked this writing retreat. I had a great time working with everyone on the formative stages of their papers. The students enjoyed hearing how each other student’s work could be interpreted in a baby-werewolf-silver bullet context. Several students noted that it was hard to get writer’s block because quickly exchanging ideas with me or the other students would immediately allow them to progress down new avenues. We all felt excited about writing and invigorated about our various writing projects.

This enthusiasm has continued with the implementation of follow-up mini-writing retreats. Now, every Friday, we reserve a room in the graduate student house at McGill and continue the process. Students sit around at a table or on couches and – simply – write. I walk around, have a seat by one or another, read their work, discuss their ideas, and just all around enjoy the process.

Perhaps it isn’t too late to teach an old professor new tricks. My normal way of writing was to have meetings with students individually to discuss things, then they would go off and write, then I would receive the paper and edit it intensively by myself at home, then I would send it back, rinse and repeat. Now we will write papers – or at least parts of them – together. I can’t wait until next Friday afternoon. But, in the meantime, I had better get back to editing this MS that a student sent me.
