Comments on Eco-Evo Evo-Eco: Archiving Primary Data (Or Not)

Andrew, I couldn't agree more with your points...

2016-01-02T21:30:17.199-05:00

Andrew, I couldn't agree more with your points. The concern that data will be used without full knowledge of the system seems an odd excuse, particularly given that data will outlive researchers (see our small piece, "Act to staunch loss of research data"). Moreover, given that most data generation is funded by taxpayers, the data should perhaps be seen as public patrimony (then again, government "secrets" are too, and lots of people don't have an issue with them, or so I have heard from a few researchers who are not keen on sharing data).

One small addition to the good points you made is that the availability of data also helps in training students, for at least two reasons: 1) Re-analysing existing data for its original purpose lets students re-create the original results and so practice the analyses used in the original study. This is particularly useful today, when so much code is shared; 2) Discussion groups among students can use the data to test new ideas (e.g., meta-analyses, new hypotheses, etc.). Neither benefit is exclusive to students, but both help a great deal in the early stages of developing intuition and creativity within academic endeavours.

Thanks for writing this thoughtful piece. I'm ...

2015-12-18T13:48:42.260-05:00

Thanks for writing this thoughtful piece. I'm 99% sure that Dryad screens out robots accessing the data, so those should all be real views and real downloads by real people.

I'm probably a fundamentalist when it comes to mandatory data sharing at publication, but this stems from two observations:

1) this is science, where you have to provide evidence for your conclusions. The data are without doubt part of that evidence, and withholding the dataset makes about as much sense as withholding the tables and figures.

2) far too many researchers cannot be trusted to preserve their data in the long term, and cannot be trusted to provide the data whenever it is requested (cf the Current Biology paper that Carl points to). Having the data public from the moment of publication is the only effective solution.

I agree that there are downsides for individual authors when they share their data at publication (risk of getting scooped, etc), but these are vastly outweighed by the benefits to the community from having a) access to the evidence underlying the paper, and b) the ability to re-use the data for new purposes. It's therefore up to funding agencies and journals to enforce the public good of data availability even when authors are reluctant.

My own data sharing experience is certainly much h...

2015-12-14T16:58:04.926-05:00

In my own experience, the success rate of data requests is certainly much higher than 1/4, more like 4/5, but maybe I am lucky or asking for less important data.

As for music sharing, the analogy is the use of something produced at the expense and effort of someone else without compensating that person (or persons). I don't mean financial compensation; I simply mean that forcing someone to put their data freely online means that people who wish to use that data do not have to contact the originator to discuss potential shared use of the data (or other compensation), which might involve coauthorship, for example. Note that I am not advocating for or against this perspective; I am merely pointing out that the attitude now applied to data archiving is probably a logical extension of the attitude that has arisen out of music sharing.

Andrew, very nice and balanced post. Particularly...

2015-12-13T18:33:42.566-05:00

Andrew, very nice and balanced post. Particularly interesting to see both your stats and reflections on your own Dryad downloads.

I was a bit confused about the analogy to music sharing. I think we can agree that skeptics of data sharing policies would not be mollified simply by making repositories like Dryad into subscription-based systems you had to pay to access, right? Paying for access may be central to the open access discussion for literature, but I don't see the connection when it comes to data archiving.

If you haven't seen it, you might be interested in the study of Vines et al, http://www.ncbi.nlm.nih.gov/pubmed/24361065, which (like studies in other fields) shows that data sharing "on request" succeeds only ~1/4 of the time for recent papers and deteriorates over time, though predominantly not because people are unwilling to share but rather because they cannot be reached or no longer have the data readily available. Repositories can be a benefit to data producers too.