Saturday, January 8, 2022

The things we wished we had known

 Becoming a new professor is exciting. You are at last the captain of your own research. You pick who you want to work with, what you want to work on. You come into the lab in the morning, and there are grad students and postdocs and undergrads all plugging away on things that excite you. It's a real thrill. 
But it is also an intensely stressful time. Speaking for myself, in my first year or two there were definitely days and weeks when I felt so overwhelmed I just wasn't sure I'd manage. It's not something one often says openly, but yes, there were times I felt depressed, and there were times I cried. It's rough, and tiring, and stressful.


To me, there's no question that the benefits exceeded the costs, even in that first year or two. And as time went on and I grew in confidence, the benefits only grew and the costs mostly declined. Today, I can honestly say that I love my job, even though yes there are absolutely still times when I just need to walk away and take a break. And times when I feel I can't take a break, but need one. But this led me to reflect on what might have been different, to make those first years more pleasant and manageable. I think one element just isn't fixable: with experience comes confidence, and efficiency. You learn how to do things, to trust yourself, and to make decisions faster with that greater confidence. That saves time and reduces stress. But, there were many many other things that I think could have been far easier had I just had a bit more warning or training in advance. 


Today's blog post is devoted to those things that I wished I had known. A few months ago I posed this question on twitter, so the following reflects not just my own experience, but also the responses from the many other people who replied at the time. What follows is a non-exhaustive list of some of the things we wished we had known, to make the transition easier. These are offered not to scare or deter, but with the view that being forewarned is the surest key to being prepared, and thus being able to manage the new job. Also, keep in mind while reading this that, as much as there's weird BS we need to put up with in academia, other career paths for highly educated folk can be at least as aggravating (talk to your friends who are physicians about paperwork, or your lawyer friends about their workload and email inbox and meetings). The list is presented as bullet points taken from the twitter responses, anonymized and sometimes (but not always) rephrased, some with additional commentary from me. I also drew on responses to a related thread.


General

* Be really really really nice to the office staff, they are your most important allies in the whole place.

* Your university hired you because they are impressed with you and like what you do, and how much you've done. Just keep doing that and you are on the right path. They don't hire people who they expect to fail, they hire people who are already on a trajectory to meet tenure expectations. You've got this.

* You wanted this job. Remind yourself of that each time you feel stressed out.

* Admit to mistakes. When you mess up, own it, apologize, and fix it. You'll gain more respect for dealing with mistakes head on, than you'll lose from making them.

* Following on the previous point: When your students make mistakes, don't punish, just teach them the previous philosophy: admit, fix, and move on.

* Update your CV continuously as you do new things, so you don't forget.

* Luck plays a big role in success.

* Be prepared for setbacks - experiments fail, a freezer dies, your course lecture goes off the rails, a grant is rejected or paper declined with unnecessarily nasty reviewer comments. Lousy things happen. That's true in any career. Take a deep breath, move on. I often set aside reviews, unread, until I'm psychologically  over the initial disappointment and can read the comments more dispassionately.

* Learn new things

* There will always be someone who is more successful than you at something you'd like to succeed at. It's normal and okay.

* That impostor's syndrome you keep feeling? Remind yourself that (a) pretty much everyone who isn't a pathological narcissist feels it, and (b) at least you aren't some famous person who got that way by fabricating data.

* "How alone you are"  - a common sentiment. You start academia surrounded by lab mates, and mentors, then wind up at the tip top of the pyramid with surprisingly little time to hobnob with peers and mentors. Speaking for myself, I had a faculty mentor when I was an Assistant Prof, but we never met to discuss how I was doing, not once.

* The university does not "care" about you. This is an employer-employee relationship that is purely transactional and somewhat exploitative. One respondent said, if it has to be exploitative, at least make it mutually exploitative.

* You really really CAN say "NO" to things.

* REALLY understanding that tenure is about *demonstrating* your value to the University in terms of revenue (grants), quantifiable prestige (papers), and sustainable business plan (getting support for students). Spend 90% of your time on this

* An old policy I learned from camping: Leave the place better than you found it when you arrived.



Financial

* Most of your research is done by people, not stuff, so budget accordingly. No point generating $100,000 in sequence data if nobody is around to analyze it and write (you won't have time).

* Apply for the award / grant / job even if it seems like a stretch. Do the same for your trainees - nominate them for things.

* I need better tools for keeping track of one's actual budget (initial budget and then expenditures, which quickly start to deviate from initial plans). I've been at three universities since starting grad school and none had clear tools for me to track and project expenses.

* Budgeting. "OMG budgeting" (PS - if someone wants to write a guest post on the brilliant solution they've found for multi-grant budgeting and monitoring expenses, let me know!). (Note added after: @tera_levin responded that spendlab.org is amazing. "Easy to project forward to when grants run out, when to move trainees to different grants, and even internalize your burn rate and plan future grant needs")

* The difficulty in spending money the way you planned & budgeted to. Writing extensive justifications for why I bought 10,000 empty tea bags (to put fish in, of course), because someone in the accounting office figured this was a misuse of funds. 

* Travel expenses take forever to get reimbursed. Personally I had nearly $13,000 in expenses I paid out of pocket and had to get reimbursed, and it took nearly 9 months, partly because a hobgoblin in accounting figured I should have rented an economy sedan (for field work with 5 people for 2 weeks) rather than an SUV or pickup.

* I have to pay for hand soap and paper towels? Unexpected expenses are everywhere

* It is psychologically challenging to get used to spending huge amounts of $ (more than you have personally) on lab equipment, supplies, salaries. Buying something that costs $100,000 feels really really stressful.

* The university overheads can be an obscenely large slice of your budget.

* Get a copy of your postdoctoral lab's ordering info so you don't waste time figuring out what to order.

* Lab supply vendors have discounts for new PIs. Use them.

* Don't hoard your research funds forever. Data and people are better than unspent funds.

* Apply for new grants before your current one(s) are done.

* Start-up matters, but using it wisely matters more

* Publication fees (especially Open Access) add up fast, plan accordingly.

Red tape

* How to navigate Institutional Animal Care and Use approvals at any new institution. Who's the vet, what are the forms & other rules? I have had to re-learn this at each institution, each of which has different expectations and forms.

* When I started at University of ____, there were a ton of biosafety and lab safety classes to take, but no exhaustive simple list aimed at faculty in my department, so one had to learn by either taking lots of time to locate the required classes, or by being reprimanded for not having completed something

* Each institution has its own unique expectations for chemical inventories.

* How hard it can be to purchase. For my favorite example involving two shipping pallets of pipettes, see this tweet.

* How do I get furniture for my office?

* Ordering atypical equipment: I budgeted for a field vehicle in my start-up. Four years later the university still can't figure out whether I'm allowed to purchase a vehicle, and start-up is almost expired (despite extension).

* What's up with people referring to every class with some arcane number instead of a subject name. "You've been asked to teach BIO 3286b 3/4... can you do it next semester?"  Heck, thirteen years into my first faculty job I still didn't know the number for my own course.

* The lag between asking for permission to do a project (grant funding, IACUC permission, collecting permits), and actually being allowed to start - can take months to years.

* The nightmare of shipping biological material, particularly between countries.

* So much of the university administration is geared towards not getting sued / passing audits, rather than actually enabling teaching or research.

* "Nothing could have prepared me for cost share agreements, constantly changing fringe/mileage/tuition. Nothing"

* A general response was, how long it takes to do anything (ordering, hiring, new training).

* WTF is up with people's obsession with parliamentary procedure and Robert's Rules of Order?

* Everything I buy must be tax exempt, but every supplier has a different university phone number or in-store code they want you to provide to prove tax exemption. No employees of any of these places seem to know the number. Would be nice to give new faculty a spreadsheet of them.

* "Also, at some point I adapted the strategy of “don’t ask for permission, ask for forgiveness” just to get shit done and not be held up by stupid rules that are either useless or not communicated."

* The annual progress reports. At my previous institution we needed to generate three different annual reports each year (each for a different admin level, each on a different form).


Time management

* Leave time for yourself to write, do lab work, get exercise.

* Your spouse and kids need you more than your students do. 

* Following on the prior, this is a job, not your entire lifestyle. Treat it accordingly.

* Service activities count very little towards tenure. Do some. But as little as you can manage, and make them the fun ones.

* Don't do every seminar invitation and conference that you are invited to. It'll quickly grow to be too much. There's an optimal ratio of doing science versus telling others about it, and too much of the latter can be tempting, but start to undermine you.

* Triage - not everything is equally important.

* Block off time in your calendar for you to work uninterrupted on a paper, grant, class, etc.

* The sheer number of meetings: faculty meetings, department and university service committees, dissertation meetings, meetings with collaborators, meetings with students, meetings with undergrads.

* The number of emails per day

* The ratchet of expectations, which keep getting added on as you get accustomed to meeting the previous round of expectations.

* Often, scheduling a half hour meeting once a week with a trainee is just fine. Adjust depending on the balance of your needs, and those of individuals. It is okay to meet with some more often than others depending on their career stage (e.g., writing dissertation) or psychological needs.

* Budget time to answer emails, and ignore them at other times.

* Review manuscripts for journals. Get them in on time, do a thorough job. Do enough of these to pay back for the reviews you get (e.g., 2-3 reviews per paper you submit).


Hiring

* Don't let your lab get too big too fast, before you know how to supervise.

* The length of time and paperwork to initiate a hire, then to conduct the search, then to actually hire.

* Good trainees are key. When interviewing, watch out for red flags, especially whether they work well with others, whether they can take risks and finish projects, and whether they can write. And most of all, are they honest? The very top classroom students as undergrads may get paralyzed by uncertainty and risk in the lab, so don't focus just on their grades.

* No assholes in the lab

* A bigger team is not always better. But, also know that the work you devote to mentoring may not increase linearly with group size. As your group grows, there's more lateral mentoring that happens among your trainees, as they learn from and support each other.


Mentoring

* Train your people in good data management and code annotation, it is as crucial as good experimental design.

* Assemble a list of readings for your lab members to have a shared baseline of core literature familiarity

* Be kind to the people who work with you - buy their lunch on occasion, you make more than they do.

* The big jump from going from interacting with peers, to interacting with people I supervise

* Supervising people is really hard and we don't get trained on it. What to do if someone is suffering serious mental illness that is impacting their ability to progress on a degree, or do their job, or endangering research animals? What to do if people are in conflict within the lab? So many scenarios, so little advance preparation. "no one trains PIs to actually manage people"

* "I didn't expect to be a therapist for my students. The lines between mentorship/advising vs therapy are blurry when students desperately need someone to talk to". Someone else wrote "I'm totally not equipped to help the ones who need long term psychiatric care /meds", and I'll say this has come up multiple times in my own lab and after 17 years as a prof I still am struggling with helping. I'm not a psychiatrist.

* You are asked to articulate a mentoring style, but the fact is every student is unique in their needs and personalities, so you need to create a unique mentoring style for each student to be effective.

* Your students are your colleagues and collaborators, and will be for life if you treat and train them well. They are not your slaves, nor are they your kids.

* Set a culture in the lab where no one feels like they work FOR you but instead they feel like they work WITH you. It may sound subtle but it is so important!

* Clearly communicate expectations with your lab members about their training goals, behavior, etc. A lab culture document is a good idea.

* If you are to be a senior co-author on a paper, you are vouching for the work. Can you confirm it was done, the person knew what they were doing, and that the data is true, and the analyses done as described? Look at the data, look at the code, and be

* Don't be the weak link in the chain. If a student sends me a manuscript, it'll take me a day of work to get them comments on it. That day could be this week, or weeks from now, but takes the same amount of time whether I do it now or wait. If I wait, I'm slowing down my student's progress. Therefore, better to do it now.

* Have individual meetings and group meetings. In group meetings, don't just talk about research, also set aside time regularly to talk about ethics, data management, publishing, academic culture, job tracks, etc.

* Work on Individual Development Plans with your trainees


Research

* For research, have a diversified portfolio: bread-and-butter projects that are guaranteed to yield basic publications, and some high-risk high-potential-reward projects.

* Pick at least one topic that you are going to be the go-to-person on, in your department or more broadly. This could be a skill set or subject area knowledge where people need help. This defines your intellectual niche, but also makes you indispensable to your colleagues who need your expertise to help their own group forward. Then they've got to tenure you out of self-interest.

* Arrive with a box of data to analyze and publish, so you can still be publishing data (e.g., from your postdoc) while you start new projects. The new lab projects can take years to hit the journals.

* Develop a library of lab protocols (your own, or others') using tools like protocols.io

* Archive your data (DataDryad.org for instance)

* The importance of continuing to do some bench work and/or field work (it's why they hired you...).

* "I truly did not understand how little of the actual science in the field and lab I'd be doing. It was hard to let that go, to delegate and empower others while I did science admin & other stuff I wasn't trained to do"

* Share your toys

* Back up all lab data in redundant places, cloud and hard copy, and make sure your trainees do too. Make sure your students give you access to their data so you can check their data and their code, and finish projects if they leave things incomplete (some will!).

* If you do field work, make sure some people in the field have good first aid training. Bad things happen.


Colleagues

* You are faculty now. You really can speak in faculty meeting.

* Find collaborators who are good people first, good scientists second. Trust is key, so work with people who you enjoy and can rely on.

* Politics, often arising from jealousy over differences in output, grant resources, teaching loads, etc

* I was unaware of departmental politics for years as an assistant and associate professor. Then I landed an administrative position where I had an effect on other faculty's experiences, and the knives appeared.

* Doesn't anyone know how to properly use (or avoid) reply-all during email exchanges?

* Invite departmental seminar speakers. Choose some peers to generate collaborative networks, but importantly choose some senior people who will write tenure letters for you.

* Go to conferences and present to get your name out there. Be sure your students do too - to get their own names, and yours, out where people know what you are doing.

* Consciously build a network of mentors and peers, within your field (not at your university), and within your department.

* Try collaborating with someone in another department/discipline on something. It's eye-opening and leads in exciting directions.

* Call the program officer to discuss grant proposal plans, and to discuss grant reviews you received. Really, they don't mind. Get to know them at conferences.




Teaching

* Pick a topic to teach that you'd like to learn more about, it forces you to take the time to learn. I did this for stats with R, then Bayesian stats, then Network Stats.

* "Having trained mostly at med schools, was hilariously not ready for amount of time that goes into prepping/teaching an undergrad class. Ultimately a rewarding experience and easier after the first yr, but a surprisingly huge responsibility on top of setting up the lab!" - To this lovely quote, I'll add that this isn't unique to med school training backgrounds. I usually budget 1-2 days (that's FULL days) per 1.5 hour lecture to prepare. At a minimum. When I've said this on twitter I've received some pushback and abuse from people who didn't believe me, but who later admitted they reached the same conclusion.


Miscellaneous

* Mid-career moves between universities are slow, and super-hard. Moving freezers is a nightmare. Fish get lost by fed-ex. You have to redo all your training (IACUC, safety, chemical, biological, how-not-to-sexually-harass, don't-be-racist)... all of it is unique to each school and yet so repetitive. Universities don't give you credit for having done the same thing elsewhere. And then there's the shock of having to rebudget every grant you transfer - budget, budget justification, and more, takes a ton of time.

* How do I get a website set up? Can anyone advise me? Anyone?  This, and so many other things, seem expected but not explained.

* I've been at my institution for three years and I'm still not on the right listserv lists that I should be

* "Cockroaches short-circuiting equipment"

* A good coffee machine is key

* Get copies of people's grant proposals, fellowship applications, award applications, to have examples for you or your lab members.


Last, but not least:

* There are many paths to success, pick the one that feels most authentic to yourself.

* Anyone's advice is suspect. Just because it worked for them at their school doesn't mean it will work for you at yours. Conversely, others' horror stories may not apply to you.


For other advice on starting a lab see:

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007448


https://hhmi.org/science-education/programs/resources/making-right-moves…

https://totalinternalreflectionblog.com/2018/09/18/setting-up-shop-a-short-guide-to-starting-a-research-group/

http://users.fmrib.ox.ac.uk/~behrens/Startingalab.htm…

Tuesday, January 4, 2022

What happens to those Dryad repositories?

A decade ago, The American Naturalist and a few other journals instituted Data Archiving policies, coinciding with the start of DataDryad. The rationale for this move was nicely articulated in a paper by Mike Whitlock and the other Editors of participating journals. In this blog post I want to present a little data analysis of what has happened to my own data archives since they were posted.

First, some rationales. The shift to data archiving has some now-familiar benefits:

1) It allows readers to check authors' claims against the original data. This has led to a number of high-profile retractions arising from either discoveries of flaws in the data that undermine the credibility of the claimed results, or errors in data analyses that fundamentally alter results. Journals that did not institute data archiving rules earlier, now find themselves struggling to evaluate potential flaws in previous data files.

2) If authors upload their data  archives before submission (include your code too please!), they can provide a private key to the journal when they submit, so that the data remains embargoed until publication but reviewers and editors can check the data. I have recently seen cases where reviewers missed flaws that were not obvious from the written manuscript, but emerged when the data files were checked. This practice also allows journals time to check the completeness and integrity of the data repositories, to ensure they have everything they should (a practice that The American Naturalist recently instituted, with help from DataSeerAI).

3) Archiving your data protects you from losing your own hard-won data. It's a back-up for you. 

4) Archived data provide students and colleagues with an opportunity to practice data analysis and learn how to do analyses the way you have done them. It is a teaching tool. Badly archived data inhibits this, because readers don't know what variables are which. So, follow good-practice guidelines to build usable and complete and clearly documented data archives.

5) Archived data can be used in subsequent meta-analyses. I have an idea for a way to analyze phenotypic selection data that is different from Lande and Arnold's, for instance. I can't just draw on people's published estimates of linear and quadratic selection gradients for this; I need the raw data. So to do a meta-analysis, I need to go beyond what people put in their papers, to get raw data, and this requires usable archives. Note, around 2015 I emailed a hundred authors (most papers from the 90's and 2000's) asking for the data under their stabilizing selection estimates; I got four responses. I never finished the study (P.S., reach out to me if you're interested, I've not had time to follow up on this).

6) The last and probably least important benefit is that you contribute a citable published resource. Some people include their Dryad repositories as a separate entry in their CV, as it represents a product of your work, in its own right. Few people actually do this: see twitter poll results:


It is this last point, arguably the least important, that I want to explore in depth today. I decided on the spur of the moment this morning, as a form of procrastination, to go ahead and add my data repositories to my CV: Authors, Year, Title, URL. It took me about an hour. As I did this, I began to notice a few points of interest I wanted to convey.

First, some basic stats: My first data repository was in 2011 (in The American Naturalist!) and I've found 39 archived datasets with my name. I've published more papers than this in the past 10 years. I confess I was inconsistent with archiving in the early 2010s, depending on the journal, doing it when required. Some archives aren't on Dryad because they are archived with the publishing journal instead (e.g., my 2020 papers in Ecography and Ecology both used the publisher's website for posting the archived data as supplements). And some of my students/postdocs may have built archives that don't have my name on them. And then there are theory or review papers that don't merit archives.

The next thing that intrigued me as I began looking at these is that my data archives were getting views, and downloads. Frankly this surprised me a bit. Not in a bad way, I just figured archives were sitting there in a dark virtual room, lonely. But people are actually looking at them! The average archive has been viewed 148 times (sd = 129), with the leading papers being an analysis of lake-stream stickleback parallel evolution (647 views, Kaeuffer et al Evolution), a meta-analysis of assortative mating (464 views, Jiang et al American Naturalist), and yeast epistasis (369 views, Kuzmin 2019 Science). The Jiang et al one in particular didn't surprise me because it is a bit provocative and I know it stimulated some rebuttals and re-analyses. Here is a histogram of repository views:

An important caveat: Some unknown fraction of views and downloads may be from bots with unknown motivations.

Dryad also tracks downloads. On average my repositories have been downloaded 47 times (sd = 77), and this time it is the Jiang et al AmNat paper that leads with 402 downloads, followed by Kuzmin 2019 Science at 287. Only one paper wasn't downloaded at all, and that is the one paper still subject to an embargo (Paccard et al) because it uses an assemblage of data from many collaborators, some of whom had yet to publish their own analyses. All told, my repositories have had a total of 5791 views and 1822 downloads. To be clear, I'm not trying to pat my own back here; my point is that data repositories get (to my view) a shockingly large amount of attention. I had no idea.


I'm not alone in being surprised that repositories are being downloaded and viewed. A quick twitter poll suggests most people who responded thought they would get few, if any, downloads. I bet you all will be surprised.



Now it'll come as no surprise that there's a time effect here. Older repositories have had more time to accumulate views. And repositories that are viewed more are also downloaded more.




The last thing I wanted to check is whether a paper's citation rate is correlated with the data downloads. The answer is clearly yes. I used a generalized linear model with Poisson errors (and R's default log link) to test whether the number of downloads (approximately Poisson distributed) depends on the year the repository was made public, the number of views of the repo, and the number of citations to the paper. All three terms were significant, but the largest effect by far is that well-cited papers are the most likely to have their repositories downloaded:

Coefficients:

                  Estimate     Std. Error   z value   Pr(>|z|)
(Intercept)      -1.034e+02    2.248e+01    -4.598    4.27e-06 ***
Year              5.268e-02    1.114e-02     4.730    2.25e-06 ***
PaperCitations    7.422e-03    3.438e-04    21.588    < 2e-16  ***
Views             1.968e-03    2.377e-04     8.278    < 2e-16  ***
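To get a feel for what those estimates mean: a Poisson model with a log link works on the scale of log expected downloads, so exponentiating a coefficient gives the multiplicative change in expected downloads for a one-unit increase in that predictor, with the others held fixed. The sketch below simply plugs in the estimates printed above. The variable names (coefs, rate_ratio, per_100) are mine, for illustration only, and it assumes the model used R's default log link.

```r
# Estimates copied from the coefficient table above (log scale)
coefs <- c(Year = 5.268e-02, PaperCitations = 7.422e-03, Views = 1.968e-03)

# Multiplicative change in expected downloads per one-unit increase
rate_ratio <- exp(coefs)
print(round(rate_ratio, 4))

# One citation or one view is a tiny step, so also rescale to 100 units
per_100 <- exp(100 * coefs[c("PaperCitations", "Views")])
print(round(per_100, 2))
```

By this arithmetic, 100 additional paper citations go with roughly a doubling (about 2.1x) of expected downloads, compared with about 1.2x for 100 additional views.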

Focusing on the paper citation effect (in a log-log plot): 


Note, the one red point there is a paper by Pruitt in Proc Roy Soc with suspect data; the co-authors (myself included) asked to be removed from it as co-authors. I include it here out of morbid curiosity.

Why do more-cited papers get more data downloads? My guess, and it is just a guess, is that there's a mix of motivations led by a desire to try recreating results, a desire to learn how to do analyses, and simple curiosity. These downloads might also be class exercises in action. For example, I assigned my spring 2021 graduate class (on graphical analysis of data) the task of finding a paper they liked, with a figure they thought could be improved upon, then getting the data repository and building a new and improved version of that figure. Another option that April Wright suggested via twitter is that this is all driven by bots. But I struggle to see how bots would generate such a strong paper citation effect, as opposed to a year effect.

The last thing I want to note here is that the original proponents of data repositories argued that these are citable sources, that could accrue their own citations and their own impact factor. After seeing how much my repositories are downloaded, I do think that it is worth tracking total data repo downloads, at a minimum, though as far as I can tell there is no automated way to do this at present. But, citations to repositories are basically useless as far as I can tell. The repositories posted in the last 0-1 years have zero citations, apparently because it takes a while for the published article's citation to the data file to link to Dryad. But, for repositories published from 2011-2018, every single one had exactly 1 citation, and that was from the paper that reported the data. 

The upshots:

1. Repositories are widely viewed, often downloaded, but never cited (other than by the paper that reports the data). We need a way to track this impact (and to exclude bot false positives).

2. Your data repository is not a throw-away waste of time to be done in a sloppy manner. Prepare it carefully and clearly with complete data and clear README.txt file documentation. People are likely to view it, and likely to download it. If you go in assuming that people will actually look, you will feel compelled to make a better quality and more usable repository. And that's good for everyone, even if it takes a bit more time and care.


Update: According to Daniella Lowenberg of DataDryad, "DataDryad standardizes views and downloads against a code of practice written by @makedatacount & @ProjectCounter to ensure we eliminate bots, crawlers, double clicks, etc!"



For transparency, the data and code are provided here (not sure how to put a .csv as an attachment in this blog page, so sorry here's a table):

Year  Views  Downloads  Citations  LeadAuthor   PaperCitations
2011  188    20         1          Agashe       105
2011  647    103        1          Kaeuffer     206
2012  106    24         1          Snowberg     38
2013  100    119        1          Hendry       29
2013  464    402        1          Jiang        326
2014  355    92         1          Bolnick      142
2014  146    22         1          BolnickOtto  75
2014  110    26         1          Stutz        28
2015  132    25         1          Jiang        14
2015  147    57         1          Oke          61
2015  139    45         1          Schmerer     33
2015  58     4          1          Stutz        23
2016  199    38         1          Bolnick      14
2016  143    12         1          Bolnick      21
2016  158    41         1          Ingram       6
2016  121    16         1          Jiang        6
2016  175    56         1          Lohman       104
2016  68     7          1          Lohman       7
2016  266    77         1          PRUITT       16
2016  219    38         1          Weber        29
2016  222    63         1          Weber        46
2017  102    16         1          Brock        13
2017  147    37         1          Lohman       10
2017  111    16         1          Stutz        23
2017  136    77         1          Thompson     26
2017  81     12         1          Veen         12
2018  105    24         1          Bolnick      6
2019  43     6          1          Edelaar      17
2019  369    287        1          Kuzmin       162
2019  0      0          0          Paccard      16
2019  160    20         1          Rennison     17
2020  163    5          0          Bolnick      23
2020  31     8          0          Haerer       3
2020  41     4          0          Maciejewski  7
2020  54     6          0          Smocovitis   0
2021  8      1          0          DeLisle      5
2021  44     11         0          Haines       1
2021  24     4          0          Peng         1
2021  9      1          1          Vrtilek      5
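Since the script below expects a file called DryadInfo.csv, here is one hedged way to get from a pasted table to that file. This is a minimal sketch, assuming the rows have been typed or pasted with whitespace between the six columns. It uses only the first three rows of the table as an illustration, and the names table_text and csv_path are mine, not part of the original analysis.

```r
# Illustrative subset: the first three rows of the table, whitespace-separated.
# Paste the remaining rows in the same layout to rebuild the full dataset.
table_text <- "
Year Views Downloads Citations LeadAuthor PaperCitations
2011 188 20 1 Agashe 105
2011 647 103 1 Kaeuffer 206
2012 106 24 1 Snowberg 38
"

# read.table() splits on whitespace by default and skips blank lines;
# header = TRUE takes the column names from the first non-blank row
dat <- read.table(text = table_text, header = TRUE, stringsAsFactors = FALSE)

# Write a comma-separated copy. A temporary directory is used here so the
# sketch leaves nothing behind; point csv_path at your working directory instead.
csv_path <- file.path(tempdir(), "DryadInfo.csv")
write.csv(dat, csv_path, row.names = FALSE)

# Read it back the same way the analysis script does, as a round-trip check
dat_check <- read.csv(csv_path)
str(dat_check)
```

Once all 39 rows are in place, sum(dat$Views) and sum(dat$Downloads) should reproduce the totals quoted in the post (5791 and 1822), which is a handy check that nothing was mis-pasted.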


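# Blog on Dryad

# Expects DryadInfo.csv with columns:
# Year, Views, Downloads, Citations, LeadAuthor, PaperCitations
dat <- read.csv("DryadInfo.csv")

# Build a "t = ..., P = ..." plot label from the slope of a fitted lm,
# so each annotation is computed from the model it sits next to
slope_label <- function(model) {
  slope <- summary(model)$coefficients[2, ]
  paste0("t = ", signif(slope[["t value"]], 3), ", P = ", signif(slope[["Pr(>|t|)"]], 2))
}

# Distribution of views per repository
par(mar = c(5,5,1,1))
hist(dat$Views, col = rgb(0.2, 0, 0.3, 0.5), breaks = 14, xlab = "Views", main = "", cex.lab = 1.4)
mean(dat$Views)
sd(dat$Views)

# Distribution of downloads per repository
mean(dat$Downloads)
sd(dat$Downloads)
hist(dat$Downloads+0.001, col = rgb(0.2, 0, 0.3, 0.5), breaks = 24, xlab = "Downloads", main = "", cex.lab = 1.4)

# Are repositories that are viewed more also downloaded more?
{
  plot(Downloads ~ Views, dat, pch = 16)
  model <- lm(Downloads ~ Views, dat)
  abline(model)
  print(summary(model))  # print() is needed: only the last value inside { } is auto-printed
  text(150, 350, slope_label(model))
}

# Poisson GLM (default log link): downloads as a function of year, citations and views
model <- glm(Downloads ~ Year + PaperCitations + Views, dat, family = "poisson")
summary(model)

# Time effect on downloads and views, on a log scale
{
  par(mfrow = c(1,2))
  # Zeros become NA so that log() does not return -Inf.
  # Note that this modifies dat for everything below.
  dat$Downloads[dat$Downloads == 0] <- NA
  dat$Views[dat$Views == 0] <- NA
  dat$PaperCitations[dat$PaperCitations == 0] <- NA

  plot(log(Downloads) ~ Year, dat, pch = 16)
  model <- lm(log(Downloads) ~ Year, dat)
  abline(model)
  print(summary(model))
  text(2013, 0.5, slope_label(model))

  plot(log(Views) ~ Year, dat, pch = 16)
  model <- lm(log(Views) ~ Year, dat)
  abline(model)
  print(summary(model))
  text(2013, 2.5, slope_label(model))
}

# Downloads versus paper citations (log-log), with the Pruitt paper as an open red point
{
  par(mfrow = c(1,1))
  colortouse <- as.numeric(dat$LeadAuthor == "PRUITT")+1            # 1 = black, 2 = red
  PCHtouse <- abs(as.numeric(dat$LeadAuthor == "PRUITT")-1)*15+1    # 16 = filled, 1 = open
  plot(log(Downloads) ~ log(PaperCitations), dat, pch = PCHtouse, col = colortouse, cex = 2)
  model <- lm(log(Downloads) ~ log(PaperCitations), dat)
  abline(model)
  print(summary(model))
  text(1.5, 5.5, slope_label(model))
}

# Totals across all repositories
sum(dat$Views, na.rm = TRUE)
sum(dat$Downloads, na.rm = TRUE)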













