Now, it seems, someone has taken on the task of analyzing a database of self-citations that includes more than 100,000 scientists. They calculated a number of indices of impact for authors with and without their self-citations. Here was the chance to figure out my true self-citation impact in a large pool of scientists in related fields.
Of course, a number of caveats should be kept in mind. First, the data are from Scopus, which is much less complete than Google Scholar, so the reported citations are far lower for everyone. For instance, my current citations are 20,101 on Google Scholar (h-index = 77), 13,567 on Web of Science (h-index = 63), and 13,250 on Scopus (h-index = 63). Yet, as long as the missing citations are not biased toward or away from self-citations, the Scopus data are perhaps still a reliable indicator of self-citation impact. Second, the authors calculated a number of indices of impact, some of which seem to be completely nonsensical. So I used only total citations and h-indexes.
Carl T. Bergstrom (@CT_Bergstrom), August 22, 2019:
"Has anyone noticed what a train wreck their composite indicator is?
It sums logs of a) # citations to solo authors papers + b) # cites to solo or 1st auth papers + c) # cites to solo or 1st or last + d) # cites to all papers.
But a ⊆ b ⊆ c ⊆ d. Why would you do this?"
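To make the nesting complaint concrete, here is a minimal sketch of a score built the way the tweet describes it: a sum of logs of four nested citation counts. This follows only the tweet's summary, not the published formula. The function names, the use of `log1p` (to avoid taking the log of zero), and the citation numbers are all assumptions made up for illustration.

```python
import math

def nested_counts(solo, first, last, other):
    """Build the four nested totals a, b, c, d from disjoint citation pools.

    solo  = citations to solo-author papers
    first = citations to first-author (non-solo) papers
    last  = citations to last-author papers
    other = citations to all remaining papers
    """
    a = solo
    b = a + first
    c = b + last
    d = c + other
    return a, b, c, d

def composite(solo, first, last, other):
    # Assumption: log1p instead of log so that zero counts do not fail.
    return sum(math.log1p(x) for x in nested_counts(solo, first, last, other))

# Two hypothetical authors with the same 1000 total citations.
mostly_solo = composite(solo=900, first=50, last=25, other=25)
mostly_coauthor = composite(solo=25, first=25, last=50, other=900)

print(mostly_solo > mostly_coauthor)
```

Because a is contained in b, c, and d, citations to solo papers enter the sum four times while citations to middle-author papers enter once, so the two hypothetical authors get different scores from the same total.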
To calculate my self-citation impact relative to everyone else, I first filtered the database to include only the categories "ecology" and "evolutionary biology", yielding 2126 people. Then I plotted h-index in 2017 including self-citations versus h-index in 2017 excluding self-citations. (The first time I posted this, I had the axes reversed; the current version is corrected.) On this plot, I marked the two authors of this blog and also Steve Cooke, who has written a spirited defense of self-citation.
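For readers who want to see how the two plotted quantities relate, here is a minimal sketch of an h-index computed with and without self-citations. The database reports these values directly, so this is only an illustration; the `papers` list and its numbers are hypothetical.

```python
def h_index(citation_counts):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical author: (total citations, self-citations) for each paper.
papers = [(10, 2), (8, 5), (5, 1), (4, 1), (3, 0)]

h_with = h_index([total for total, self_cites in papers])
h_without = h_index([total - self_cites for total, self_cites in papers])

print(h_with, h_without)
```

Each author in the plot is one such pair of values, and the gap between the two is that author's self-citation effect on the h-index.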
A second point is that Steve Cooke is pretty close to the best at it. In fact, he has the 12th highest h-index in the entire database of 2126 ecologists and evolutionary biologists. Again, self-citation is not necessarily a bad thing. Check out Steve's paper, "Self-citation by researchers: narcissism or an inevitable outcome of a cohesive and sustained research program?"
A final point is that, really, self-citation doesn't matter much. In fact, a linear regression through these points (h-index with versus without self-citations) yields an r-squared of 96.7%. In short, the variation among researchers is vastly greater than the effect of self-citation within researchers. Everyone can chill out.
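For anyone who wants to run the same check on their own subset of the data, here is a minimal sketch of the r-squared for a simple linear regression between two h-index columns. The numbers are invented for illustration and are not taken from the database.

```python
def r_squared(x, y):
    """r-squared of a simple linear regression of y on x.

    For one predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical h-indexes for six authors, with and without self-citations.
h_with_self = [20, 35, 41, 52, 63, 77]
h_without_self = [18, 33, 36, 50, 58, 74]

print(round(r_squared(h_with_self, h_without_self), 3))
```

A value near 1 means that knowing an author's h-index with self-citations tells you almost everything about their h-index without them.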
But, more importantly, are these counts and indices and ranks even useful? Much has been written on this topic, much of which I agree with. I had my own take in the post "Should I be Proud of my H Index?"