Should self-citations be excluded when calculating the h-index?

First and foremost, I recommend reading the related question: "Why is it said that judging a paper by citation count is a bad idea?" It may relieve some of your concern about how much the answer to this question matters.

Now, turning to the h-index: to the best of my knowledge, there is no consensus on whether self-citations should count. Stating whether self-citations are included is useful, but then it would also be useful to know quite a bit more about how the database used to compute the h-index was constructed.

My own thinking on pros and cons goes as follows:

Omitting self-citations means you get a clearer picture of whether other people are paying attention to one's work.
Self-citation is also entirely appropriate and legitimate and at moderate levels can be a good indicator of a healthy and ongoing research program.
Defining self-citation is not entirely straightforward once co-authors are taken into account. Consider the following: if A and B co-author a paper, and then B cites the paper, should that count as a self-citation for A too? It's not A who is citing, but the citation still might be "discounted."
Finally, precise values of the h-index are not very valuable in any case, since bibliometrics are not very good at evaluating scientific impact.

Given all of these things, I personally think the best thing to do is to count self-citations in the h-index and mark it clearly as such.
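To make the definitional question concrete, here is a minimal Python sketch of how the same citation data can give different h-indices depending on what counts as a self-citation. Everything in it is an assumption for illustration: the `Paper` class, the function names, the three mode labels ("none", "author", "coauthor"), and the toy data are invented here and do not reflect how any real citation database works.

```python
from dataclasses import dataclass


def h_index(citation_counts):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


@dataclass(frozen=True)
class Paper:
    # Hypothetical minimal record: a title and the set of author names.
    title: str
    authors: frozenset


def is_self_citation(citing, cited, author, mode):
    """Decide whether a citation is discounted for `author` (illustrative rules)."""
    if mode == "none":      # keep every citation
        return False
    if mode == "author":    # the researcher is an author of the citing paper
        return author in citing.authors
    if mode == "coauthor":  # any author of the cited paper is on the citing paper
        return bool(citing.authors & cited.authors)
    raise ValueError(f"unknown mode: {mode}")


def h_index_for(author, papers, citations, mode="none"):
    """`citations` is a list of (citing paper, cited paper) pairs."""
    counts = {p: 0 for p in papers if author in p.authors}
    for citing, cited in citations:
        if cited in counts and not is_self_citation(citing, cited, author, mode):
            counts[cited] += 1
    return h_index(counts.values())


# Toy data (invented): A has two papers with B and one solo paper.
p1 = Paper("P1", frozenset({"A", "B"}))
p2 = Paper("P2", frozenset({"A", "B"}))
p3 = Paper("P3", frozenset({"A"}))
# Three later papers, by A, by B, and by an unrelated author C, each cite P1-P3.
q1 = Paper("Q1", frozenset({"A"}))
q2 = Paper("Q2", frozenset({"B"}))
q3 = Paper("Q3", frozenset({"C"}))

papers = [p1, p2, p3, q1, q2, q3]
citations = [(citing, cited) for cited in (p1, p2, p3) for citing in (q1, q2, q3)]

for mode in ("none", "author", "coauthor"):
    print(mode, h_index_for("A", papers, citations, mode))
# prints: none 3, author 2, coauthor 1
```

In this toy case A's h-index is 3, 2, or 1 depending on the rule, which is why a reported figure is only meaningful if it says which rule was applied. With realistic publication records the gap is usually much smaller.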

There's no firm consensus on whether to include self-citations. (For example, the original paper by Hirsch discusses how one could correct for self-citations but doesn't include this as part of the definition of the h-index.) The reason is that it doesn't matter: the h-index is a crude tool, and if your decisions make delicate enough use of it that the outcome may change depending on whether self-citations are included, then you are using it wrong.

For example, you mention a hypothetical case of someone whose high h-index comes primarily from self-citations. In a case like this, someone on the hiring/tenure committee should ask "Gee, why does this candidate have such a high h-index when the rest of the file gives little or no evidence that their work is influential or important?" Then a few minutes of investigation will reveal the truth.

There's nothing special about self-citations here. I know a case of an eccentric researcher in mathematics who gets a lot of citations from followers of his who publish in marginal venues. The total number looks impressive, but if you look at where the citations are coming from, you find only rather weird-looking papers published in places you've never heard of. To keep from being misled by cases like this, you have to do some due diligence when you see a surprising number, and if you're doing that already then skewed h-indices from self-citation are not such a great threat. (In practice the skew is generally pretty small, too.)

The net effect is that if the hiring or tenure committee is just paying attention to numbers like the h-index, without any perspective or further investigation, then that's a major problem with their methodology. If they do notice oddities but feel compelled to give credit for a high h-index anyway, then that's an even worse problem.

In practice, different websites for computing h-indices can give substantially different values, depending on which sources they count citations from. If you care about specifying a well-defined number, then you need to state exactly how the h-index was computed (which goes far beyond just whether self-citations are included).

Self-citation is common in medicine, where a good number of papers are simply case reports and reviews. A faculty member can co-author with a large number of students and trainees and keep citing his or her own previous papers. The publication counts can be unrealistic, the citations multiply, and the h-index ends up high. I have come across authors averaging 3 papers per week, with most of those papers citing the authors' own previous work.
I agree that bibliometrics do not reflect scientific impact; they simply bedazzle those who look at volume rather than quality. How often do NIH reviewers count publications as a measure of a candidate? Hiring in academia is the same.
