Why is peer review so random?

Good question. Hard to answer. Some thoughts:

  • reviewers are not trained
  • reviewers are anonymous
  • reviewers receive little feedback on their performance
  • reviewers are also authors, competing for the same funds/prestige
  • reviewers are specialized in a narrow discipline
  • reviewers are volunteers
  • reviewers are scarce
  • the review system lacks an external (independent) control system (audit)
  • reviewers are humans, with their own personal interests, emotions, capabilities

Considering these observations, it is unrealistic to expect two review reports to be aligned. The difficult decision then falls to the associate editor, who is also a volunteer and not a specialist in the author's field.

That leaves the question of why this is accepted, when outside science it wouldn't be. Honestly, I don't know. Just some guesses:

  • Science is a powerful isolated sector with its own rules?
  • The current system works for established research groups?
  • Journals do not have the funding to train and attract qualified professionals/scientists as reviewers?
  • There is no easy solution or alternative?

Added based on a comment:

  • reviewers are busy scientists
  • reviewers are career-wise not rewarded for conducting reviews


The biggest difference is that, up to PhD thesis level, the person doing the assessing is more of an expert than the person being assessed. In almost all these cases there is an agreed set of standard skills, techniques and knowledge that any assessor can be expected to possess and any assessee is being measured against.

This is less true of a PhD thesis, but in the end, once a supervisor/thesis committee has green-lit a student, almost all PhD theses are passed.

It's definitely not true higher up. In almost all cases, the person being reviewed will be more of an expert in their work than anyone doing the reviewing. The only exceptions would be direct competitors, and they will be excluded. We are talking right at the edge of human knowledge, where different people have different knowledge and skill sets.

I'm quite surprised that the GRE scores are so consistent. It's long been known that essay marking is pretty arbitrary (see, for example, Diederich 1974[1]). Mind you, 1 mark on a 6-mark scale is roughly 15%, which is a pretty big difference. In our degree, 70 and above is a 1st class degree (the best mark there is), whereas 55 is a 2:2, a degree that won't get you an interview for most graduate jobs. Losing 15% on a grant assessment will almost certainly lose you the grant.

But even to obtain this level of consistency, the graders must have been given a pretty prescriptive grading rubric. In research, no such rubric exists; there are no pre-defined criteria against which a piece of research is measured, and any attempt to lay one down would more or less defeat the whole point of research.


With respect to the problem of good papers being rejected, a factor that doesn't seem to have been mentioned yet is that the consequences of accepting a bogus paper are much worse than those of rejecting a good paper. If a good paper is rejected, it can always be resubmitted to a different journal. And if the authors first revise it according to the reviewer comments, the version that ends up being published may well be better written than the one that was rejected. All that is lost is time.

But if a bogus paper is accepted, other scientists may see it in the literature, assume its results are valid, and build their own work upon it. This could cost them significant time, as experiments that depend on the bogus result fail to work out as they should (which may at least lead to the bogus paper being retracted if the errors are bad enough). Or they may avoid a line of research that would have worked because the bogus paper implies it wouldn't, or, worse, they may obtain inaccurate results themselves and put yet another paper with bad data into the literature. All of these outcomes are far worse than merely having to resubmit a paper, so false negatives are preferred to false positives when reviewing.
