Why are most scientific articles locked behind a paywall?

Once upon a time, before the internet existed, the only way to distribute scientific content to a worldwide audience was through print. Printed publications carry obvious costs: paper, ink, printing, distribution, and so on. Commercial publishing houses were established to take care of this task, as well as the editing, the typesetting, and the organization of the review process. Mind that the reviewers are typically not paid, but they still have to be found and contacted, and all of that cost money in the pre-internet era.

Many of these commercial scientific journals gained a certain reputation over time, and it became attractive for scientists to try to publish in the most highly regarded journals. For the publishers it was (and is) attractive to maintain the journal's reputation (expressed, for example, in its impact factor), in order to attract an abundance of high-quality manuscripts, select the best ones, and keep a large audience.

This was (and to a large extent still is) the status quo when the internet arose. This is also basically the answer to your question.

Now, with the internet, it is perfectly possible to reach a large worldwide audience without the costs of printed journals (e.g. arXiv). Peer review could also be organized in an alternative manner. However, the commercial publishers have a strong interest in keeping the old business model alive, as it is the source of their revenue.

So why are things largely as they were 50 years ago? There are a couple of reasons for this. For one, there is no platform that fully replaces the circle of scientific journals, including a reliable peer review process (or an accepted alternative). Furthermore, people tend to keep doing what they are used to doing, and senior researchers (the ones who make the decisions) are used to publishing in the traditional venues, and teach the juniors to do the same. Lastly, many researchers are (at least partially) evaluated with respect to the reputation of the venues they publish in, which keeps the old and established journals alive.


The answer by Danny Ruijters already says a lot, and mine will overlap with his or hers, but at least one crucial element has been left out. For simplicity I will mainly consider the classical, still dominant mode of publication: by subscription.


[Added in edit] Short summary:

  1. publishers charge for access because that is how they make money,
  2. they are able to do that because authors sign copyright agreements giving them exclusive rights to do so, see e.g. Danny Ruijters' answer, which explains why authors need to publish in specific journals,
  3. these specific, prestigious journals won't leave the publishers that charge such fees because of inertia, which is in great part explained by the fact that publishers own the titles.

From the print era to the Internet era, scientific publishers shifted from businesses whose main job was to distribute scientific works (which entailed composing, printing, shipping) to businesses whose main job is to prevent those who do not subscribe from reading the articles they processed. When I say "main job", I don't say "only job", and I don't necessarily say it's what they spend the most time on. I mean that it is the central part of their business model, in the sense that every other part could be done more or less poorly, but that part must be done right if they want to obtain any money.

Certainly, a job done warrants a payment, and publishers do many jobs we need (and many we don't, but that is not the point here). But as has been mentioned, payment could be organized differently, directly by the government for example. After all, we need road builders to be paid, and yet we don't put tolls on every road, do we?

So, the need for payment cannot be the whole answer to the question. Inertia, as mentioned by Danny Ruijters, is a large component of the answer. But one element that explains the level of this inertia is the following:

Most publishers own the copyright on the articles they processed, and possess the titles of the journals they run.

The first item means that a given article cannot be distributed by anyone else without explicit agreement from the publisher, and the second item means that even the editorial board of a journal cannot decide to move the journal to another publisher for any reason.

The few defections that occurred involved founding a new journal and hoping to make clear that the editorial board carried its editorial policy and prestige over to the new journal. A recent example is Glossa, which was founded by the former editorial board of Lingua after they resigned when Elsevier refused to run Lingua according to the principles of "fair OA".

There are (and have been) many attempts at moving things toward open access, in various guises and business models, and the story is not over. But inertia is tremendous.


Why are most scientific articles locked behind a paywall?

Scientific articles can be found behind a paywall, but they are not locked there.

First, a lot of preprints, technical reports, and extended versions are already available. Instead of only searching via Google, you can check the author's page (often containing preprints and code), and a lot of archives and similar services: arXiv, bioRxiv, HAL, ResearchGate, Academia.edu and many others. Older papers are also quite often made freely available.

Second: most authors will gladly share a preprint or their own version of the paper. Just ask them (by email, for instance).

Third: in dire situations, there exist options that are less open and less legal. Check for instance: How does LibGen/SciHub affect researchers' research and publishing process?

Last: if one is not 100% online, one can go to a library and photocopy papers from printed journals.