How are scientific papers uniquely identified?

Research publications predate the digital age, so only a fraction of them have a digital identifier. The identifier traditionally used is the full citation, which exists in various formats to suit discipline-specific needs. A full citation is very likely to be unique.

(However, automatic data analysis may not be able to recognize that differently formatted citations refer to the same publication. That is where DOIs are strong.)
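
To illustrate why a DOI is easier for software to match than a free-form citation, here is a minimal Python sketch. It relies on the fact that DOI names are case-insensitive; the list of prefixes is an assumption about how DOIs commonly appear in reference lists and is not exhaustive, the function names are made up for this example, and the DOI shown is only a placeholder.

```python
# Assumed (non-exhaustive) ways a DOI is written in reference lists.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Reduce a DOI string to a bare, lowercase form for comparison."""
    s = raw.strip()
    lowered = s.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            s = s[len(prefix):]  # drop the resolver URL or "doi:" label
            break
    # DOI names are case-insensitive, so lowercase before comparing.
    return s.strip().lower()


def same_work(doi_a: str, doi_b: str) -> bool:
    """True if two differently written DOI strings name the same object."""
    return normalize_doi(doi_a) == normalize_doi(doi_b)


if __name__ == "__main__":
    # Placeholder DOI, used purely for illustration.
    print(same_work("doi:10.5555/ABC.123", "https://doi.org/10.5555/abc.123"))
```

Two strings that look different on the page compare equal after this small normalization, which is much harder to achieve reliably for full citations.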


I work at Crossref, where we run a registry of DOIs with additional attached metadata such as funder acknowledgements, whether a work has been retracted, etc. Our DOIs (about 100 million) are specifically citation identifiers and also persistent links. DOI is also the ISO standard for identifying research publications. 11,000 publishers use Crossref, but yes, they have not all gone back to their print archives to digitise them and assign DOIs yet; books especially are lagging behind. Crossref integrates with DataCite DOIs for data and software citations (about 10 million of them), and with ORCID iDs, which identify authors and contributors (about 5 million so far). There is also ROR.community starting up to uniquely identify research institutions. All of these are open, community-governed nonprofit organizations. ArXiv is pretty much the only publisher that doesn't (yet) use DOIs. The Centre for Open Science does for all the other 'Xivs', as does bioRxiv, etc.


How are all scientific publications uniquely identified?

Formally, they aren't: there is no system that uniquely identifies all publications. But author(s), year of publication, and title are typically sufficient to identify a publication, because authors typically publish different works under different titles, and it is unlikely that distinct authors with the same name(s) will publish in the same year under the same title. A collision becomes even less likely once the publication venue is added, and the citation becomes distinct once page numbers are added (assuming no two publication venues share the same name in the same year). So a full citation should suffice to uniquely identify a publication, but there may exist (with low probability) publications that cannot be uniquely identified this way.
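
The idea of using author(s), year, and title as a practical identifier can be sketched in Python as a normalized comparison key. This is only an illustration under stated assumptions: the function names are hypothetical, only surnames are compared, and the normalization (dropping accents, case, and punctuation) is one possible choice, not an established standard. The example citation is invented.

```python
import re
import unicodedata


def _normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse punctuation and whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def citation_key(author_surnames, year, title):
    """Build a comparison key from surnames (in order), year, and title."""
    authors = tuple(_normalize(name) for name in author_surnames)
    return (authors, int(year), _normalize(title))


if __name__ == "__main__":
    # Two renderings of the same invented citation in different styles.
    a = citation_key(["Müller", "O'Brien"], 2001, "On the Identity of Papers: A Study")
    b = citation_key(["MULLER", "O'Brien"], "2001", "On the identity of papers - a study.")
    print(a == b)
```

Such a key still collides in the rare cases described above (same names, year, and title), so venue and page numbers would need to be added to the tuple to separate them.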

(Uniquely identifying authors of a publication is more problematic.)