How to automatically extract cited documents from a PDF to .bib

I've created a couple of bookmarklets to do this:

https://www.scholarcy.com/bookmarklets

They parse the PDF in the browser window and extract the citations as a .RIS or .bib file.

As such, they work with any PDF, not just one hosted on arXiv or one that has a .tex source file.
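To give an idea of what ends up in the .bib file, here is a minimal Python sketch that renders one already-parsed citation as a BibTeX entry. This is not how the bookmarklets are implemented; the function name `to_bibtex`, the citation key, and the placeholder reference are all hypothetical and only illustrate the output format.

```python
def to_bibtex(key: str, entry_type: str, fields: dict) -> str:
    """Render one parsed citation as a BibTeX entry string.

    `fields` maps BibTeX field names (author, title, ...) to plain strings.
    No escaping of special LaTeX characters is attempted in this sketch.
    """
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder reference, invented purely for illustration.
entry = to_bibtex(
    "doe2020example",
    "article",
    {"author": "Doe, Jane", "title": "An Example Title", "year": "2020"},
)
print(entry)
```

A real extractor also has to find the reference list in the PDF and split each reference into fields, which is the hard part and is not shown here.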

Alternatively, you can download the Scholarcy Chrome Extension, which also links each entry to the open-access version of the cited paper's PDF (if there is one).

Full disclosure: I developed this tool and am the founder of Scholarcy.


You can use the SAO/NASA Astrophysics Data System: http://adsabs.harvard.edu/abstract_service.html. It indexes several scientific journals, mostly in the fields of astronomy/astrophysics, including all e-prints on arXiv.

For example, go to the page of this e-print: https://arxiv.org/abs/1704.00684, click on the "NASA ADS" button in the box on the right and you'll be directed to http://adsabs.harvard.edu/cgi-bin/bib_query?arXiv:1704.00684. On this page, click on the "References in the Article" button and you'll get the list of all references cited in the paper. Click on the bibcode of one of them, and on its abstract page click on the "Bibtex entry for this abstract" link to automatically obtain the BibTeX entry for it.
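If you do this for many papers, the first step can be scripted. Below is a small Python sketch that builds the ADS query URL from an arXiv abstract URL, following the URL pattern shown in the example above. The helper names `arxiv_id_from_url` and `ads_query_url` are my own, and the sketch assumes new-style arXiv identifiers (such as 1704.00684) only.

```python
import re

# URL pattern taken from the example above; assumed to hold for other e-prints.
ADS_CLASSIC_QUERY = "http://adsabs.harvard.edu/cgi-bin/bib_query?arXiv:{arxiv_id}"


def arxiv_id_from_url(url: str) -> str:
    """Extract a new-style arXiv identifier from an abs URL, dropping any version suffix."""
    match = re.search(r"arxiv\.org/abs/(\d{4}\.\d{4,5})(v\d+)?", url)
    if match is None:
        raise ValueError(f"no new-style arXiv identifier found in {url!r}")
    return match.group(1)


def ads_query_url(arxiv_url: str) -> str:
    """Build the ADS bib_query URL for the given arXiv abstract page."""
    return ADS_CLASSIC_QUERY.format(arxiv_id=arxiv_id_from_url(arxiv_url))


print(ads_query_url("https://arxiv.org/abs/1704.00684"))
```

This only produces the link to open in a browser; the "References in the Article" and BibTeX steps are still done by hand on the ADS pages.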

At https://ui.adsabs.harvard.edu/ there is a fancier version of SAO/NASA ADS, with a more modern user interface.