Maintained alternatives to PyPDF2

Update: PyPDF2 is maintained again - and I am the maintainer :-) I've just released a new version with several bugfixes.


Three potential alternatives which are maintained (just like PyPDF2):

  • pymupdf: Uses MuPDF (only for open-source projects due to the MuPDF license, which is AGPL unless you buy a commercial license)
  • pikepdf: Uses qpdf
  • pdfminer.six: A pure Python project.
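
To give a feel for how the three differ, here is a minimal sketch, not a definitive recipe. The function names and the `IMPORT_NAMES` table are made up for illustration. It assumes that pymupdf is imported as `fitz`, its long-standing import name, and that you have installed whichever package you call. The library imports are done lazily, so the snippet loads even if none of them is installed.

```python
from importlib.util import find_spec

# PyPI package name -> module name used in `import`
# (assumption: pymupdf is imported under its long-standing name `fitz`)
IMPORT_NAMES = {
    "pymupdf": "fitz",
    "pikepdf": "pikepdf",
    "pdfminer.six": "pdfminer",
}


def installed_alternatives():
    """Report which of the three packages can be imported here."""
    return {pkg: find_spec(mod) is not None for pkg, mod in IMPORT_NAMES.items()}


def text_with_pymupdf(path):
    """Extract the text of all pages with PyMuPDF."""
    import fitz  # imported lazily so this file loads without PyMuPDF

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def text_with_pdfminer(path):
    """Extract the text of the whole document with pdfminer.six."""
    from pdfminer.high_level import extract_text

    return extract_text(path)


def page_count_with_pikepdf(path):
    """pikepdf is about reading and rewriting PDF structure; here we only count pages."""
    import pikepdf

    with pikepdf.open(path) as pdf:
        return len(pdf.pages)


if __name__ == "__main__":
    print(installed_alternatives())
```

Note that pikepdf is aimed at manipulating PDFs (merging, splitting, repairing) rather than extracting text, which is why the sketch only counts pages with it.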

I would not use:

  • PyPDF3 (pypi): Has less activity and probably fewer features than PyPDF2.
  • PyPDF4 (pypi): Last release on PyPI in 2018

PyMuPDF is a Python binding for MuPDF – a lightweight PDF and XPS viewer. Because MuPDF supports not only PDF but also XPS, OpenXPS, CBZ, CBR, FB2, and EPUB formats, so does PyMuPDF. PyMuPDF is hosted on GitHub and is also available on PyPI.

Its performance numbers are also very promising. The PyMuPDF documentation covers three aspects of performance:

  • document parsing
  • text extraction
  • image rendering

According to those benchmarks, PyMuPDF is faster than pdfrw, PyPDF2, and pdftk.
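
If you would rather check such claims on your own documents, a small timing harness is enough. The following is a rough sketch and not the benchmark the PyMuPDF project uses; `time_call` and `compare` are hypothetical helper names. It assumes each extractor is a function that takes a file path.

```python
import time


def time_call(func, *args, repeats=3):
    """Return the best (smallest) wall-clock time of `repeats` calls, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def compare(extractors, path, repeats=3):
    """Time every extractor on the same file; return (name, seconds) pairs, fastest first."""
    timings = {
        name: time_call(func, path, repeats=repeats)
        for name, func in extractors.items()
    }
    return sorted(timings.items(), key=lambda item: item[1])
```

You would call it as `compare({"pymupdf": my_pymupdf_func, "pypdf2": my_pypdf2_func}, "some.pdf")`, where the two functions are your own wrappers around the respective library. Use several PDFs of different sizes, since results on a single file can be misleading.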
