Split pages in pdf

Just an addition since I had issues with the python script (and several other solutions): for me mutool worked great. It's a simple and small addition shipped with the elegant mupdf reader. So you can try:

mutool poster -y 2 input.pdf output.pdf

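With -y 2, each page is cut into two pieces stacked vertically (a top half and a bottom half). To cut each page into a left and a right half instead, which is what a 2-up scan usually needs, replace y with x. And you can, of course, combine the two for more complex layouts.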
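
To see what combining the two options amounts to, here is a small Python sketch that computes the tile rectangles for an x-by-y split of one page. It only illustrates the geometry: the name poster_tiles is made up for this example, the page is assumed to have its lower-left corner at the origin, and the tile order (top row first, left to right) is an assumption, not something taken from the mutool documentation.

```python
def poster_tiles(width, height, nx=1, ny=1):
    """Return (left, bottom, right, top) boxes for an nx-by-ny split.

    Hypothetical helper, not part of mutool. Assumes the page's lower-left
    corner is at (0, 0); tiles are listed top row first, left to right.
    """
    tile_w = width / nx
    tile_h = height / ny
    tiles = []
    for row in range(ny):                # row 0 is the top row
        top = height - row * tile_h
        for col in range(nx):
            left = col * tile_w
            tiles.append((left, top - tile_h, left + tile_w, top))
    return tiles


# A4 landscape in PostScript points (842 x 595), cut into left and right halves.
for box in poster_tiles(842, 595, nx=2):
    print(box)
```

Calling poster_tiles(842, 595, nx=2, ny=2) would give four quarter-page boxes, which corresponds to passing both -x 2 and -y 2.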

Really happy to have found this (after years of daily mupdf usage :)


Installing mupdf and mutool from source

(mutool comes shipped with mupdf starting from version 1.4: http://www.mupdf.com/news)

wget http://www.mupdf.com/downloads/mupdf-1.8-source.tar.gz
tar -xvf mupdf-1.8-source.tar.gz
cd mupdf-1.8-source
sudo make prefix=/usr/local install

Or go to the downloads page to find a newer version.

Installing mutool from a Linux distribution package

On Debian, the package containing mutool is mupdf-tools:

apt-get install mupdf-tools
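
If you want to call mutool from a script, it can help to check that the install actually worked first. The following is a minimal sketch, assuming mutool ends up on your PATH; build_poster_command and run_poster are hypothetical helper names, and only the -x and -y options of mutool poster are used.

```python
import shutil
import subprocess


def build_poster_command(src, dst, x=1, y=1):
    """Build the argument list for 'mutool poster' (hypothetical helper).

    A factor of 1 means "do not split in this direction", so the
    corresponding option is simply left out.
    """
    cmd = ["mutool", "poster"]
    if x > 1:
        cmd += ["-x", str(x)]
    if y > 1:
        cmd += ["-y", str(y)]
    return cmd + [src, dst]


def run_poster(src, dst, x=1, y=1):
    """Run mutool if it is installed; raise a clear error otherwise."""
    if shutil.which("mutool") is None:
        raise RuntimeError("mutool not found on PATH; install mupdf-tools first")
    subprocess.run(build_poster_command(src, dst, x, y), check=True)


print(" ".join(build_poster_command("input.pdf", "output.pdf", y=2)))
```

The last line only prints the command; call run_poster("input.pdf", "output.pdf", y=2) to actually execute it.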

Here's a small Python script using the old PyPdf library that does the job neatly. Save it in a script called un2up (or whatever you like), make it executable (chmod +x un2up), and run it as a filter (un2up <2up.pdf >1up.pdf).

#!/usr/bin/env python
# Split each 2-up page into its left and right halves (Python 2, pyPdf).
import copy, sys
from pyPdf import PdfFileWriter, PdfFileReader
reader = PdfFileReader(sys.stdin)
writer = PdfFileWriter()
for i in range(reader.getNumPages()):
    p = reader.getPage(i)
    q = copy.copy(p)                    # second copy of the page object
    (w, h) = p.mediaBox.upperRight      # assumes the lower-left corner is (0, 0)
    p.mediaBox.upperRight = (w/2, h)    # p keeps the left half
    q.mediaBox.upperLeft = (w/2, h)     # q keeps the right half
    writer.addPage(p)
    writer.addPage(q)
writer.write(sys.stdout)

Ignore any deprecation warnings; only the PyPdf maintainers need be concerned with those.

If the input is oriented in an unusual way, you may need to use different coordinates when truncating the pages. See "Why my code not correctly split every page in a scanned pdf?"
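
One way to reason about "different coordinates" is sketched below in plain Python, independent of any PDF library. It computes the two half boxes from a media box given as (left, bottom, right, top), uses the true midpoint instead of assuming the lower-left corner is at the origin, and takes the page's /Rotate value into account. The name split_media_box is hypothetical, and the mapping from rotation to "which half appears on the left" relies on /Rotate being measured clockwise; check the result on a sample page before trusting it.

```python
def split_media_box(box, rotate=0):
    """Split a media box into the halves displayed on the left and right.

    box is (left, bottom, right, top) in unrotated page coordinates;
    rotate is the page's /Rotate value (clockwise degrees, multiple of 90).
    Hypothetical helper: returns (left_half, right_half) as displayed.
    """
    left, bottom, right, top = box
    mid_x = (left + right) / 2
    mid_y = (bottom + top) / 2
    low_x, high_x = (left, bottom, mid_x, top), (mid_x, bottom, right, top)
    low_y, high_y = (left, bottom, right, mid_y), (left, mid_y, right, top)
    rotate %= 360
    if rotate == 0:
        return low_x, high_x
    if rotate == 90:       # unrotated bottom edge is displayed on the left
        return low_y, high_y
    if rotate == 180:      # page is upside down, so the halves swap
        return high_x, low_x
    if rotate == 270:      # unrotated top edge is displayed on the left
        return high_y, low_y
    raise ValueError("rotate must be a multiple of 90")
```

For an unrotated landscape page this gives the same halves as the script above; for a portrait page stored with /Rotate 90, it splits along the vertical coordinate instead.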


Just in case it's useful, here's my earlier answer which uses a combination of two tools plus some manual intervention:

  • Pdfjam (at least version 2.0), based on the pdfpages LaTeX package, to crop the pages;
  • Pdftk, to put the left and right halves back together.

Both tools are needed because as far as I can tell pdfpages isn't able to apply two different transformations to the same page in one stream. In the call to pdftk, replace 42 by the number of pages in the input document (2up.pdf).

pdfjam -o odd.pdf --trim '0cm 0cm 14.85cm 0cm' --scale 1.141 2up.pdf
pdfjam -o even.pdf --trim '14.85cm 0cm 0cm 0cm' --scale 1.141 2up.pdf
pdftk O=odd.pdf E=even.pdf cat $(i=1; while [ $i -le 42 ]; do echo O$i E$i; i=$(($i+1)); done) output all.pdf

In case you don't have pdfjam 2.0, it's enough to have a PDFLaTeX installation with the pdfpages package (on Ubuntu: you need texlive-latex-recommended and perhaps texlive-fonts-recommended), and use the following driver file driver.tex:

\batchmode
\documentclass{minimal}
\usepackage{pdfpages}
\begin{document}
\includepdfmerge[trim=0cm 0cm 14.85cm 0cm,scale=1.141]{2up.pdf,-}
\includepdfmerge[trim=14.85cm 0cm 0cm 0cm,scale=1.141]{2up.pdf,-}
\end{document}

Then run the following commands, replacing 42 by the number of pages in the input file (which must be called 2up.pdf):

pdflatex driver
pdftk driver.pdf cat $(i=1; pages=42; while [ $i -le $pages ]; do echo $i $(($pages+$i)); i=$(($i+1)); done) output 1up.pdf
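
If the shell loops are hard to read, the page lists they hand to pdftk can be spelled out in Python. This is only a sketch of the two orderings; handle_order and offset_order are hypothetical names, and the page count still has to be supplied by hand.

```python
def handle_order(pages, left="O", right="E"):
    """Page references for two input handles: O1 E1 O2 E2 ..."""
    return [f"{h}{i}" for i in range(1, pages + 1) for h in (left, right)]


def offset_order(pages):
    """Page numbers for one file holding all left halves, then all right halves."""
    return [n for i in range(1, pages + 1) for n in (i, pages + i)]


print(" ".join(handle_order(3)))              # O1 E1 O2 E2 O3 E3
print(" ".join(map(str, offset_order(3))))    # 1 4 2 5 3 6
```

The first list matches the pdftk call that uses odd.pdf and even.pdf; the second matches the call on driver.pdf, where page i is followed by page i plus the page count.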

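ImageMagick can do it in one step. Note that convert rasterizes the pages, so the text is no longer selectable and the quality depends on the -density setting:

convert in.pdf -crop 50%x0 +repage out.pdf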