How to create small PDF files for the Internet

There are a number of tricks for getting optimized PDFs, and many of them are implemented in the tool pdfsizeopt. With some patches (posted in the pdfsizeopt bug tracker), this tool runs on all my TeX-generated PDFs (and nearly all of the non-TeX-generated ones). I use the command line:

python ./pdfsizeopt.py --use-pngout=true --use-jbig2=true --use-multivalent=true --do-unify-fonts=false filetocompress.pdf

I use --do-unify-fonts=false even though it produces slightly larger PDFs, because of a bug that causes a few glyphs not to be displayed in certain PDF viewers (Adobe Reader on Windows, for example).
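If you have many files to compress, the invocation above can be wrapped in a small script. The following is only a sketch: the flags are copied verbatim from my command line, while the helper names (build_command, compress_all) and the default script path are made up for illustration. It assumes that "python" on your PATH is an interpreter that can run your copy of pdfsizeopt.py.

```python
import subprocess

# Flags copied from the command line above; nothing else is assumed about pdfsizeopt.
PDFSIZEOPT_FLAGS = [
    "--use-pngout=true",
    "--use-jbig2=true",
    "--use-multivalent=true",
    "--do-unify-fonts=false",  # avoids the missing-glyph bug at a small size cost
]


def build_command(pdf_path, script="./pdfsizeopt.py"):
    """Return the argument list for one pdfsizeopt run on pdf_path."""
    # "python" mirrors the original command line; adjust it if your
    # interpreter has a different name.
    return ["python", script, *PDFSIZEOPT_FLAGS, pdf_path]


def compress_all(pdf_paths, script="./pdfsizeopt.py"):
    """Run pdfsizeopt on each file in turn and stop at the first failure."""
    for pdf_path in pdf_paths:
        # check=True raises CalledProcessError if pdfsizeopt exits non-zero.
        subprocess.run(build_command(pdf_path, script), check=True)
```

Calling compress_all(["thesis.pdf", "slides.pdf"]) (hypothetical filenames) would then run the same command once per file.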

There are indeed various things you can do during document production with TeX to make sure that the compressed PDF ends up as small as possible; several of these are discussed in the EuroTeX 2009 white paper about pdfsizeopt (available at https://github.com/pts/pdfsizeopt/releases/download/docs-v1/pts_pdfsizeopt2009.psom.pdf).

As regards fonts, pdfsizeopt will recode fonts to the very compact CFF format and take care of subsetting and duplication issues. I haven't investigated deeply, but in my tests it seems that, of the two options for T1-encoded (multilingual) TeX fonts in Type 1 format, the Latin Modern fonts generally produce significantly larger PDFs than the CM-Super version (which is unfortunate, because Latin Modern is superior in just about every other way; see this question). However, I just did a quick experiment, and this difference in size seems to apply only to the PDFs before pdfsizeopt has run: after pdfsizeopt, the Latin Modern PDF is the same size as or smaller than the CM-Super one.
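If you want to repeat this kind of comparison on your own documents, a few lines of Python are enough to put the file sizes side by side. This is just a sketch; the function name size_report and the filenames in the usage note are hypothetical, and it only looks at sizes on disk, not at what is inside the PDFs.

```python
import os


def size_report(paths):
    """Return (path, bytes, percent_of_largest) tuples, smallest file first."""
    if not paths:
        return []
    sizes = {path: os.path.getsize(path) for path in paths}
    # Fall back to 1 so that a set of empty files does not divide by zero.
    largest = max(sizes.values()) or 1
    return [
        (path, size, 100.0 * size / largest)
        for path, size in sorted(sizes.items(), key=lambda item: item[1])
    ]
```

For example, size_report(["doc-lm.pdf", "doc-cmsuper.pdf"]) run once before and once after pdfsizeopt shows whether the font choice still matters after compression.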

Using fonts that don't have optical scaling will indeed produce a smaller PDF, but I don't recommend it because if you are using multiple sizes then the non-optically scaled fonts will look much worse.


If for some reason you don't want to use pdfsizeopt: both XeTeX and LuaTeX typically generate smaller PDF files than pdfTeX because OpenType fonts are already encoded in either CFF or TrueType outlines.


There is the program pdfopt, provided by Ghostscript, which converts a PDF into the official web-optimised (linearized) format. To quote man pdfopt, this "puts the elements of the file into a more linear order and adds 'hint' pointers, allowing Adobe's Acrobat(TM) products to display individual pages of the file more quickly when accessing the file through a network".

The usage is straightforward:

pdfopt [ options ] input.pdf output.pdf

Just make sure that the two PDF filenames are not the same. You might want to move output.pdf over input.pdf afterwards. This is what I do in the Makefiles for my LaTeX packages before uploading them to CTAN.
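The "write to a different file, then move it back" step can also be scripted. The sketch below assumes only the "pdfopt input.pdf output.pdf" calling convention shown above; the function name optimize_in_place and the temporary filename are made up for illustration. The output goes into a temporary directory next to the input, and the original is replaced only if the tool exits successfully.

```python
import os
import subprocess
import tempfile


def optimize_in_place(pdf_path, tool=("pdfopt",)):
    """Run `tool input output` on a temporary output, then replace pdf_path."""
    directory = os.path.dirname(os.path.abspath(pdf_path))
    # Keep the temporary output on the same filesystem as the input so that
    # os.replace is a plain rename.
    with tempfile.TemporaryDirectory(dir=directory) as tmp_dir:
        tmp_path = os.path.join(tmp_dir, "optimized.pdf")
        # Input and output names differ, as pdfopt requires.
        # check=True raises CalledProcessError on failure, which leaves
        # the original file untouched.
        subprocess.run([*tool, pdf_path, tmp_path], check=True)
        os.replace(tmp_path, pdf_path)
```

Calling optimize_in_place("mypackage.pdf") (hypothetical filename) then has the same effect as running pdfopt followed by mv. Any other tool that takes an input and an output filename can be passed via the tool parameter.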