How to convert a PDF to grayscale from the command line without rasterizing it?

A bit late in the day, but the top answer doesn't work for me with a different file. The underlying problem appears to be old code in Ghostscript; a newer version of that code exists but is not enabled by default. More on that here: http://bugs.ghostscript.com/show_bug.cgi?id=694608

The page above also gives a command that works for me:

gs \
  -sDEVICE=pdfwrite \
  -dProcessColorModel=/DeviceGray \
  -dColorConversionStrategy=/Gray \
  -dPDFUseOldCMS=false \
  -o out.pdf \
  -f in.pdf
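
One way to check the result is Ghostscript's inkcov device, which prints the C, M, Y and K ink coverage of each page. The helper below is only a rough sketch: the name inkcov_is_gray is made up for illustration, and it assumes each page is reported on a line of the form " 0.00000  0.00000  0.00000  0.02230 CMYK OK". It tells you whether any colored ink is left, not whether the page was rasterized.

```shell
# Succeeds (exit status 0) if every page line in inkcov output has zero
# C, M and Y coverage. Reads the inkcov output on stdin.
# Assumption: one line per page, with "CMYK" as the fifth field.
inkcov_is_gray() {
  LC_ALL=C awk '
    $5 == "CMYK" {
      pages++
      if ($1 + 0 != 0 || $2 + 0 != 0 || $3 + 0 != 0) colored++
    }
    # Fail if no page lines were seen or if any page had colored ink.
    END { exit !(pages > 0 && colored == 0) }
  '
}

# Usage (requires Ghostscript on the PATH):
#   gs -q -o - -sDEVICE=inkcov out.pdf | inkcov_is_gray && echo "no color left"
```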

If you crack open the file, you'll find that most of the colors are specified through an RGB ICC-based color space (search for 8 0 R to find all the references to this color space). Perhaps gs is complaining about that?

Who knows.

The takeaway is that converting a page from one color space to another without affecting the content is non-trivial. You need to be able to render the page, trap every change to the current color or color space, and substitute an equivalent in the target space. You also need to convert all image XObjects that are in the wrong color space, which requires decoding the image data and re-encoding it in the target space. The same goes for all form XObjects, which is a task similar to converting the parent page, since form XObjects (I think your doc has 4) also contain resources and a content stream of page-marking operators (which may include more XObjects).

It's certainly doable, but the process is nearly the same as rendering, plus some fairly special-purpose code.
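
To give a feel for the "substitute an equivalent in the target space" step, here is a toy sketch, not a usable converter. It assumes an already uncompressed content stream with one operator per line, and it only rewrites the DeviceRGB fill and stroke operators (rg, RG) into their DeviceGray counterparts (g, G) using the usual gray = 0.3 R + 0.59 G + 0.11 B weighting. The function name rgb_ops_to_gray is made up for illustration. It does nothing for ICC-based color spaces like the one in your file, nor for images or form XObjects, which is exactly where the real work is.

```shell
# Toy illustration only: map "r g b rg" / "r g b RG" lines to gray operators.
# Every other line of the content stream is passed through unchanged.
rgb_ops_to_gray() {
  LC_ALL=C awk '
    NF == 4 && ($4 == "rg" || $4 == "RG") {
      gray = 0.3 * $1 + 0.59 * $2 + 0.11 * $3
      # "rg" (fill color) becomes "g"; "RG" (stroke color) becomes "G".
      printf "%.3f %s\n", gray, ($4 == "rg" ? "g" : "G")
      next
    }
    { print }
  '
}

# Usage:
#   printf '1 0 0 rg\n' | rgb_ops_to_gray    # prints: 0.300 g
```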


gs \
   -sDEVICE=pdfwrite \
   -sProcessColorModel=DeviceGray \
   -sColorConversionStrategy=Gray \
   -dOverrideICC \
   -o out.pdf \
   -f page-27.pdf

This command converts your file to grayscale (using GS 9.10).


Use the most recent Ghostscript code (not yet released) and set ColorConversionStrategy=Gray.