How to install PDFBox on Windows so it works with pax?

Unless C:\PDFBox contains an unpacked PDFBox (the class files rather than the .jar archive), the CLASSPATH is wrong. The entry must name the .jar file, not the directory that contains it: C:\PDFBox\PDFBox-0.7.3.jar.

Neither C:\PDFBox\ nor C:\MiKTeX\scripts\pax\ needs to be added to the system Path variable.

Spaces in the argument of java's -cp option should not be a problem, because the Perl script uses the array (list) form of the function system. Still, this can be tested by calling java directly:

java -cp "C:\MiKTeX\scripts\pax\pax.jar;C:\PDFBox\PDFBox-0.7.3.jar" pax.PDFAnnotExtractor FileWithBookmarks.pdf

Remarks:

  • In Linux/Unix the path separator : is used instead of ;.
  • Project pax does not support newer versions of PDFBox. The supported versions are 0.7.2 and 0.7.3.
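
For the first remark, the test command above might look as follows on Linux/Unix. This is only a sketch: both jar locations are assumed for illustration and must be adjusted. The command is printed for inspection and run only if both jars are actually found.

```shell
#!/bin/sh
# Hypothetical install locations (assumptions); adjust to your system.
PAX_JAR="/usr/share/texmf/scripts/pax/pax.jar"
PDFBOX_JAR="/opt/PDFBox/PDFBox-0.7.3.jar"

# On Linux/Unix the classpath entries are joined with ':' instead of ';'.
CP="$PAX_JAR:$PDFBOX_JAR"

# Print the command so it can be inspected first.
echo "java -cp \"$CP\" pax.PDFAnnotExtractor FileWithBookmarks.pdf"

# Run it only when both jar files are present.
if [ -f "$PAX_JAR" ] && [ -f "$PDFBOX_JAR" ]; then
    java -cp "$CP" pax.PDFAnnotExtractor FileWithBookmarks.pdf
fi
```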

In addition to Heiko’s answer and just for convenience (Windows only):

Create a file pax.bat (or pax.cmd, or whatever name you prefer instead of pax) in the bin subfolder of your local texmf tree. Under MiKTeX you may first need to create such a tree; see "Create a local texmf tree in MiKTeX".

Now the preferred variant: executing the Perl script (this requires an installed Perl distribution).

Edit pax.bat and adjust the paths to your setup:

@echo off
SETLOCAL

set CLASSPATH=C:\PDFBox\lib\PDFBox-0.7.3.jar;%CLASSPATH%

perl C:\MiKTeX\scripts\pax\pdfannotextractor.pl %*

You can even omit the set CLASSPATH line if you create the directory <localtexmf>\scripts\pax\lib, put PDFBox-0.7.3.jar in it, and refresh the file name database (FNDB).

Then, at the Command Prompt, you can call pax FileWithBookmarks.pdf or pax --debug FileWithBookmarks.pdf > paxdebug.log. This assumes that there is no other pax.exe or similar on the system path; otherwise, always make the call with pax.bat ...

Executing java directly is a bit more complicated:

Again, edit pax.bat and adjust the paths to your setup:

@echo off
SETLOCAL

set CLASSPATH=C:\PDFBox\lib\PDFBox-0.7.3.jar;C:\MiKTeX\scripts\pax\pax.jar;%CLASSPATH%

java pax.PDFAnnotExtractor %*

Note that pax.jar was added to the classpath. I prefer setting the environment variable CLASSPATH, but the command-line option -classpath, or -cp for short, works as well, as shown by Heiko.
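
For comparison only, the CLASSPATH approach could be sketched for a Linux/Unix shell roughly like this. The jar locations and the function name pax_extract are assumptions made for illustration; the name was chosen to avoid clashing with an existing pax command on the system path.

```shell
#!/bin/sh
# Hypothetical jar locations (assumptions); adjust to your system.
PDFBOX_JAR="/opt/PDFBox/lib/PDFBox-0.7.3.jar"
PAX_JAR="/usr/share/texmf/scripts/pax/pax.jar"

# Prepend both jars, joined with ':' instead of ';'.
# ${CLASSPATH:+...} appends the old value only if it is set and non-empty,
# so no empty entry is left at the end.
CLASSPATH="$PDFBOX_JAR:$PAX_JAR${CLASSPATH:+:$CLASSPATH}"
export CLASSPATH

# Pass all arguments through unchanged, like %* in the batch file.
# Hypothetical function name, not part of pax.
pax_extract() {
    java pax.PDFAnnotExtractor "$@"
}
```

After sourcing this file, a call such as pax_extract FileWithBookmarks.pdf would correspond to the pax.bat call shown above.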