PDF file renaming according to metadata?

This is very easy to achieve with exiftool.

For instance, the following command renames every file in the current directory to <title>.<extension>, where %e stands for the file's original extension:

exiftool '-filename<$title.%e' .

You can install exiftool on Ubuntu with:

sudo apt-get install libimage-exiftool-perl

Please consult the official documentation for more information:

http://www.sno.phy.queensu.ca/~phil/exiftool/filename.html


If you are comfortable with Python, you could use the script at http://blog.matt-swain.com/post/25650072381/a-lightweight-xmp-parser-for-extracting-pdf-metadata-in. I have just tested the scripts he provides (to get started, you can pip install pdfminer) and they work nicely. The result looks something like this:

[{'ModDate': "D:20050422142709+02'00'", 'CreationDate': "D:20050422142709+02'00'", 'Producer': 'Mac OS X 10.3.8 Quartz PDFContext', 'Creator': 'Word'}]

You could then use that output to rename your files.
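As a rough sketch of that renaming step, the function below turns a metadata dictionary like the one above into a file name. It assumes the dictionary may carry a 'Title' key (the sample output has none, so a fallback name is used). The function name safe_filename and its parameters are made up for illustration and are not part of the linked script.

```python
import re

def safe_filename(info, fallback, extension=".pdf"):
    """Build a file name from a metadata dict; use `fallback` if there is no title."""
    # Assumption: the title, when present, is stored under the 'Title' key.
    title = info.get("Title") or ""
    if isinstance(title, bytes):
        # Some parsers hand back raw bytes rather than text.
        title = title.decode("utf-8", errors="replace")
    # Replace characters that are awkward in file names, then collapse whitespace.
    title = re.sub(r'[\\/:*?"<>|\x00-\x1f]', " ", title)
    title = " ".join(title.split())
    return (title or fallback) + extension

sample = {'ModDate': "D:20050422142709+02'00'", 'Creator': 'Word'}
print(safe_filename(sample, "myfile"))
print(safe_filename({'Title': 'Notes: part 1/2'}, "myfile"))
```

The actual rename could then be done with os.rename(old_path, new_path), ideally after checking that the target name does not already exist.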


There is another alternative: pdftk, which you can install with sudo apt-get install pdftk. With that tool you can run a command like pdftk myfile.pdf dump_data, which prints the metadata as a series of key/value pairs:

InfoKey: Creator
InfoValue: Word
InfoKey: Producer
InfoValue: Mac OS X 10.3.8 Quartz PDFContext
InfoKey: ModDate
InfoValue: D:20050422142709+02'00'
InfoKey: CreationDate
InfoValue: D:20050422142709+02'00'
PdfID0: d7af25c8df737276d8d6b5de49d94d92
PdfID1: d7af25c8df737276d8d6b5de49d94d92
NumberOfPages: 58

Again, you could use that information in a renaming script. I feel the latter is best customized yourself, because it depends on whether you want just the title, title-author, or something else.
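A minimal sketch of such a script, assuming the dump_data text has the InfoKey/InfoValue layout shown above and that the title and author, when present, appear under the keys Title and Author. The helper names parse_dump_data and new_name are hypothetical, and the title-author pattern is just one possible choice.

```python
def parse_dump_data(text):
    """Pair each InfoKey line with the InfoValue line that follows it."""
    info = {}
    key = None
    for line in text.splitlines():
        name, _, value = line.partition(": ")
        if name == "InfoKey":
            key = value.strip()
        elif name == "InfoValue" and key is not None:
            info[key] = value.strip()
            key = None
        # Other lines (PdfID0, NumberOfPages, ...) are ignored.
    return info

def new_name(info, fallback):
    """Join title and author with '-'; fall back when neither is present."""
    # Assumption: the relevant keys are called 'Title' and 'Author'.
    parts = [info[k] for k in ("Title", "Author") if info.get(k)]
    name = "-".join(parts) if parts else fallback
    # A slash would be read as a directory separator, so replace it.
    return name.replace("/", "_") + ".pdf"

dump = """InfoKey: Creator
InfoValue: Word
InfoKey: ModDate
InfoValue: D:20050422142709+02'00'
NumberOfPages: 58"""
info = parse_dump_data(dump)
print(info)
print(new_name(info, "myfile"))
```

To feed it real data, the text could be captured with subprocess.run(["pdftk", path, "dump_data"], capture_output=True, text=True).stdout, and the file renamed with os.rename once you are happy with the generated names.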
