Rsync filter: copying one pattern only

TL;DR:

rsync -am --include='*.pdf' --include='*/' --exclude='*' ~/LaTeX/ ~/Output/

Rsync copies the source(s) to the destination. If you pass *.pdf as sources, the shell expands this to the list of files with the .pdf extension in the current directory. No recursive traversal happens because you didn't pass any directory as a source.
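
To see what the shell hands over, here is a small sketch in a throwaway directory (the file names are made up for illustration). It only shows the expansion step and doesn't call rsync at all:

```shell
#!/bin/sh
# Hypothetical layout: two PDFs at the top level, one in a subdirectory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/chapters"
touch "$sandbox/thesis.pdf" "$sandbox/slides.pdf" "$sandbox/chapters/intro.pdf"

cd "$sandbox" || exit 1
# The shell replaces *.pdf with the matching names before any command runs;
# "set --" stores the result as positional parameters so we can inspect it.
set -- *.pdf
expanded_count=$#
expanded_names="$*"
echo "arguments after expansion: $expanded_count ($expanded_names)"

cd / && rm -rf "$sandbox"
```

chapters/intro.pdf is not among the arguments, so a command started with *.pdf never hears about it.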

So you need to run rsync -a ~/LaTeX/ ~/Output/, but with a filter to tell rsync to copy .pdf files only. Rsync's filter rules can seem daunting when you read the manual, but you can construct many examples with just a few simple rules.

  • Inclusions and exclusions:

    • Excluding files by name or by location is easy: --exclude='*~', --exclude=/some/relative/location (relative to the source argument, e.g. this excludes ~/LaTeX/some/relative/location).
    • If you only want to match a few files or locations, include them, include every directory leading to them (for example with --include='*/'), then exclude the rest with --exclude='*'. This is because:
      • If you exclude a directory, this excludes everything below it. The excluded files won't be considered at all.
      • If you include a directory, this doesn't automatically include its contents. In rsync 2.6.7 and later, --include='directory/***' will do that.
    • For each file, the first matching rule applies (and anything never matched is included).
  • Patterns:

    • If a pattern doesn't contain a /, it applies to the file name sans directory.
    • If a pattern ends with /, it applies to directories only.
    • If a pattern starts with /, it applies to the whole path from the directory that was passed as an argument to rsync.
    • * matches any substring of a single directory component (i.e. never matches /); ** matches any path substring.
  • If a source argument ends with a /, its contents are copied (rsync -r a/ b creates b/foo for every a/foo). Otherwise the directory itself is copied (rsync -r a b creates b/a).


Thus here we need to include *.pdf, include directories containing them, and exclude everything else.

rsync -a --include='*.pdf' --include='*/' --exclude='*' ~/LaTeX/ ~/Output/

Note that this copies all directories, even the ones that contain no matching file or subdirectory containing one. This can be avoided with the --prune-empty-dirs option, -m for short (it's not a universal solution since you then can't copy an empty directory even by matching it explicitly, but that's a rare requirement).

rsync -am --include='*.pdf' --include='*/' --exclude='*' ~/LaTeX/ ~/Output/

rsync -av --include="*/" --include="*.pdf" --exclude="*" ~/LaTeX/ ~/Output/ --dry-run

The default is to include everything, so you must explicitly exclude everything after including the files you want to transfer. Remove the --dry-run to actually transfer the files.

If you start off with:

--exclude '*' --include '*.pdf'

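Then everything is excluded right off: rsync applies the first rule that matches each file, and '*' matches every name, so the later --include is never reached.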

If you try:

--include '*.pdf' --exclude '*' 

Then only PDF files in the top-level folder will be transferred. Rsync won't descend into any directories, since those are excluded by '*'.


If you use a pattern like *.pdf, the shell “expands” that pattern, i.e. it replaces the pattern with all matches in the current directory. The command you are running (in this case rsync) is unaware of the fact that you tried to use a pattern.

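When you are using zsh, there is an easy solution, though: the ** pattern can be used to match folders recursively. Try this (-n makes it a dry run; note that the matched files all land directly in ~/Output/, without their subdirectories):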

rsync -avn ~/LaTeX/**/*.pdf ~/Output/