Reformatting a large number of XML files

This can be done from find directly using -exec:

find . -name "*.xml" -type f -exec xmllint --output '{}' --format '{}' \;

Whatever is passed to -exec is invoked once per file found, with each {} in the template replaced by the current file name. The \; at the end terminates the -exec command, not the line; the backslash just stops the shell from interpreting the semicolon itself.
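To see what -exec would run before touching any files, one option is to put echo in front of the command. The sketch below wraps that in a function; the name preview_reformat is made up for illustration, and it takes the directory to search as its only argument.

```shell
# preview_reformat DIR: print, without running, the xmllint command that
# find would execute for each *.xml file under DIR.
# (preview_reformat is a hypothetical helper name, not part of find or xmllint.)
preview_reformat() {
    find "$1" -name "*.xml" -type f -exec echo xmllint --output '{}' --format '{}' \;
}
```

Calling preview_reformat . prints one xmllint line per matching file; removing the echo gives the real command.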

xargs isn't really necessary here: xmllint has to be invoked once per file, because the input and output file names must both be given in the same call.

xargs would be needed if the command being piped to from find worked on multiple files at a time and the list was long. You can't do that in this case, as you need to pass a single filename to the --output option of xmllint. Without xargs (for example, if you expand the whole list onto one command line with a shell glob or command substitution) you could end up with an "Argument list too long" error when processing a lot of files. xargs also supports replacement strings with the -I option:

find . -name "*.xml" -type f | xargs -I'{}' xmllint --output '{}' --format '{}'

This does the same as the find -exec command above. If any of your directory or file names contain odd characters such as spaces, quotes, or newlines, you will need to use the -print0 option of find together with the -0 option of xargs. However, using xargs with -I implies -L 1 (at least in GNU xargs), which means it processes only one file at a time anyway, so you may as well use find with -exec directly.
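As a sketch of the two cases described above, the functions below show a null-delimited version of the -I pipeline and, for contrast, a batched call where xargs is actually useful. The function names are invented for illustration. The batched example assumes you only want a well-formedness check (xmllint --noout), which, unlike --output, can take many files in one call. -print0 and -0 are supported by GNU and BSD find and xargs.

```shell
# reformat_tree DIR: reformat every *.xml file under DIR in place.
# -print0 / -0 pass names separated by NUL bytes, so spaces, quotes and
# newlines in file names are safe. One xmllint process runs per file.
# (reformat_tree is a hypothetical helper name.)
reformat_tree() {
    find "$1" -name "*.xml" -type f -print0 |
        xargs -0 -I'{}' xmllint --output '{}' --format '{}'
}

# check_tree DIR: check well-formedness only, writing nothing.
# Without -I, xargs packs as many names as fit onto each xmllint command
# line, so a large tree needs only a few xmllint processes.
# (check_tree is a hypothetical helper name.)
check_tree() {
    find "$1" -name "*.xml" -type f -print0 |
        xargs -0 xmllint --noout
}
```

For example, reformat_tree . rewrites everything under the current directory, while check_tree . only reports files that fail to parse.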


I typically attack these problems with a layer of indirection. Write a shell script that does what you want, and call that. As a start, I'd suggest:

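#!/bin/sh
# Reformat each file named on the command line in place.
for file
do
   # Quote "$file" so names with spaces survive; only replace the
   # original if xmllint succeeded.
   xmllint --format "$file" > "$file.tmp" && mv "$file.tmp" "$file"
done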

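Then try it out on a file or two by hand. Once it works, you can use it as the command in the xargs call (this assumes the script is saved as xmltidy.sh, is executable, and is on your PATH):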

find . -name "*.xml" -type f | xargs -- xmltidy.sh
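Because the script loops over all of its arguments, another option is to let find batch the file names itself with -exec ... {} +. That avoids the pipe, so names containing spaces reach the script intact (the script itself must still quote "$file"). This is a sketch that assumes the script is saved as an executable ./xmltidy.sh in the current directory; tidy_tree is a made-up function name.

```shell
# tidy_tree DIR: run ./xmltidy.sh over every *.xml file under DIR.
# "{} +" makes find append as many file names as fit to each invocation,
# much as xargs would. Assumes ./xmltidy.sh exists and is executable.
# (tidy_tree is a hypothetical helper name.)
tidy_tree() {
    find "$1" -name "*.xml" -type f -exec ./xmltidy.sh {} +
}
```

Running tidy_tree . then processes the whole tree with as few invocations of the script as the argument-length limit allows.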
