How can I diff two XML files?

One approach would be to first turn both XML files into Canonical XML, and compare the results using diff. For example, xmllint can be used to canonicalize XML.

$ xmllint --c14n one.xml > 1.xml
$ xmllint --c14n two.xml > 2.xml
$ diff 1.xml 2.xml

Or as a one-liner, using process substitution (available in bash, zsh, and ksh):

$ diff <(xmllint --c14n one.xml) <(xmllint --c14n two.xml)
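One drawback of the one-liner is that a parse error from xmllint is easy to miss, since diff still runs on whatever output it received. A minimal sketch of a wrapper that checks for this, assuming xmllint, diff, and mktemp are installed; the function name xml_diff is made up for illustration and is not part of either tool:

```shell
# xml_diff is a hypothetical helper name; it is not part of xmllint or diff.
# Exit status: 0 if the canonical forms match, 1 if they differ,
# 2 if a temporary file could not be created or xmllint failed.
xml_diff() {
    # Temporary files hold the canonical form of each input.
    xml_diff_tmp1=$(mktemp) || return 2
    xml_diff_tmp2=$(mktemp) || { rm -f "$xml_diff_tmp1"; return 2; }
    # Canonicalize both inputs; only run diff if both parsed successfully.
    if xmllint --c14n "$1" > "$xml_diff_tmp1" &&
       xmllint --c14n "$2" > "$xml_diff_tmp2"; then
        diff -u "$xml_diff_tmp1" "$xml_diff_tmp2"
        xml_diff_status=$?
    else
        xml_diff_status=2
    fi
    # Clean up before reporting the result.
    rm -f "$xml_diff_tmp1" "$xml_diff_tmp2"
    return "$xml_diff_status"
}
```

After defining the function, xml_diff one.xml two.xml prints a unified diff of the canonical forms. Temporary files are used instead of process substitution so that the sketch also runs in a plain POSIX shell and so that the exit status of xmllint can be checked.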

Jukka's answer did not work for me, but it did point to Canonical XML. Neither --c14n nor --c14n11 sorted the attributes, but I found that the --exc-c14n switch did. --exc-c14n is not listed in the man page, but the command-line help describes it as "W3C exclusive canonical format".

$ xmllint --exc-c14n one.xml > 1.xml
$ xmllint --exc-c14n two.xml > 2.xml
$ diff 1.xml 2.xml

$ xmllint | grep c14
    --c14n : save in W3C canonical format v1.0 (with comments)
    --c14n11 : save in W3C canonical format v1.1 (with comments)
    --exc-c14n : save in W3C exclusive canonical format (with comments)

$ rpm -qf /usr/bin/xmllint
libxml2-2.7.6-14.el6.x86_64
libxml2-2.7.6-14.el6.i686

$ cat /etc/system-release
CentOS release 6.5 (Final)

Warning: --exc-c14n strips out the XML header, whereas --c14n prepends the XML header if it is not there.


I tried to use @Jukka Matilainen's answer but had problems with whitespace (one of the files was a huge one-liner). Using --format re-indents both files the same way, which helps to skip whitespace differences.

$ xmllint --format one.xml > 1.xml
$ xmllint --format two.xml > 2.xml
$ diff 1.xml 2.xml
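The canonicalization from the other answers and --format can also be chained, so that both quoting/empty-tag differences and indentation differences are smoothed out before diffing. This is only a sketch, assuming an xmllint that accepts "-" as a file name for standard input; xml_normalize is a hypothetical name chosen here, not an xmllint feature:

```shell
# xml_normalize is a hypothetical helper name, not an xmllint feature.
xml_normalize() {
    # Step 1: write the canonical form of the file to stdout.
    # Step 2: re-indent that output, read from stdin via "-".
    xmllint --c14n "$1" | xmllint --format -
}
```

With the function defined, xml_normalize one.xml > 1.xml and xml_normalize two.xml > 2.xml produce the two files to hand to diff. Whitespace inside text content is still significant, so mixed-content documents may continue to show whitespace differences.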

Note: Use the vimdiff command for a side-by-side comparison of the XML files.
