How to compare XML files

Have a look at "Using XSLT to Assist Regression Testing", which describes a solution using XSLT.


For what it's worth, I have created a Java tool (written in Kotlin, actually) for efficient and configurable canonicalization of XML files.

It will always:

  • Sort nodes and attributes by name.
  • Remove namespaces (yes, this could hypothetically be a problem if two elements differ only by namespace).
  • Pretty-print the result.
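To make those three steps concrete, here is a minimal sketch in Python using only the standard library. It is not how the tool works (the tool uses XSLT); the function names `local_name`, `normalize`, and `normalize_xml` are made up for illustration. It assumes Python 3.9+ and data-style XML without mixed content, since text that follows a child element is dropped.

```python
import xml.etree.ElementTree as ET


def local_name(name: str) -> str:
    """Drop a leading '{namespace-uri}' and keep only the local name."""
    return name.rsplit("}", 1)[-1]


def normalize(elem: ET.Element) -> ET.Element:
    """Return a copy with namespaces removed and attributes/children sorted by name."""
    out = ET.Element(local_name(elem.tag))
    # Attributes are serialized in insertion order, so insert them sorted.
    # Assumption: no two attributes share a local name across namespaces.
    for key in sorted(elem.attrib, key=local_name):
        out.set(local_name(key), elem.attrib[key])
    # Whitespace-only text is treated as formatting and discarded.
    out.text = (elem.text or "").strip() or None
    children = [normalize(child) for child in elem]
    # sorted() is stable: same-named siblings keep their relative order.
    out.extend(sorted(children, key=lambda c: c.tag))
    return out


def normalize_xml(xml_text: str) -> str:
    """Normalize a document and pretty-print it."""
    root = normalize(ET.fromstring(xml_text))
    ET.indent(root)  # available since Python 3.9
    return ET.tostring(root, encoding="unicode")
```

Two inputs that differ only in namespace prefixes, attribute order, or the order of differently named siblings should come out as identical strings, which can then be fed to an ordinary text diff.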

In addition, you can tell it to:

  • Remove a given list of node names. Maybe you do not want to be told that the value of a piece of metadata, say <RequestReceivedTimestamp>, has changed.
  • Sort a given list of collections in the context of their parent. Maybe you do not care that the order of <Contact> entries in <ListOfFavourites> has changed.
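The two optional steps can be sketched the same way. Again, this is only an illustration in Python's standard library, not the tool's XSLT implementation; `remove_nodes` and `sort_collections` are hypothetical names, and sorting children by their serialized form is my own choice of sort key. It assumes Python 3.9+ and namespace-free tags.

```python
import xml.etree.ElementTree as ET


def remove_nodes(root: ET.Element, names: set[str]) -> None:
    """Delete every descendant element whose tag is listed in `names`."""
    # Snapshot the tree first so removal does not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in names:
                parent.remove(child)


def sort_collections(root: ET.Element, names: set[str]) -> None:
    """Reorder the children of every element whose tag is listed in `names`."""
    for collection in list(root.iter()):
        if collection.tag in names:
            # Assumed sort key: the child's full serialized form, so entries
            # with identical content end up in the same position.
            collection[:] = sorted(collection, key=ET.tostring)
```

With `names={"RequestReceivedTimestamp"}` for removal and `names={"ListOfFavourites"}` for sorting, two documents that differ only in the timestamp value and in the order of their <Contact> entries serialize identically. Whitespace between elements travels with the moved children, so it is easiest to run this before pretty-printing.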

It uses XSLT and does all of the above efficiently by chaining the transformations.

Limitations

It supports sorting nested lists, sorting the innermost lists before the outer ones. However, it cannot reliably sort arbitrarily deep, recursively nested lists.

If you have such needs, you can, after having run this tool, compare the sorted byte arrays of the two results. They will be equal if only list-ordering differences remain.
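A minimal sketch of that byte-level check, in Python (the function name `same_bytes_ignoring_order` is made up for illustration). Keep in mind that the check only goes one way: if reordering is the only remaining difference, the sorted bytes are equal, but equal sorted bytes do not prove the documents are equivalent.

```python
def same_bytes_ignoring_order(a: bytes, b: bytes) -> bool:
    """True if both inputs contain exactly the same bytes, in any order."""
    # Sorting discards all position information, so two outputs that differ
    # only in where their list entries appear compare as equal.
    return sorted(a) == sorted(b)
```

In practice you would read both normalized files in binary mode and pass their contents to this function.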

Where to get it

You can get it here: XMLNormalize


I had a similar problem and I eventually found: http://superuser.com/questions/79920/how-can-i-diff-two-xml-files

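That post suggests converting both files to canonical XML (C14N) and then running a diff on the results. Canonicalization normalizes things like attribute order, quoting, and empty-element syntax, but it does not reorder elements. The following should work for you if you are on Linux or Mac, or on Windows with something like Cygwin installed: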

$ xmllint --c14n FileA.xml > 1.xml
$ xmllint --c14n FileB.xml > 2.xml
$ diff 1.xml 2.xml
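If xmllint is not available, roughly the same idea can be sketched with Python's standard library (3.9+ as written, because of the `list[str]` annotation). Note the assumptions: `xml.etree.ElementTree.canonicalize` produces C14N 2.0, so its output may differ in details from `xmllint --c14n`, and `c14n_diff` is just a name I picked for this example.

```python
import difflib
import xml.etree.ElementTree as ET


def c14n_diff(xml_a: str, xml_b: str) -> list[str]:
    """Canonicalize two XML documents and return a unified diff of the results."""
    # canonicalize() sorts attributes, normalizes quoting, and expands
    # empty-element tags; it does not reorder elements.
    canon_a = ET.canonicalize(xml_a).splitlines()
    canon_b = ET.canonicalize(xml_b).splitlines()
    return list(
        difflib.unified_diff(canon_a, canon_b, "FileA.xml", "FileB.xml", lineterm="")
    )
```

An empty list means the canonical forms are identical. Since the diff is line-based, it is most readable when the inputs are already split across lines.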
