Using jq or alternative command line tools to compare JSON files

Here is a solution using the generic function walk/1:

# Apply f to composite entities recursively, and to atoms
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys[] as $key
        ( {}; . + { ($key):  ($in[$key] | walk(f)) } ) | f
  elif type == "array" then map( walk(f) ) | f
  else f
  end;

def normalize: walk(if type == "array" then sort else . end);

# Test whether the input and argument are equivalent
# in the sense that ordering within lists is immaterial:
def equiv(x): normalize == (x | normalize);

Example:

{"a":[1,2,[3,4]]} | equiv( {"a": [[4,3], 2,1]} )

produces:

true

And wrapped up as a bash script:

#!/bin/bash

JQ=/usr/local/bin/jq
BN=$(basename $0)

function help {
  cat <<EOF

Syntax: $0 file1 file2

The two files are assumed each to contain one JSON entity.  This
script reports whether the two entities are equivalent in the sense
that their normalized values are equal, where normalization of all
component arrays is achieved by recursively sorting them, innermost first.

This script assumes that the jq of interest is $JQ if it exists and
otherwise that it is on the PATH.

EOF
  exit
}

if [ ! -x "$JQ" ] ; then JQ=jq ; fi

function die     { echo "$BN: $@" >&2 ; exit 1 ; }

if [ $# != 2 -o "$1" = -h  -o "$1" = --help ] ; then help ; exit ; fi

test -f "$1" || die "unable to find $1"
test -f "$2" || die "unable to find $2"

$JQ -r -n --argfile A "$1" --argfile B "$2" -f <(cat<<"EOF"
# Apply f to composite entities recursively, and to atoms
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys[] as $key
        ( {}; . + { ($key):  ($in[$key] | walk(f)) } ) | f
  elif type == "array" then map( walk(f) ) | f
  else f
  end;

def normalize: walk(if type == "array" then sort else . end);

# Test whether the input and argument are equivalent
# in the sense that ordering within lists is immaterial:
def equiv(x): normalize == (x | normalize);

if $A | equiv($B) then empty else "\($A) is not equivalent to \($B)" end

EOF
)

POSTSCRIPT: walk/1 is a built-in in versions of jq > 1.5, and can therefore be omitted if your jq includes it, but there is no harm in including it redundantly in a jq script.

POST-POSTSCRIPT: The builtin version of walk has recently been changed so that it no longer sorts the keys within an object. Specifically, it uses keys_unsorted. For the task at hand, the version using keys should be used.


Use jd with the -set option:

No output means no difference.

$ jd -set A.json B.json

Differences are shown as an @ path and + or -.

$ jd -set A.json C.json

@ ["People",{}]
+ "Carla"

The output diffs can also be used as patch files with the -p option.

$ jd -set -o patch A.json C.json; jd -set -p patch B.json

{"City":"Boston","People":["John","Carla","Bryan"],"State":"MA"}

https://github.com/josephburnett/jd#command-line-usage


If your shell supports process substitution (Bash-style follows, see docs):

diff <(jq --sort-keys . A.json) <(jq --sort-keys . B.json)

Objects key order will be ignored, but array order will still matter. It is possible to work-around that, if desired, by sorting array values in some other way, or making them set-like (e.g. ["foo", "bar"]{"foo": null, "bar": null}; this will also remove duplicates).

Alternatively, substitute diff for some other comparator, e.g. cmp, colordiff, or vimdiff, depending on your needs. If all you want is a yes or no answer, consider using cmp and passing --compact-output to jq to not format the output for a potential small performance increase.


Since jq's comparison already compares objects without taking into account key ordering, all that's left is to sort all lists inside the object before comparing them. Assuming your two files are named a.json and b.json, on the latest jq nightly:

jq --argfile a a.json --argfile b b.json -n '($a | (.. | arrays) |= sort) as $a | ($b | (.. | arrays) |= sort) as $b | $a == $b'

This program should return "true" or "false" depending on whether or not the objects are equal using the definition of equality you ask for.

EDIT: The (.. | arrays) |= sort construct doesn't actually work as expected on some edge cases. This GitHub issue explains why and provides some alternatives, such as:

def post_recurse(f): def r: (f | select(. != null) | r), .; r; def post_recurse: post_recurse(.[]?); (post_recurse | arrays) |= sort

Applied to the jq invocation above:

jq --argfile a a.json --argfile b b.json -n 'def post_recurse(f): def r: (f | select(. != null) | r), .; r; def post_recurse: post_recurse(.[]?); ($a | (post_recurse | arrays) |= sort) as $a | ($b | (post_recurse | arrays) |= sort) as $b | $a == $b'

Tags:

Json

Diff

Jq