Tool to automatically expand YAML merges?

UPDATE: 2019-03-13 12:41:05

  • This answer was modified pursuant to a comment by Anthon which correctly identified limitations with PyYAML. (See Pitfalls infra).

Context

  • YAML file
  • Python for parsing the YAML

Problem

  • User jtYamlEnthusiast wishes to output a non-DRY version of a YAML file with aliases, anchors, and merge keys.

Solution(s)

  • Alternative 1: use the ruamel.yaml library, as described by Anthon infra.
  • Alternative 2: use PyYAML to load the YAML, which expands the merges, and then dump the result (the example below uses pprint.pformat for the dump step).

Rationale

  • The ruamel.yaml library is a good choice if you are free to install another Python library besides PyYAML and you want a high degree of control over "round-trip" YAML transformations (such as the preservation of YAML comments).
  • If you do not need rigorous control over round-tripped YAML, or you are limited to PyYAML for some other reason, you can simply load and dump the YAML directly to obtain the "non-DRY" output.

Pitfalls

  • As of this writing, PyYAML has limitations relative to the ruamel.yaml library regarding YAML versions: PyYAML implements YAML 1.1, whereas ruamel.yaml also supports YAML 1.2.

  • See also

    • ruamel docs
    • pyyaml repo

Example

    ##
    import pprint
    import yaml
    ##
    myrawyaml = '''
    default: &DEFAULT
      URL: website.com
      mode: production
      site_name: Website
      some_setting: h2i8yiuhef
      some_other_setting: 3600

    development:
      <<: *DEFAULT
      URL: website.local
      mode: dev

    test:
      <<: *DEFAULT
      URL: test.website.qa
      mode: test
    '''
    ##
    pynative = yaml.safe_load(myrawyaml)    # merge keys are expanded while loading
    vout = pprint.pformat(pynative)         # Python dict repr; for this data it is also valid YAML flow style
    print(vout)                             ##=> this is non-DRY output
    print(yaml.safe_load(vout))             ##=> loads without exception, so the output is well-formed YAML
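
The pprint output above is a Python dict repr rather than block-style YAML. If block-style output is preferred, a minimal sketch of the same round trip using PyYAML's own dumper could look like the following. The names NoAliasDumper and expand_merges are illustrative only (they are not part of PyYAML), and the sort_keys argument assumes PyYAML 5.1 or later.

```python
import yaml


class NoAliasDumper(yaml.SafeDumper):
    """SafeDumper variant (hypothetical helper) that never emits anchors or aliases."""

    def ignore_aliases(self, data):
        # Returning True for every node makes shared objects get written out in full.
        return True


def expand_merges(text):
    """Load YAML (merge keys are resolved on load) and dump it back in block style."""
    data = yaml.safe_load(text)
    return yaml.dump(
        data,
        Dumper=NoAliasDumper,
        default_flow_style=False,  # block style instead of inline {...}
        sort_keys=False,           # keep the order produced by the loader (PyYAML >= 5.1)
    )


raw = '''
default: &DEFAULT
  URL: website.com
  mode: production
  site_name: Website

development:
  <<: *DEFAULT
  URL: website.local
  mode: dev
'''

expanded = expand_merges(raw)
print(expanded)
```

As with the pprint variant, comments in the input are lost, because PyYAML discards them while loading.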

If you have Python installed on your system, you can run pip install ruamel.yaml.cmd¹ and then:

yaml merge-expand input.yaml output.yaml

(replace output.yaml with - to write to stdout). This expands the merges while preserving key order and comments.

The above is actually just a few lines of code that use ruamel.yaml¹. So if you have Python (2.7 or 3.4+), install the library using pip install ruamel.yaml, and save the following as expand.py:

import sys
from ruamel.yaml import YAML

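# typ='safe' loads into plain Python objects, so merges are expanded and comments are dropped
yaml = YAML(typ='safe')
yaml.default_flow_style = False  # emit block style rather than inline {...}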
with open(sys.argv[1]) as fp:
    data = yaml.load(fp)
with open(sys.argv[2], 'w') as fp:
    yaml.dump(data, fp)

you can already do:

python expand.py input.yaml output.yaml

That will get you YAML that is semantically equivalent to what you requested (in output.yaml the keys of the mappings are sorted; in this program's output they are not).

The above assumes you don't have any tags in your YAML, nor care about preserving any comments. Most of those, and the key ordering, can be preserved by using a patched version of the standard YAML() instance. Patching is necessary because the standard YAML() instance preserves the merges on round-trip as well, which is exactly what you don't want:

import sys
from ruamel.yaml import YAML, SafeConstructor

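yaml = YAML()  # default round-trip mode: keeps comments and key order

# use the safe constructor's merge handling, so that merges are expanded on load
yaml.Constructor.flatten_mapping = SafeConstructor.flatten_mapping
yaml.default_flow_style = False
yaml.allow_duplicate_keys = True
# comment out next line if you want "normal" anchors/aliases in your output
yaml.representer.ignore_aliases = lambda x: True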

with open(sys.argv[1]) as fp:
    data = yaml.load(fp)
with open(sys.argv[2], 'w') as fp:
    yaml.dump(data, fp)

with this input:

default: &DEFAULT
  URL: website.com
  mode: production
  site_name: Website
  some_setting: h2i8yiuhef
  some_other_setting: 3600  # an hour?

development:
  <<: *DEFAULT
  URL: website.local     # local web
  mode: dev

test:
  <<: *DEFAULT
  URL: test.website.qa
  mode: test

that will give this output (note that comments on the merged-in keys get duplicated):

default:
  URL: website.com
  mode: production
  site_name: Website
  some_setting: h2i8yiuhef
  some_other_setting: 3600  # an hour?

development:
  URL: website.local     # local web
  mode: dev

  site_name: Website
  some_setting: h2i8yiuhef
  some_other_setting: 3600  # an hour?

test:
  URL: test.website.qa
  mode: test
  site_name: Website
  some_setting: h2i8yiuhef
  some_other_setting: 3600  # an hour?

The above is what the yaml merge-expand command, mentioned at the start of this answer, does.
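
To illustrate why URL and mode keep their local values in the output above while the remaining keys come from default, here is a rough sketch in plain Python of the merge-key rule applied to already-parsed data. It is not how ruamel.yaml implements it; expand_merge_keys and MERGE_KEY are hypothetical names, and the sketch makes no attempt to reproduce the key order or the comments shown above.

```python
MERGE_KEY = "<<"  # the YAML merge key, here assumed to be present as a literal dict key


def expand_merge_keys(node):
    """Recursively replace '<<' entries with the key/value pairs they refer to."""
    if isinstance(node, list):
        return [expand_merge_keys(item) for item in node]
    if not isinstance(node, dict):
        return node  # scalars are returned unchanged

    merged = {}
    sources = node.get(MERGE_KEY, [])
    if isinstance(sources, dict):
        sources = [sources]  # a single mapping is treated as a one-element list
    # With a list of mappings, earlier mappings take precedence over later ones.
    for source in sources:
        for key, value in expand_merge_keys(source).items():
            merged.setdefault(key, value)
    # Keys given explicitly in the mapping always override merged-in keys.
    for key, value in node.items():
        if key != MERGE_KEY:
            merged[key] = expand_merge_keys(value)
    return merged


default = {"URL": "website.com", "mode": "production", "site_name": "Website"}
config = {
    "default": default,
    "development": {MERGE_KEY: default, "URL": "website.local", "mode": "dev"},
}

expanded = expand_merge_keys(config)
print(expanded["development"])
```

In practice the YAML loader does this work while constructing the data, so a helper like this is only useful for understanding the precedence rule.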


¹ Disclaimer: I am the author of that package.