Parsing YAML in Python: Detect duplicate keys

The yamllint command-line tool does what you want:

sudo pip install yamllint

Specifically, its key-duplicates rule detects repeated keys in a mapping, where a later value would silently overwrite an earlier one:

$ yamllint test.yaml
test.yaml
  1:1       warning  missing document start "---"  (document-start)
  10:5      error    duplication of key "subkey5" in mapping  (key-duplicates)

(It has many other rules that you can enable/disable or tweak.)
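If you would rather run the check from inside Python than from the shell, yamllint can also be imported as a library. The following is a minimal sketch, assuming yamllint is installed in the same environment; the sample document and the one-rule configuration string are made up for illustration.

```python
from yamllint import linter
from yamllint.config import YamlLintConfig

# Assumption: a config that enables only key-duplicates, so no other
# rule (document-start, line-length, ...) reports anything.
config = YamlLintConfig("rules: {key-duplicates: enable}")

# Hypothetical document: "key1" appears twice in the same mapping.
sample = "key1: a\nkey2: b\nkey1: c\n"

# linter.run yields one problem object per finding.
problems = list(linter.run(sample, config))
for p in problems:
    print(f"{p.line}:{p.column} {p.level} {p.desc} ({p.rule})")
```

Each problem carries the line, column, level, description and rule name, which are the same fields the command-line output shows.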


Overriding one of the built-in loaders is a more lightweight approach:

 import yaml

 # Special loader that rejects mappings containing duplicate keys
 class UniqueKeyLoader(yaml.SafeLoader):
     def construct_mapping(self, node, deep=False):
         seen_keys = []
         for key_node, value_node in node.value:
             key = self.construct_object(key_node, deep=deep)
             # raise explicitly: an assert would be skipped under "python -O"
             if key in seen_keys:
                 raise ValueError(f"duplicate key: {key!r}")
             seen_keys.append(key)
         return super().construct_mapping(node, deep)

Then pass it as the loader:

 with open(filename, 'r') as f:
     yaml_text = f.read()
 data = yaml.load(yaml_text, Loader=UniqueKeyLoader)
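The same idea can be extended to report where the duplicate occurs, since every node PyYAML hands to the loader carries a start_mark. This is only a sketch of one way to do it: LocatedUniqueKeyLoader and find_duplicate are hypothetical names, and the input strings are made-up examples.

```python
import yaml
from yaml.constructor import ConstructorError


class LocatedUniqueKeyLoader(yaml.SafeLoader):
    """SafeLoader variant that rejects duplicate keys and says where."""

    def construct_mapping(self, node, deep=False):
        seen = []  # a list, so unhashable keys are left for PyYAML to reject
        for key_node, _value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            if key in seen:
                # Marks record the position of the mapping and of the key.
                raise ConstructorError(
                    "while constructing a mapping", node.start_mark,
                    f"found duplicate key {key!r}", key_node.start_mark)
            seen.append(key)
        return super().construct_mapping(node, deep)


def find_duplicate(text):
    """Return the error message for the first duplicate key, or None."""
    try:
        yaml.load(text, Loader=LocatedUniqueKeyLoader)
    except ConstructorError as exc:
        return str(exc)
    return None


# The duplicate sits in a nested mapping, on the fourth line.
print(find_duplicate("a: 1\nb:\n  c: 2\n  c: 3\n"))
# No duplicates here, so nothing is reported.
print(find_duplicate("a: 1\nb: 2\n"))
```

Raising ConstructorError rather than a plain exception keeps the failure in the same family as PyYAML's other loading errors, so existing handlers for yaml.YAMLError will catch it too.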