Python: How to remove all empty fields in a nested dict

Use a recursive function that returns a new dictionary:

def clean_empty(d):
    if isinstance(d, dict):
        return {
            k: v 
            for k, v in ((k, clean_empty(v)) for k, v in d.items())
            if v
        }
    if isinstance(d, list):
        return [v for v in map(clean_empty, d) if v]
    return d

The {..} construct is a dictionary comprehension; it only includes a key from the original dictionary if its cleaned value v is truthy, i.e. not empty. Similarly, the [..] construct builds a list.

The nested (.. for ..) construct is a generator expression that allows the code to compactly filter empty objects after recursing.
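If the nested comprehension is hard to read, it may help to see the same logic written as explicit loops. This is only an illustrative sketch; the name clean_empty_loop is mine and is meant to behave like the function above:

```python
def clean_empty_loop(d):
    """Loop-based equivalent of clean_empty (illustrative name)."""
    if isinstance(d, dict):
        result = {}
        for k, v in d.items():
            cleaned = clean_empty_loop(v)  # recurse first...
            if cleaned:                    # ...then drop anything left empty
                result[k] = cleaned
        return result
    if isinstance(d, list):
        result = []
        for v in d:
            cleaned = clean_empty_loop(v)
            if cleaned:
                result.append(cleaned)
        return result
    return d  # scalars are returned unchanged


sample = {
    "fruit": [{"apple": 1}, {"banana": None}],
    "veg": [],
    "result": {"apple": 1, "banana": None},
}
print(clean_empty_loop(sample))
# {'fruit': [{'apple': 1}], 'result': {'apple': 1}}
```

The important detail is the order: each value is cleaned before it is tested, so a dict such as {"banana": None} first becomes {} and is then dropped by its parent.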

Another way of constructing such a function is to use the @singledispatch decorator; you then write multiple functions, one per object type:

from functools import singledispatch

@singledispatch
def clean_empty(obj):
    return obj

@clean_empty.register
def _dicts(d: dict):
    items = ((k, clean_empty(v)) for k, v in d.items())
    return {k: v for k, v in items if v}

@clean_empty.register
def _lists(l: list):
    items = map(clean_empty, l)
    return [v for v in items if v]

The above @singledispatch version does exactly the same thing as the first function, but the isinstance() tests are now handled by the decorator implementation, based on the type annotations of the registered functions (registering via annotations requires Python 3.7 or newer). I also put the nested iterators (the generator expression and the map() call) into a separate variable to improve readability further.
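One practical consequence of the dispatch approach is that supporting another container type should only need one more registered function. As a sketch, assuming you also want tuples cleaned (the _tuples handler is my addition, not part of the code above):

```python
from functools import singledispatch


@singledispatch
def clean_empty(obj):
    # Fallback for any type without a registered handler.
    return obj


@clean_empty.register
def _dicts(d: dict):
    items = ((k, clean_empty(v)) for k, v in d.items())
    return {k: v for k, v in items if v}


@clean_empty.register
def _lists(l: list):
    items = map(clean_empty, l)
    return [v for v in items if v]


@clean_empty.register
def _tuples(t: tuple):
    # Hypothetical extra handler: clean tuples and keep them as tuples.
    items = map(clean_empty, t)
    return tuple(v for v in items if v)


print(clean_empty({"a": (1, None, [], {"b": None}), "c": ()}))
# {'a': (1,)}
```

The existing handlers stay untouched; the new type is handled purely by registration.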

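Note that the if v test removes every falsy value, so numeric 0 (integer 0, float 0.0), False and empty strings are cleared along with None and empty containers. You can retain numeric 0 values with if v or v == 0; because False == 0 in Python, that test retains False as well.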
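If you would rather state exactly what counts as "empty" than rely on truthiness, one option is a small predicate. This is a sketch under my own assumptions: is_empty and clean_empty_keep_zero are hypothetical names, and the definition of empty (None, empty strings or bytes, empty dicts and lists) is just one reasonable choice:

```python
def is_empty(value):
    """Hypothetical predicate: None, or a zero-length str/bytes/dict/list."""
    if value is None:
        return True
    return isinstance(value, (str, bytes, dict, list)) and len(value) == 0


def clean_empty_keep_zero(d):
    """Like clean_empty, but keeps 0, 0.0 and False."""
    if isinstance(d, dict):
        cleaned = ((k, clean_empty_keep_zero(v)) for k, v in d.items())
        return {k: v for k, v in cleaned if not is_empty(v)}
    if isinstance(d, list):
        cleaned = map(clean_empty_keep_zero, d)
        return [v for v in cleaned if not is_empty(v)]
    return d


data = {
    "count": 0,
    "ratio": 0.0,
    "flag": False,
    "name": "",
    "tags": [],
    "meta": {"note": None},
}
print(clean_empty_keep_zero(data))
# {'count': 0, 'ratio': 0.0, 'flag': False}
```

Adjust is_empty to taste, for example to keep empty strings or to treat other container types as empty.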

Demo of the first function:

>>> sample = {
...     "fruit": [
...         {"apple": 1},
...         {"banana": None}
...     ],
...     "veg": [],
...     "result": {
...         "apple": 1,
...         "banana": None
...     }
... }
>>> def clean_empty(d):
...     if isinstance(d, dict):
...         return {
...             k: v
...             for k, v in ((k, clean_empty(v)) for k, v in d.items())
...             if v
...         }
...     if isinstance(d, list):
...         return [v for v in map(clean_empty, d) if v]
...     return d
... 
>>> clean_empty(sample)
{'fruit': [{'apple': 1}], 'result': {'apple': 1}}

If you want a full-featured, yet succinct approach to handling real-world data structures which are often nested, and can even contain cycles and other kinds of containers, I recommend looking at the remap utility from the boltons utility package.

After pip install boltons or copying iterutils.py into your project, just do:

from boltons.iterutils import remap

data = {'veg': [], 'fruit': [{'apple': 1}, {'banana': None}], 'result': {'apple': 1, 'banana': None}}

drop_falsey = lambda path, key, value: bool(value)
clean = remap(data, visit=drop_falsey)
print(clean)

# Output:
{'fruit': [{'apple': 1}], 'result': {'apple': 1}}

This page has many more examples, including ones working with much larger objects from GitHub's API.

It's pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn't handle, you can bug me to fix it right here.
