Normalizing dictionary values

Try this to modify the dictionary in place:

d = {'a': 0.2, 'b': 0.3}
factor = 1.0 / sum(d.itervalues())
for k in d:
    d[k] = d[k] * factor

Result:

>>> d
{'a': 0.4, 'b': 0.6}
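
One caveat the snippet above glosses over: if the values sum to zero (or the dictionary is empty) the division blows up with a ZeroDivisionError. A guarded sketch, assuming you want an explicit error rather than a crash (the helper name is just made up):

def normalise_in_place(d):
    total = sum(d.itervalues())
    if total == 0:
        raise ValueError("cannot normalise values that sum to zero")
    factor = 1.0 / total
    for k in d:
        d[k] = d[k] * factor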

Alternatively, to produce a new dictionary instead, use a dict comprehension:

d = {'a': 0.2, 'b': 0.3}
factor = 1.0 / sum(d.itervalues())
normalised_d = {k: v * factor for k, v in d.iteritems()}

Note the use of d.iteritems(), which uses less memory than d.items(), so it is better suited to a large dictionary.
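
If you want to see the difference for yourself, a rough sketch with sys.getsizeof (it only measures the containers themselves, not the tuples they point at, but it makes the point):

import sys

d = dict((i, float(i)) for i in xrange(100000))
print sys.getsizeof(d.items())      # a full list of 100000 (key, value) tuples
print sys.getsizeof(d.iteritems())  # a tiny, fixed-size iterator object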

EDIT: Since there are quite a few of them, and getting this right seems to be important, I've summarised all the ideas from the comments on this answer into the following (including borrowing something from this post):

import math
import operator

def really_safe_normalise_in_place(d):
    # math.fsum adds the floats with much less rounding error than sum()
    factor = 1.0 / math.fsum(d.itervalues())
    for k in d:
        d[k] = d[k] * factor
    # dump whatever rounding error remains onto the largest value
    key_for_max = max(d.iteritems(), key=operator.itemgetter(1))[0]
    diff = 1.0 - math.fsum(d.itervalues())
    #print "discrepancy = " + str(diff)
    d[key_for_max] += diff

d = {v: v + 1.0/v for v in xrange(1, 1000001)}
really_safe_normalise_in_place(d)
print math.fsum(d.itervalues())

It took a couple of goes to come up with a dictionary that actually produced a non-zero error when normalising, but I hope this illustrates the point.
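
If you want a quick illustration of why the fix-up step is there at all (this is just a toy example, not from the comments): plain sum() accumulates rounding error that math.fsum() avoids:

import math

values = [0.1] * 10
print sum(values) == 1.0        # False, the rounding errors accumulate
print math.fsum(values) == 1.0  # True, fsum keeps track of the lost bits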

EDIT: For Python 3.0, see the following change: Python 3.0 Wiki Built-in Changes

Remove dict.iteritems(), dict.iterkeys(), and dict.itervalues().

Instead: use dict.items(), dict.keys(), and dict.values() respectively.
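
So under Python 3 the same idea, as a quick sketch of what the code above becomes (values() and items() are already lazy views, so there is no itervalues()/iteritems() to reach for):

d = {'a': 0.2, 'b': 0.3}
factor = 1.0 / sum(d.values())
normalised_d = {k: v * factor for k, v in d.items()}
print(normalised_d)   # {'a': 0.4, 'b': 0.6}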


If you prefer a reusable function, and one that can scale to a total other than 1.0, pass the target in as a parameter:

def normalize(d, target=1.0):
    raw = sum(d.itervalues())
    factor = target / raw
    return {key: value * factor for key, value in d.iteritems()}

Use it like this:

>>> data = {'a': 0.2, 'b': 0.3, 'c': 1.5}
>>> normalize(data)
{'b': 0.15, 'c': 0.75, 'a': 0.1}
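
Because of the target parameter the same function can scale to any total, e.g. to turn the values into percentages (the numbers just follow from factor = target/raw = 100/2.0 = 50.0):

>>> normalize(data, target=100)
{'b': 15.0, 'c': 75.0, 'a': 10.0}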