Hashing an immutable dictionary in Python

Since the dictionary is immutable, you can create the hash when the dictionary is created and return it directly. My suggestion would be to create a frozenset from items (in 3+; iteritems in 2.7), hash it, and store the hash.

To provide an explicit example:

>>>> frozenset(Counter([1, 1, 1, 2, 3, 3, 4]).iteritems())
frozenset([(3, 2), (1, 3), (4, 1), (2, 1)])
>>>> hash(frozenset(Counter([1, 1, 1, 2, 3, 3, 4]).iteritems()))
-3071743570178645657
>>>> hash(frozenset(Counter([1, 1, 1, 2, 3, 4]).iteritems()))
-6559486438209652990

To clarify why I prefer a frozenset to a tuple of sorted items: a frozenset doesn't have to sort the items, and so the initial hash completes in O(n) time rather than O(n log n) time. This can be seen from the frozenset_hash and set_next implementations.

See also this great answer from Raymond Hettinger describing his implementation of the frozenset hash function. There he explicitly explains how the hash function avoids having to sort values to get a stable, order insensitive value.

Have you considered hash(sorted(hash(x) for x in self.items()))? That way, you are only sorting integers, and don't have to build a list.

You could also xor the element hashes together, but frankly I don't how well that would work (would you have a lot of collisions?). Speaking of collisions, don't you have to implement the __eq__ method?

Alternatively, similar to my answer here, hash(frozenset(self.items())).

Hashing an immutable dictionary in Python

Tags:

Python

Dictionary

Hash

Immutability

Python 3.2

Related

Recent Posts