How can I make a python dataclass hashable?

I'd like to add a special note on using unsafe_hash.

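You can exclude a field from the generated __hash__ by setting compare=False or hash=False on it. (hash defaults to None, which means it follows the value of compare.) Note that compare=False also removes the field from __eq__ and the ordering methods, while hash=False removes it from the hash only.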

This might be useful if you store nodes in a graph but want to mark them visited without breaking their hashing (e.g. if they're in a set of unvisited nodes).

from dataclasses import dataclass, field

@dataclass(unsafe_hash=True)
class node:
    x: int
    visit_count: int = field(default=10, compare=False)  # hash follows compare, so the field is left out of both __hash__ and __eq__.
    # visit_count: int = field(default=10, hash=False)   # also valid, arguably easier to read: left out of __hash__ only, but __eq__ still compares it.
    # visit_count: int = 10   # plain field: part of the hash, so mutating it breaks set lookup (3* is printed).

s = set()
n = node(1)
s.add(n)
if n in s: print("1* n in s")
n.visit_count = 11
if n in s:
    print("2* n still in s")
else:
    print("3* n is lost to the void because hashing broke.")
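
To make the difference between the two field settings concrete, here is a minimal sketch. The class names CompareOff and HashOff are made up for illustration and are not part of the code above.

```python
from dataclasses import dataclass, field

@dataclass(unsafe_hash=True)
class CompareOff:  # hypothetical name
    x: int
    # Excluded from __eq__ and, because hash follows compare, from __hash__ too.
    visit_count: int = field(default=0, compare=False)

@dataclass(unsafe_hash=True)
class HashOff:  # hypothetical name
    x: int
    # Excluded from __hash__ only; __eq__ still looks at it.
    visit_count: int = field(default=0, hash=False)

a, b = CompareOff(1, 0), CompareOff(1, 5)
c, d = HashOff(1, 0), HashOff(1, 5)

print(hash(a) == hash(b), a == b)  # True True
print(hash(c) == hash(d), c == d)  # True False
print(b in {a}, d in {c})          # True False
```

With hash=False the hashes still match, but two nodes that differ only in visit_count compare unequal, so a set lookup with a different but "same" node fails. That is the kind of compare code that can break.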

This took me hours to figure out. The most useful further reading I found is the Python documentation on dataclasses; see in particular the sections on field() and on the dataclass() parameters: https://docs.python.org/3/library/dataclasses.html


From the docs:

Here are the rules governing implicit creation of a __hash__() method:

[...]

If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).

Since you set eq=True and left frozen at the default (False), your dataclass is unhashable.
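
You can check these rules yourself with a small sketch. Point and IdPoint are hypothetical classes standing in for yours, not names from the question.

```python
from dataclasses import dataclass

@dataclass  # eq=True and frozen=False are the defaults
class Point:  # hypothetical name
    x: int

@dataclass(eq=False)  # no __eq__ generated, so object.__hash__ is kept
class IdPoint:  # hypothetical name
    x: int

hash_is_none = Point.__hash__ is None  # dataclass() set __hash__ to None

try:
    hash(Point(1))
    raised = False
except TypeError:
    raised = True  # instances of Point are unhashable

p, q = IdPoint(1), IdPoint(1)
# Identity-based equality and hashing: equal field values do not make p and q equal.
id_based = (p != q) and (p in {p}) and (q not in {p})

print(hash_is_none, raised, id_based)  # True True True
```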

You have 3 options:

  • Set frozen=True (in addition to eq=True), which will make your class immutable and hashable.
  • Set unsafe_hash=True, which will create a __hash__ method but leave your class mutable, thus risking problems if an instance of your class is modified while stored in a dict or set:

    cat = Category('foo', 'bar')
    categories = {cat}
    cat.id = 'baz'
    
    print(cat in categories)  # False
    
  • Manually implement a __hash__ method.
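
A rough sketch of the first and third options. I don't know the real fields of Category, so the class names and the id/name fields below are assumptions, and hashing on id alone is just one possible choice.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)  # eq=True by default, so __hash__ is generated
class FrozenCategory:  # hypothetical name and fields
    id: str
    name: str

@dataclass
class ManualCategory:  # hypothetical name and fields
    id: str
    name: str

    def __hash__(self):
        # Assumption: id is treated as immutable, so only id is hashed.
        # dataclass() leaves an explicitly defined __hash__ alone.
        return hash(self.id)

frozen_cat = FrozenCategory('foo', 'bar')
try:
    frozen_cat.id = 'baz'
    mutated = True
except FrozenInstanceError:
    mutated = False  # assignment is rejected on a frozen dataclass

manual_cat = ManualCategory('foo', 'bar')
categories = {manual_cat}
manual_cat.name = 'baz'  # name is not part of the hash

print(mutated)                                            # False
print(frozen_cat in {FrozenCategory('foo', 'bar')})       # True
print(manual_cat in categories)                           # True
```

The manual version only stays safe as long as nothing changes id while the instance sits in a set or dict; mutating id would lose the instance just like in the unsafe_hash example.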