Validating input when mutating a dataclass

Perhaps lock down the attribute using getters and setters instead of mutating the attribute directly. If you then extract your validation logic into a separate method, you can validate the same way from both your setter and the __post_init__ function.
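
As a rough sketch of that idea, the rule can live in one helper that both __post_init__ and the setter call. The names _validate_age and set_age are hypothetical, chosen only for illustration:

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: float

    def __post_init__(self):
        # Validate the value passed to the generated __init__.
        self._validate_age(self.age)

    @staticmethod
    def _validate_age(value):
        # Single place that holds the rule (hypothetical helper name).
        if value <= 0:
            raise ValueError(f"age must be positive, got {value!r}")

    def set_age(self, value):
        # Setter that reuses the same check before storing the value.
        self._validate_age(value)
        self.age = value
```

Plain assignment (person.age = -1) still bypasses the check in this sketch, which is why the attribute itself would need to be locked down further.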


A simple and flexible solution is to override the __setattr__ method:

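from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: float

    def __setattr__(self, name, value):
        # Runs for every assignment, including those made by the generated __init__.
        # An explicit raise is used because assert statements are skipped under "python -O".
        if name == 'age' and value <= 0:
            raise ValueError(f"value of {name} must be positive: {value}")
        super().__setattr__(name, value)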
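
If more than one field needs a check, the same override can look the rule up in a table. This is only a sketch: the _checks mapping and the must_be_positive function are hypothetical names, not part of the dataclasses module.

```python
from dataclasses import dataclass


def must_be_positive(name, value):
    # Hypothetical check function: raises instead of returning a flag.
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value!r}")


@dataclass
class Person:
    name: str
    age: float

    # Un-annotated class attribute, so dataclass does not treat it as a field.
    _checks = {"age": must_be_positive}

    def __setattr__(self, name, value):
        check = self._checks.get(name)
        if check is not None:
            check(name, value)
        super().__setattr__(name, value)


person = Person("Ann", 30)  # __init__ assigns through __setattr__, so this is checked
person.age = 31             # later mutation is checked the same way
```

Because the generated __init__ assigns each field with ordinary attribute assignment, no separate __post_init__ is needed here.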

Dataclasses are a mechanism that provides a default __init__ accepting the attributes as parameters and a nice representation, plus some niceties like the __post_init__ hook.

Fortunately, they do not interfere with any other attribute-access mechanism in Python, so you can still define your dataclass attributes as property descriptors, or as instances of a custom descriptor class if you want. That way, any attribute access goes through your getter and setter functions automatically.

The only drawback of using the built-in property is that you have to call it the "old way" rather than with the decorator syntax, because only the call form lets you write an annotation for the attribute.

So, "descriptors" are special objects assigned to class attributes in Python, such that any access to that attribute calls the descriptor's __get__, __set__ or __delete__ method. The property built-in is a convenience for building a descriptor from 1 to 3 functions that will be called from those methods.
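
To make that concrete, here is a small sketch (the class Demo and its helper names are hypothetical) showing that a property object stored on the class is what actually handles reads and writes of the attribute:

```python
class Demo:
    def _get(self):
        return self.__dict__.get("x", 0)

    def _set(self, value):
        self.__dict__["x"] = value

    # property() wraps the two functions in a descriptor object.
    x = property(_get, _set)


d = Demo()
d.x = 5  # routed to the descriptor's __set__, which calls _set

# The descriptor itself lives in the class namespace.
prop = Demo.__dict__["x"]

# Calling the descriptor methods by hand does the same thing as attribute access.
value_via_descriptor = prop.__get__(d, Demo)
prop.__set__(d, 7)
```

After the last line, reading d.x goes through the same descriptor and returns the value stored by the manual __set__ call.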

So, with no custom descriptor-thing, you could do:

from dataclasses import dataclass

@dataclass
class MyClass:
    def setname(self, value):
        if not isinstance(value, str):
            raise TypeError(...)
        self.__dict__["name"] = value
    def getname(self):
        return self.__dict__.get("name")
    name: str = property(getname, setname)
    # optionally, you can delete the getter and setter from the class body:
    del setname, getname

By using this approach you will have to write each attribute's access as two methods/functions, but will no longer need to write your __post_init__: each attribute will validate itself.

Also note that this example took the less usual approach of storing the attributes directly in the instance's __dict__ under their own names. In most examples around the web, the practice is to use normal attribute access with the name prefixed by a _. That leaves the extra attributes polluting the output of dir() on your final instance, and those private attributes are unguarded.
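
The difference can be sketched side by side. Both classes below are hypothetical and exist only for comparison:

```python
class UnderscoreStorage:
    # Common pattern: the property stores its value in a separate "_name" attribute.
    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError("name must be a str")
        self._name = value


class SameNameStorage:
    # Pattern from the example above: the value lives in __dict__ under the public name.
    def _get(self):
        return self.__dict__["name"]

    def _set(self, value):
        if not isinstance(value, str):
            raise TypeError("name must be a str")
        self.__dict__["name"] = value

    name = property(_get, _set)
    del _get, _set


a = UnderscoreStorage()
a.name = "Ann"
a._name = 42  # bypasses the setter, so the invalid value is accepted

b = SameNameStorage()
b.name = "Ann"  # vars(b) now holds only the public name
```

With the second class there is no extra attribute to assign to by accident, although writing to b.__dict__ directly would still skip the check.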

Another approach is to write your own descriptor class, and let it check the type and other properties of the attributes you want to guard. This can be as sophisticated as you want, culminating in your own framework. So for a descriptor class that checks the attribute type and accepts a list of validators, you will need:

from dataclasses import dataclass

def positive_validator(name, value):
    if value <= 0:
        raise ValueError(f"values for {name!r} have to be positive")

class MyAttr:
    def __init__(self, type, validators=()):
        self.type = type
        self.validators = validators

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        # Accessed on the class itself (no instance): return the descriptor.
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __delete__(self, instance):
        del instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.type):
            raise TypeError(f"{self.name!r} values must be of type {self.type!r}")
        for validator in self.validators:
            validator(self.name, value)
        instance.__dict__[self.name] = value

# And now

@dataclass
class Person:
    name: str = MyAttr(str)
    age: float = MyAttr((int, float), [positive_validator,])
That is it - creating your own descriptor class requires a bit more knowledge about Python, but the code given above should be good for use, even in production - you are welcome to use it.

Note that you could easily add a lot of other checks and transforms for each of your attributes - and the code in __set_name__ itself could be changed to introspect the __annotations__ in the owner class to automatically take note of the types - so that the type parameter would not be needed for the MyAttr class itself. But as I said before: you can make this as sophisticated as you want.
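
A minimal sketch of that last idea follows. The class name AnnotatedAttr is hypothetical, and the sketch assumes each annotation is a plain class such as int or str; string annotations (for example under "from __future__ import annotations") or typing constructs would need extra handling.

```python
from dataclasses import dataclass


class AnnotatedAttr:
    """Hypothetical descriptor that takes its type from the owner's annotations."""

    def __set_name__(self, owner, name):
        self.name = name
        # Assumes the annotation is a real class usable with isinstance().
        self.type = owner.__annotations__[name]

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.type):
            raise TypeError(
                f"{self.name!r} must be of type {self.type.__name__}"
            )
        instance.__dict__[self.name] = value


@dataclass
class Point:
    x: int = AnnotatedAttr()
    y: int = AnnotatedAttr()


p = Point(1, 2)
p.y = 5  # accepted: matches the annotated type
```

Assigning a value of another type, such as p.y = "z", raises TypeError without the type ever being passed to the descriptor explicitly.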


The answer provided by @jsbueno is great, but it doesn't allow for default arguments. I expanded it to allow defaults:

from dataclasses import dataclass
from typing import Union

def positive_validator(name, value):
    if value <= 0:
        raise ValueError(f"values for {name!r} have to be positive")

class MyAttr:
    def __init__(self, typ, validators=(), default=None):
        if isinstance(typ, type):
            typ = (typ,)
        elif not (isinstance(typ, tuple) and all(isinstance(t, type) for t in typ)):
            raise TypeError(f"'typ' must be a {type!r} or a {tuple!r} of {type!r}")
        self.type = typ
        self.name = f"MyAttr_{self.type!r}"  # placeholder until __set_name__ runs
        self.validators = validators
        self.default = default
        # Check the default up front, unless None only means "no default given".
        if self.default is not None or type(None) in typ:
            self._validate(self.default)

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            # Class-level access: dataclass picks this object up as the field default.
            return self
        return instance.__dict__[self.name]

    def __delete__(self, instance):
        del instance.__dict__[self.name]

    def _validate(self, value):
        for validator in self.validators:
            validator(self.name, value)

    def __set__(self, instance, value):
        if value is self:
            # No argument was passed to __init__, so fall back to the default.
            value = self.default
        if not isinstance(value, self.type):
            raise TypeError(f"{self.name!r} values must be of type {self.type!r}")
        self._validate(value)
        instance.__dict__[self.name] = value

# And now

@dataclass
class Person:
    name: str = MyAttr(str, [])  # required attribute, must be a str, cannot be None
    age: float = MyAttr((int, float), [positive_validator], 2)  # optional attribute, must be an int or float > 0, defaults to 2
    posessions: Union[list, type(None)] = MyAttr((list, type(None)), [])  # optional attribute in which None is the default
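
The check for the descriptor itself inside __set__ works because dataclass reads the class attribute to find each field's default, and here that lookup returns the descriptor object. A stripped-down sketch of just that mechanism (the names Attr and Item are hypothetical, and validation is left out on purpose):

```python
from dataclasses import dataclass, fields


class Attr:
    def __init__(self, default=None):
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # this is what dataclass records as the field default
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value is self:
            # __init__ was called without this argument, so it passed the
            # recorded "default", which is the descriptor itself.
            value = self.default
        instance.__dict__[self.name] = value


@dataclass
class Item:
    count: int = Attr(default=2)


recorded_default = fields(Item)[0].default
omitted = Item()
given = Item(5)
```

Here recorded_default is the Attr instance, omitted.count falls back to the descriptor's own default, and given.count keeps the value that was passed in.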