Event Sourcing: proper way of rolling back aggregate state

How the compensation events are generated should be the concern of the Story aggregate (after all, that's the point of an aggregate in event sourcing - it's just the validator of commands and generator of events for a particular stream).

Presumably you are following something like a typical CQRS/ES flow:

  • The client sends an Undo command, which presumably says which version to undo back to and which story it is targeting.
  • The Undo command handler loads the Story aggregate in the usual way: from a snapshot, by applying the aggregate's events to the aggregate, or both.
  • In some way, the command is passed to the aggregate (possibly a method call with args extracted from the command, or just passing the command directly to the aggregate)
  • The aggregate "returns" in some way the events to persist, assuming the undo command is valid. These are the compensating events.
  • compute the compensation event for each of the occurred events
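
The flow above can be sketched roughly as follows. This is only an illustration under assumptions: the `TitleChanged` event, the `UndoToVersion` command, the dict-based event store and the single-title `Story` are all made-up names, not something from your code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TitleChanged:      # hypothetical event
    title: str


@dataclass(frozen=True)
class UndoToVersion:     # hypothetical command
    story_id: str
    version: int


class Story:
    def __init__(self):
        self.titles = [""]   # titles[v] is the title as of version v

    @property
    def version(self):
        return len(self.titles) - 1

    def apply(self, event):
        self.titles.append(event.title)

    def undo(self, command):
        # Validate the command, then return the compensating events.
        # Nothing is mutated here; the events are applied once persisted.
        if not 0 <= command.version < self.version:
            raise ValueError("nothing to undo back to")
        return [TitleChanged(self.titles[command.version])]


def handle_undo(event_store, command):
    # Load the aggregate in the usual way by replaying its stream.
    story = Story()
    for event in event_store[command.story_id]:
        story.apply(event)
    # Hand the command to the aggregate and persist whatever it returns.
    new_events = story.undo(command)
    event_store[command.story_id].extend(new_events)
    return new_events


store = {"story-1": [TitleChanged("Draft"), TitleChanged("Final")]}
emitted = handle_undo(store, UndoToVersion("story-1", 1))
```

Here the undo does not delete anything: it appends one more `TitleChanged` event that restores the title from version 1.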

...

"Unfortunately, the second step of the previous procedure is not always possible."

Why not? The aggregate has been passed all previous events, so what does it need that it doesn't have? The aggregate doesn't just see the events you want to roll back, it necessarily processes all events for that aggregate ever.

You really have two options: reduce the book-keeping that the aggregate needs to do by having the command handler help out in some way, or let the aggregate manage the whole process internally.

Command handler helps out: The command handler extracts from the command the version the user wants to roll back to, and then recreates the aggregate as-of that version (applying events in the usual way), in addition to creating the current aggregate. Then the old aggregate gets passed to the aggregate's undo method along with the command, so that the aggregate can then do state comparison more easily.

You might consider this to be a bit hacky, but it seems moderately harmless, and could significantly simplify the aggregate code.
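
A minimal sketch of the "command handler helps out" variant, assuming a hypothetical `FieldSet` event (with `None` meaning "field cleared") and a `Story` whose state is a plain dict of fields. For brevity the handler passes only the old aggregate to `undo`, not the command itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSet:                 # hypothetical event
    name: str
    value: Optional[str]        # None means the field was cleared


class Story:
    def __init__(self):
        self.version = 0
        self.fields = {}

    def apply(self, event):
        if event.value is None:
            self.fields.pop(event.name, None)
        else:
            self.fields[event.name] = event.value
        self.version += 1

    def undo(self, old):
        """Return compensating events that turn this state into old's state."""
        if old.version >= self.version:
            raise ValueError("target version is not in the past")
        names = sorted(set(self.fields) | set(old.fields))
        return [FieldSet(n, old.fields.get(n)) for n in names
                if self.fields.get(n) != old.fields.get(n)]


def load(events, up_to=None):
    # Replay the stream; up_to=None replays everything.
    story = Story()
    for event in events[:up_to]:
        story.apply(event)
    return story


def handle_undo(events, target_version):
    current = load(events)                      # aggregate as of now
    old = load(events, up_to=target_version)    # aggregate as of the target
    return current.undo(old)


history = [FieldSet("title", "Draft"), FieldSet("body", "Once"),
           FieldSet("title", "Final")]
compensation = handle_undo(history, 1)
```

The aggregate still decides which events to emit; the handler only saves it from having to reconstruct the old state itself.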

Aggregate is on its own: As events are applied to the aggregate, it adds to its state whatever book-keeping it needs to be able to compute the compensating events if it receives an undo command. This could be a pre-computed map of compensating events; a list of every previous state that can potentially be reverted to (to allow state comparison); the list of events the aggregate has processed (so it can compute the previous state itself in the undo method); or whatever else it needs. It just stores this in its in-memory state (and snapshot state, if applicable).
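
As one possible shape for the pre-computed option, the aggregate can record the inverse of each event at the moment it applies it. This is a sketch under assumptions: `FieldSet` (with `None` meaning "field cleared") and the `inverses` list are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSet:                 # hypothetical event
    name: str
    value: Optional[str]        # None means the field was cleared


class Story:
    def __init__(self):
        self.fields = {}
        # inverses[i] undoes the event that produced version i + 1
        self.inverses = []

    @property
    def version(self):
        return len(self.inverses)

    def apply(self, event):
        # Book-keeping: remember how to get back to the value seen before.
        self.inverses.append(FieldSet(event.name, self.fields.get(event.name)))
        if event.value is None:
            self.fields.pop(event.name, None)
        else:
            self.fields[event.name] = event.value

    def undo(self, target_version):
        if not 0 <= target_version < self.version:
            raise ValueError("target version is not in the past")
        # Newest change is undone first.
        return list(reversed(self.inverses[target_version:]))


story = Story()
for e in [FieldSet("title", "Draft"), FieldSet("body", "Once"),
          FieldSet("title", "Final")]:
    story.apply(e)

compensation = story.undo(1)
for e in compensation:
    story.apply(e)
```

Because the compensating events go through `apply` like any other event, they get inverses of their own, so an undo can itself be undone. The `inverses` list is also the part that grows with the stream, which is the performance concern discussed next.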

The main concern with the aggregate doing it on its own is performance - if the size of the book-keeping state is large, the simplification of allowing the command handler to pass the previous state would be worthwhile. In any case, you should be able to switch between the approaches at any time in the future without any issues (except possibly needing to rebuild your snapshots, if you have them).


My 2 cents.

For a rollback operation, an orchestration class is responsible for handling it. It publishes an aggregate_modify_generated event, and a projection listening for this event fetches the current state of the aggregates when it receives it. If any of the aggregates fails, it should generate a failure event; on receiving it, the orchestration class generates an aggregate_modify_rollback event, which that projection receives and uses to reset the aggregate state to the previously fetched state.
One common projector can do the task, because the events will have aggregate id.
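
A rough sketch of that idea, with heavy simplifications: direct method calls stand in for the message bus, a raised exception stands in for the failure event, and `RollbackProjector`, `Orchestrator` and the dict-based state store are hypothetical names. Only the two event names come from the description above.

```python
import copy


class RollbackProjector:
    def __init__(self, states):
        self.states = states    # hypothetical store: aggregate id -> state dict
        self.saved = {}         # state fetched before the modification

    def handle(self, event):
        agg_id = event["aggregate_id"]
        if event["type"] == "aggregate_modify_generated":
            self.saved[agg_id] = copy.deepcopy(self.states[agg_id])
        elif event["type"] == "aggregate_modify_rollback":
            self.states[agg_id] = self.saved.pop(agg_id)


class Orchestrator:
    def __init__(self, states, projector):
        self.states = states
        self.projector = projector

    def publish(self, event_type, agg_ids):
        for agg_id in agg_ids:
            self.projector.handle({"type": event_type, "aggregate_id": agg_id})

    def modify(self, agg_ids, change):
        self.publish("aggregate_modify_generated", agg_ids)
        try:
            for agg_id in agg_ids:
                change(self.states[agg_id])
        except Exception:
            # Stand-in for receiving a failure event from an aggregate.
            self.publish("aggregate_modify_rollback", agg_ids)
            return False
        return True


def increment(state):
    state["n"] += 1
    if state["n"] > 2:
        raise ValueError("limit exceeded")


states = {"a": {"n": 1}, "b": {"n": 2}}
orchestrator = Orchestrator(states, RollbackProjector(states))
ok = orchestrator.modify(["a", "b"], increment)
```

In this run the change succeeds on "a" and fails on "b", so both are put back to the state captured before the modification. The single projector works for every aggregate because each event carries the aggregate id.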