Data privacy: How to actually safely delete events from the event store

Hi,

in some cases, there are data privacy requirements. They might prescribe, e.g., that all events for a certain aggregate are really and irreversibly deleted from the event store. The question is how this can be done safely with Axon, i.e. without out-of-sync caching issues and the like.

br
Marek

Hi Marek,

I don’t think a policy will ever force you to delete events entirely. As far as I have seen (GDPR being a good example), only certain (personal) information has to be removed from storage. While this may seem difficult in combination with an event store, the solution is actually rather simple.

We have recently given a webinar about this: https://axoniq.io/events/2017/11/gdpr-webinar.html

Cheers,

Allard

Hi Allard,

thanks, our whole team has just watched the webinar. The idea is great, although there might be some difficulties in our case. For example, we currently have a generic string-based event format: instead of actual event attributes, each corresponding to an attribute of the aggregate itself, we have a key (attribute name as String) mapped to a value (new value of the attribute, in serialized form) plus the type of change (value change, collection add etc.). This design is quite immune to aggregate type changes and minimizes ES migration effort (upcasting). But maybe it would make the use of the AxonIQ GDPR module difficult.
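
For readers who want to picture the format described above, here is a minimal sketch. All class, field and attribute names are hypothetical, only the two change types mentioned in the post are modelled, and none of this is Axon API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names here are hypothetical; none of this is Axon API.
enum ChangeType { VALUE_CHANGE, COLLECTION_ADD }

// One entry of the generic payload: attribute name, serialized new value, kind of change.
final class AttributeChange {
    final String attributeName;
    final String serializedValue;
    final ChangeType type;

    AttributeChange(String attributeName, String serializedValue, ChangeType type) {
        this.attributeName = attributeName;
        this.serializedValue = serializedValue;
        this.type = type;
    }
}

// A generic event: the aggregate it belongs to plus the attribute changes it carries.
final class GenericChangeEvent {
    final String aggregateId;
    final List<AttributeChange> changes;

    GenericChangeEvent(String aggregateId, List<AttributeChange> changes) {
        this.aggregateId = aggregateId;
        this.changes = new ArrayList<>(changes);
    }
}

final class GenericEventSketch {
    // Applies an event to a simple "attribute name -> values" view of the aggregate state.
    static void apply(Map<String, List<String>> state, GenericChangeEvent event) {
        for (AttributeChange change : event.changes) {
            List<String> values = state.computeIfAbsent(change.attributeName, k -> new ArrayList<>());
            if (change.type == ChangeType.VALUE_CHANGE) {
                values.clear(); // a value change replaces whatever was stored before
            }
            values.add(change.serializedValue); // a collection add just appends
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> state = new HashMap<>();
        apply(state, new GenericChangeEvent("case-1", Arrays.asList(
                new AttributeChange("plaintiffName", "\"Jane Doe\"", ChangeType.VALUE_CHANGE),
                new AttributeChange("attachments", "\"a.pdf\"", ChangeType.COLLECTION_ADD))));
        System.out.println(state);
    }
}
```

Because every event class carries the same kind of payload, adding or renaming an aggregate attribute changes only the keys, not the event classes.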

That is why we are still considering really deleting the events. We could probably live with losing more data than required by doing that (this is already the case in previous versions of the system). But in his webinar Frans also mentioned consistency issues. Is it safe for Axon to delete whole event streams in a JPA ES? E.g. would Axon react normally after such a deletion when asked to load an aggregate whose events have been deleted? Should we expect any caching issues, for example?

Of course we would make sure on the application level that the relations between aggregates stay consistent, e.g. there are no “dangling references”.

br
Marek

Hi Marek,

I don’t really understand your argument. You’re saying “instead of actual event attributes each corresponding to an attribute of the aggregate itself”, but that’s an anti-pattern, even when just using classes to represent events. Events describe functional/business behavior, not changes in attributes of an Aggregate.
The AxonIQ GDPR module works against Java classes, but the structure of these classes doesn’t matter much. How you serialize these classes matters even less.

Cheers,

Allard

Hi Allard,

of course, technically speaking it is an anti-pattern, even though we have a separate class for each event type resulting from a command type, each representing a distinct business logic action. All these event classes have that key-value-store-like content. We will have approx. 100 event classes just for one of the aggregates, which has around 100 different attributes altogether. You may think this is bad design as such, but the domain of this system is German law… :wink: A world away from a customer-order example :wink: I am quite sure that if we define normal event classes, each of them having in essence a subset of those 100 attributes, we will often need to write upcasters as bugs and extensions in the command handlers are found and, as a consequence, event types change in the future… Still, you are right, and we are still considering changing to normal events…

Serialization was not my point; we fully understand the approach of AxonIQ GDPR. Besides final deletion we also have the requirement of anonymisation, which AxonIQ GDPR would solve as well.

Another issue, of course, is the cost of AxonIQ GDPR.

But that was not my question, actually ;-( The question still is the deletion of events in Axon. Could you please say whether it is safe to do this with Axon from a technical perspective?

br
Marek

Hi Marek,

I didn’t mean that events with a lot of properties are an anti-pattern, but considering these properties as properties of an Aggregate is. I’ve been involved in plenty of projects that go way beyond the simple use cases.
If you want to know about the pricing of the GDPR module, please send an email to sales@axoniq.io. I’m sure they’ll be happy to answer :wink:

From an Axon perspective, it’s usually not a problem to remove events. Just make sure your application is fine with not having them (in a replay, for instance). If problems occur, they are most likely to be because of how the application expects certain things, not so much Axon itself.
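
To make the caching concern from the question concrete, here is a minimal in-memory sketch. The classes `InMemoryEventStore` and `CachingRepository` are hypothetical stand-ins, not Axon's event store or caching repository. In a JPA-backed store the deletion itself would presumably be a delete of the event entries (and snapshot entries, if used) for the aggregate identifier; that part is an assumption and is not shown. The sketch only illustrates that a cache sitting in front of the store has to be evicted as part of the deletion.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for an event store: one list of events per aggregate id.
final class InMemoryEventStore {
    private final Map<String, List<String>> streams = new HashMap<>();

    void append(String aggregateId, String event) {
        streams.computeIfAbsent(aggregateId, k -> new ArrayList<>()).add(event);
    }

    List<String> read(String aggregateId) {
        return new ArrayList<>(streams.getOrDefault(aggregateId, Collections.<String>emptyList()));
    }

    // Removes the whole stream and returns how many events were deleted.
    int deleteStream(String aggregateId) {
        List<String> removed = streams.remove(aggregateId);
        return removed == null ? 0 : removed.size();
    }
}

// Hypothetical repository that caches loaded streams, to show the stale-cache risk.
final class CachingRepository {
    private final InMemoryEventStore store;
    private final Map<String, List<String>> cache = new HashMap<>();

    CachingRepository(InMemoryEventStore store) {
        this.store = store;
    }

    // Returns the events of the aggregate, or empty if no events exist (any more).
    Optional<List<String>> load(String aggregateId) {
        List<String> cached = cache.get(aggregateId);
        if (cached != null) {
            return Optional.of(cached);
        }
        List<String> events = store.read(aggregateId);
        if (events.isEmpty()) {
            return Optional.empty();
        }
        cache.put(aggregateId, events);
        return Optional.of(events);
    }

    void evict(String aggregateId) {
        cache.remove(aggregateId);
    }
}

final class DeletionSketch {
    public static void main(String[] args) {
        InMemoryEventStore store = new InMemoryEventStore();
        CachingRepository repository = new CachingRepository(store);
        store.append("person-1", "PersonRegistered");
        repository.load("person-1");            // fills the cache

        store.deleteStream("person-1");         // events are gone from the store...
        System.out.println(repository.load("person-1").isPresent()); // ...but the cache still answers

        repository.evict("person-1");           // evicting is part of a safe deletion
        System.out.println(repository.load("person-1").isPresent());
    }
}
```

The application also has to decide what "aggregate not found" means after the deletion; that is the kind of application-level expectation mentioned above.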

Cheers,

Allard

Hi Allard,

sure, you (and Axon) have been involved in really complex systems; I didn’t mean it that way. We have chosen Axon for that reason.

I’ve got you from the beginning, I hope :wink: Events describe a change in the past in the state of one or more Aggregates and they should contain only properties needed to specify what actually happened. Those may or may not be identical to some properties of an Aggregate.
In our case, they actually seem to be always identical to those properties so far.

Thanks,

M.

Hi all,

Whether you can safely delete events depends on the events you’re going to delete, and on how you model your events & aggregates. I’ve done this many times on different projects (not because of the GDPR, but for other reasons), and never had a problem. Of course, you need to update projections that still expose data that used to be part of the events you just deleted.
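
As a minimal sketch of the projection point, assuming a hypothetical read model `PersonProjection` that maps an aggregate id to a display name (not Axon API): once the events are deleted, the entry derived from them has to be removed explicitly, because a replay can no longer rebuild or correct it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical read model built from events; names are for illustration only.
final class PersonProjection {
    private final Map<String, String> displayNameById = new HashMap<>();

    // Called when an event carrying the display name is handled.
    void on(String aggregateId, String displayName) {
        displayNameById.put(aggregateId, displayName);
    }

    // Returns the display name, or null if the projection holds nothing for this id.
    String find(String aggregateId) {
        return displayNameById.get(aggregateId);
    }

    // Called as part of deleting the aggregate's events; returns true if data was removed.
    boolean forget(String aggregateId) {
        return displayNameById.remove(aggregateId) != null;
    }
}

final class ProjectionCleanupSketch {
    public static void main(String[] args) {
        PersonProjection projection = new PersonProjection();
        projection.on("person-1", "Jane Doe");
        projection.forget("person-1");
        System.out.println(projection.find("person-1"));
    }
}
```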

cheers,

Michiel