How to retire aggregates and archive events

turnerjasonuk · June 16, 2022, 3:44pm

I am looking at Axon because I would like to use event sourcing for some particular microservices in my system (we are migrating an older space-based architecture to an event-based microservice architecture). My challenge is that I have high, always increasing volumes (~1.5 bn entities a day) of fairly short-lived (1-10 days) entities, each of which receives only a few commands and thus few events. I need to process those 1.5 bn inside of maybe 4 hours (the load is actually spread through the day, but with significant spikes).
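To put that rate in perspective, here is a back-of-the-envelope sketch in Python. It assumes the worst case, where the whole daily volume lands inside the 4-hour window, and the figure of 3 events per entity is purely an illustrative assumption standing in for "only a few".

```python
# Rough peak-throughput estimate from the figures quoted above.
ENTITIES_PER_DAY = 1_500_000_000   # ~1.5 bn new entities per day (from the post)
PEAK_WINDOW_HOURS = 4              # "inside of maybe 4 hours" (from the post)
EVENTS_PER_ENTITY = 3              # ASSUMPTION: illustrative value for "a few events"

peak_window_seconds = PEAK_WINDOW_HOURS * 3600
entities_per_second = ENTITIES_PER_DAY / peak_window_seconds
events_per_second = entities_per_second * EVENTS_PER_ENTITY

print(f"new entities per second at peak: {entities_per_second:,.0f}")  # 104,167
print(f"events per second at peak:       {events_per_second:,.0f}")    # 312,500
```

So the sustained write rate to plan for is on the order of 100,000 new aggregates per second, before any headroom for spikes within the window.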

I have been looking at how to accomplish this on a few platforms (Axon, Akka, KStreams), trying to find a good fit that I can adopt.

To get adequate performance on reasonable resources, I believe I need to be able to ‘archive’ the events for ‘retired’ entities off to another, longer-term store (most likely as ‘history’ documents in a document database).

So my major challenge in looking at Axon is this archiving question. This seems to me (at least from my background in finance) to be a fairly typical requirement, so I am surprised to see no direct support for it in any of these platforms so far (hopefully I am wrong). If you receive a large number of new aggregates every day, you want good performance on reasonable resources, and the aggregates have a clear limit to their active life in the system, then you want to leverage that limit: keep the active event store lean and offload data that is old but still of interest for audit (i.e. read-only) to a separate store.
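The retire-and-archive idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not Axon, Akka, or KStreams API: `ActiveEventStore`, `HistoryArchive`, and `retire` are hypothetical names, both stores are in-memory stand-ins, and whether a real event store lets you remove an aggregate's stream at all is an assumption that would need checking per platform.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Event:
    aggregate_id: str
    sequence: int
    type: str
    payload: dict


class ActiveEventStore:
    """Hypothetical in-memory stand-in for the lean, active event store."""

    def __init__(self):
        self._streams = {}

    def append(self, event):
        self._streams.setdefault(event.aggregate_id, []).append(event)

    def read(self, aggregate_id):
        return list(self._streams.get(aggregate_id, []))

    def delete_stream(self, aggregate_id):
        # ASSUMPTION: the real store supports removing a retired stream.
        self._streams.pop(aggregate_id, None)


class HistoryArchive:
    """Hypothetical stand-in for a document DB: one document per aggregate."""

    def __init__(self):
        self.documents = {}

    def put(self, document):
        self.documents[document["_id"]] = document


def retire(aggregate_id, store, archive):
    """Copy an aggregate's full history to the archive, then drop it from the store."""
    events = sorted(store.read(aggregate_id), key=lambda e: e.sequence)
    if not events:
        return False
    archive.put({
        "_id": aggregate_id,
        "event_count": len(events),
        "events": [asdict(e) for e in events],
    })
    # Delete only after the archive write, so a failure in between leaves a
    # duplicate rather than a lost history.
    store.delete_stream(aggregate_id)
    return True


store, archive = ActiveEventStore(), HistoryArchive()
store.append(Event("trade-1", 0, "Created", {"qty": 10}))
store.append(Event("trade-1", 1, "Settled", {}))
store.append(Event("trade-2", 0, "Created", {"qty": 5}))
retire("trade-1", store, archive)
```

After the call, `trade-1` exists only as a single searchable history document, while `trade-2` stays in the active store. Writing one document per aggregate is a design choice that suits the "arbitrary search over history" use; it trades away the ability to replay the archived events through the event-sourcing platform.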

A typical deployment of our system will be (on premise) ~1.5 bn new aggregates per day, to be processed in around 1-10 days, and then retained for 10 years. I don’t see how that would be viable without such an approach.

Steven_van_Beelen · June 20, 2022, 1:33pm

You’re right, we’re not overly vocal on how to achieve this.
Maybe this stems from the intent of all these platforms to maintain a consistent throughput regardless of the store size? For Axon Server that’s at least one of the major benefits compared to “old-fashioned” relational databases, which degrade performance-wise after certain sizes are reached.

Assuming you are looking towards Axon Framework in combination with Axon Server, I can state there’s a means to configure a secondary storage layer for older data. The main intent for this is to save space on the more costly/quicker disks, not so much to keep up performance; as stated, Axon Server maintains consistent performance as events are added.

Axon Server achieves this by assigning a specific node role to an instance: the SECONDARY role.
The documentation for that can be found here.

Although you’re not talking about Axon Server directly here, given the sizes you’re talking about I do think it’s the sensible step to take. An Event Sourcing application with 1.5 bn new aggregates per day just doesn’t sound feasible to me with regular RDBMS instances. Unless you perhaps have a dedicated database team that enjoys optimizing the RDBMS to become an Event Store, that is.

Nonetheless, let me know what you think about the SECONDARY role support, @turnerjasonuk.

turnerjasonuk · June 23, 2022, 12:31pm

Thanks Steven - I will have a read.

turnerjasonuk · July 1, 2022, 11:15am

Thanks again Steven.

As I said, my use case has a few challenges. Although not directly relevant here, we deliver software both on-premise and in our SaaS offering. Deployments/customers will vary, but we would have a high rate of addition of new entities, e.g. 1.5 bn per day. These have a shortish lifecycle, maybe spanning 10 days, followed by a period of very occasional activity up to at most 200 days, with relatively low levels of events per day. However, there is then a need to retain a full history effectively indefinitely, although nominally for 10 years. Clearly we are able to rely on eventual consistency across services, hence we are able to decompose that way, but we do have strong ordering and guaranteed delivery constraints in and around individual services.

The main challenge around the long retention is that the means of interaction at that point is fairly arbitrary search, which means that a good solution for me looks like a service backed by a big document database. Axon is clearly great at coping with very active data where access is to particular aggregate entities. So it feels to me that secondary storage would be a great way to deal with my 1.5 bn x 200 days' worth, with primary maybe being focused on the first 10-day period. However, it still feels problematic to keep my whole data set across 10 years in Axon. It would be a very hard sell to customers to provision storage for that volume of data (a few trillion entities) given that the actual use of the data is provided by a totally separate store.
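The tier sizes behind that argument work out as follows. This is a rough Python sketch that assumes a flat 1.5 bn entities per day (the real volume is described as always increasing), 365-day years, and that the 200-day window includes the first 10 days.

```python
# Entity counts per storage tier, using the figures from this thread.
ENTITIES_PER_DAY = 1_500_000_000
PRIMARY_DAYS = 10        # very active period, kept on primary storage
ACTIVE_DAYS = 200        # occasional activity until then, primary + secondary
RETENTION_YEARS = 10     # nominal retention of the full history

primary = ENTITIES_PER_DAY * PRIMARY_DAYS
in_axon = ENTITIES_PER_DAY * ACTIVE_DAYS
retained = ENTITIES_PER_DAY * 365 * RETENTION_YEARS

print(f"primary tier (10 days):         {primary:,}")    # 15,000,000,000
print(f"primary + secondary (200 days): {in_axon:,}")    # 300,000,000,000
print(f"full retention (10 years):      {retained:,}")   # 5,475,000,000,000
print(f"share still active in Axon:     {in_axon / retained:.1%}")
```

Under these assumptions, only about one entity in eighteen would still need to live in the event store at any time; the remaining ~5 trillion are the ones that would move to the document database.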

So I feel I still need to look for a solution there, to enable me to use Axon well.