Axon Caching With Multiple Data Centers

mgithappagar · June 9, 2022, 12:23pm

Hi Team,

I work for a financial institution, and we have been using Axon Framework for the past 4 years for our Spring applications. We recently migrated to Red Hat OpenShift with two data centers, which lets us deploy our applications in both data centers and helps with resiliency. However, we are facing an issue with Axon caching: the cache is local to each data center, which results in incorrect state if multiple actions for the same aggregate are handled in different data centers at the same time.

Example: Consider DC1 and DC2 as our data centers. An external system sends actions against an aggregate, and those actions get routed to both data centers according to our load balancing and traffic management rules. If we receive 20 actions for the same aggregate at almost the same time from the external system, 10 of them may be routed to DC1 and the other 10 to DC2. DC1 and DC2 each have their own aggregate cache, which makes the state of the aggregate incorrect.

Can you please let us know how we can maintain a cache that is common to both data centers?

vab2048 · June 12, 2022, 9:38am

Are you using Axon Server for messaging?

Steven_van_Beelen · June 13, 2022, 7:10am

Key in this scenario is using a distributed CommandBus, @mgithappagar.
Essentially, that’s what @vab2048 hints toward with his question.

Axon Server is a distributed message bus and event store. As such, it will ensure your commands are routed in a distributed environment.
The benefit of this is that a command message is routed consistently between your applications.

More specifically, Axon Framework by default does this based on a field annotated with @RoutingKey. If you investigate the implementation of the @TargetAggregateIdentifier, you will notice it is meta-annotated with @RoutingKey. Due to this, all commands targeted to a given aggregate X are consistently routed to the same node in your environment.
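
To illustrate the idea of consistent routing on a routing key, here is a minimal sketch. It is not Axon's implementation: the class name, node names, and the plain modulo hash are assumptions made for illustration, and a real distributed command bus would typically use a consistent hash so that membership changes move as few keys as possible.

```java
import java.util.List;

// Hypothetical sketch of routing-key based routing; not Axon Framework code.
final class RoutingKeySketch {

    /** Picks a node for a routing key; the same key always maps to the same node. */
    static String nodeFor(String routingKey, List<String> nodes) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(routingKey.hashCode(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        // Assumed node names, one per data center.
        List<String> nodes = List.of("dc1-node", "dc2-node");
        // Every command carrying the same aggregate identifier lands on the same node.
        for (int i = 0; i < 3; i++) {
            System.out.println("aggregate-X -> " + nodeFor("aggregate-X", nodes));
        }
    }
}
```

The important property is that the routing decision depends only on the routing key and the set of known nodes, so every instance has to see the same set of nodes to arrive at the same answer.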

This is done for roughly two reasons: it’s in line with the conceptual requirement of commands and it provides technical benefits. One of these benefits is that you will not get a cache miss, as the cache for a given aggregate only ever resides on one of your application instances.
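
The cache benefit can be sketched with a small simulation of the 20-actions scenario from the question. All names here (`CacheLocalitySketch`, `Node`, `route`, `run`) are hypothetical, the "cache" is just a map counting handled actions, and the modulo hash stands in for whatever routing a real distributed command bus applies.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation; not Axon Framework code.
final class CacheLocalitySketch {

    /** A node with its own local cache: aggregate id -> number of actions applied. */
    static final class Node {
        final String name;
        final Map<String, Integer> cache = new HashMap<>();

        Node(String name) {
            this.name = name;
        }

        void handle(String aggregateId) {
            cache.merge(aggregateId, 1, Integer::sum);
        }
    }

    /** Routes on the aggregate identifier, so one aggregate always hits one node. */
    static Node route(String aggregateId, List<Node> nodes) {
        return nodes.get(Math.floorMod(aggregateId.hashCode(), nodes.size()));
    }

    /** Sends the given number of actions for one aggregate through the router. */
    static List<Node> run(String aggregateId, int actions) {
        List<Node> nodes = List.of(new Node("DC1"), new Node("DC2"));
        for (int i = 0; i < actions; i++) {
            route(aggregateId, nodes).handle(aggregateId);
        }
        return nodes;
    }

    public static void main(String[] args) {
        for (Node node : run("aggregate-X", 20)) {
            System.out.println(node.name + " cache: " + node.cache);
        }
    }
}
```

With routing keyed on the aggregate identifier, only one node's cache ever holds the aggregate, so there is no second copy that could drift out of date.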

Maybe it’s interesting to read the Distributed Command Bus section of the reference guide, @mgithappagar. As you can read at the end, Axon Server’s not the only distributed option out there. We also have the Spring Cloud Discovery Extension and JGroups Extension that provide a similar benefit.

mgithappagar · June 13, 2022, 1:20pm

Thank you @Steven_van_Beelen and @vab2048 for looking into it and helping with the details. Yes, we used to have JGroups with our earlier versions. Last year our internal platform engineering team changed it to service discovery to help with command routing. The issue we have is that our service discovery apps are also deployed in two different data centers, and they are not in sync either. We will definitely go through the documents you shared and reach out again if any further help is needed.