Clustering an Axon-based microservice


We have developed a microservice backend using Axon. Polymer, Angular and plain Java clients access the service. For availability we are going to cluster it as 4 nodes behind a load balancer. Because there are concurrency requirements (the same aggregate root, AR, can be modified by different clients at the same time), we will need to configure Axon with JGroups to coordinate commands. So far so good, all known territory.

However, certain commands should send a meaningful response to the client. For example, when an AR is updated we send back a list of ‘valid’ items for that AR. Until now, this response content was derived in the @EventHandler methods and stored in a ThreadLocal, so that the RestController could pick it up and return it to the client. With JGroups this is obviously not going to work anymore, because the command may be handled on a different node than the one that received the request. So I am wondering how other people are handling this. A few options are:

  • make the @CommandHandler derive and return this information. It seems not the most natural way, and currently the AR does not even contain all data required for this.

  • use SSE and try to connect the client to the node that runs the event handler. This seems to complicate things unnecessarily IMO.

  • derive the return payload in the restcontroller, after the command dispatching and event handling is complete.

Any other suggestions on how people handle this?


Hi Jorg,

You have basically identified your options.
It seems like your API design isn’t very suitable for asynchronous applications. So you might want to reconsider that a bit.

Personally, I have implemented a few cases of option 2. Although it seems like a lot of work, it is actually a very clean API where information is sent the moment it becomes available in a view model. It fits the async nature of Event Driven Microservices very well.

Option 3 is also a good one, if you can somehow guarantee that the model is updated. You could check the sequence number of events (i.e. aggregate version) that were included in your view model and validate the staleness of your view model based on that. If your reply contains data which is “too old”, query again. Just make sure your database doesn’t use the “repeatable read” isolation level, because a re-query inside the same transaction would then keep returning the same stale data.
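
A minimal sketch of that staleness check in plain Java, not tied to any Axon API. `VersionedView`, `FreshViewReader` and `awaitVersion` are hypothetical names, and it assumes the controller somehow knows the aggregate version it expects to see after the command (`expectedVersion`) and that the view model stores the version of the last event applied to it.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical read-side row: the payload plus the aggregate version it reflects. */
final class VersionedView<T> {
    final long version;
    final T payload;

    VersionedView(long version, T payload) {
        this.version = version;
        this.payload = payload;
    }
}

final class FreshViewReader {
    /**
     * Re-runs the query until the view reflects at least expectedVersion,
     * or gives up when the timeout expires (returns Optional.empty()).
     * Each call to query.get() is assumed to run in its own transaction.
     */
    static <T> Optional<VersionedView<T>> awaitVersion(
            Supplier<VersionedView<T>> query,
            long expectedVersion,
            Duration timeout,
            Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            VersionedView<T> view = query.get();
            if (view != null && view.version >= expectedVersion) {
                return Optional.of(view); // fresh enough
            }
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty(); // still stale, caller decides what to do
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Optional.empty();
            }
        }
    }
}
```

The controller would dispatch the command, then call `awaitVersion` with a query against the view model and return the payload, falling back to an error or a "try again" response when the result is empty.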



Thanks for your thoughts Allard.

I see some issues with option 2 though. The client would need to open an SSE connection to the machine that the command is dispatched to, so it would have to use the same consistent hashing algorithm that Axon uses to determine that node. That seems slightly fragile. Alternatively you would need to broadcast the data to all nodes, and the one that happens to hold the relevant SSE connection to the client sends the data over the wire. Seems hmmmm.
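
To make the broadcast variant concrete, here is a rough sketch of what each node would run. `LocalSseRegistry` is a hypothetical name, the `Consumer<String>` stands in for whatever object actually writes to the open SSE connection, and how the payload reaches every node (JGroups, a message broker, ...) is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Per-node registry of the SSE connections that are open on this node. */
final class LocalSseRegistry {
    private final Map<String, Consumer<String>> connections = new ConcurrentHashMap<>();

    /** Called when a client opens its SSE connection on this node. */
    void register(String clientId, Consumer<String> connection) {
        connections.put(clientId, connection);
    }

    /** Called when the client disconnects. */
    void unregister(String clientId) {
        connections.remove(clientId);
    }

    /**
     * Called on every node for every broadcast payload.
     * Only the node that holds the client's connection sends anything;
     * all other nodes drop the payload and return false.
     */
    boolean deliver(String clientId, String payload) {
        Consumer<String> connection = connections.get(clientId);
        if (connection == null) {
            return false;
        }
        connection.accept(payload);
        return true;
    }
}
```

The cost is that every node sees every payload, which is the part that feels wasteful with 4 nodes and many clients.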


Hi Jorg,

Can you briefly describe option 2? I have no idea what ‘sse’ means.


Hi Benoit,

SSE = server-sent events. Think of it as a persistent connection that, once initiated by the browser, the server can use to push data back to the client. It’s a bit like WebSockets but easier to get going because it is plain HTTP (WebSockets start with an HTTP handshake and then switch to their own protocol). Unlike WebSockets it only goes one way, from server to client. When the server side is doing a lot of async processing it’s a nice way to send something back to the client outside of the traditional request-response cycle.
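
For illustration, this is roughly what the server writes on such a connection (served with content type `text/event-stream`): an optional `event:` line, one `data:` line per line of payload, and a blank line to end the event. The `SseFrames` helper and the `valid-items` event name in the note below are made up for the example.

```java
/** Formats one event in the text/event-stream wire format. */
final class SseFrames {
    static String format(String eventName, String data) {
        StringBuilder sb = new StringBuilder();
        if (eventName != null && !eventName.isEmpty()) {
            sb.append("event: ").append(eventName).append('\n');
        }
        // A payload with several lines becomes several "data:" lines.
        for (String line : data.split("\r\n|\r|\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        // The blank line tells the browser the event is complete.
        return sb.append('\n').toString();
    }
}
```

Calling `SseFrames.format("valid-items", "[1,2,3]")` gives the text `event: valid-items`, `data: [1,2,3]` and an empty line, which the browser's `EventSource` hands to a listener registered for `valid-items`.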