How exactly does Axon ensure that EventHandlers in distributed microservices receive a published event using Axon Server?

uhm_dunno · June 14, 2019, 9:52am

Let’s say I have a saga in microservice 1 that has a SagaEventHandler for eventXXX.

EventXXX gets produced by microservice 2 via event sourcing. It will be scheduled for publication on the event bus and then all listeners, in this case the saga in microservice 1, will be notified.

But what if microservice 1 is down when microservice 2 publishes the event? It is no longer registered/subscribed in Axon Server, and after some experimenting I noticed that it does not receive the event after it comes back up. This means the saga will never finish and the data that was event sourced will be inconsistent.

What am I missing here?

Polish_Civil · June 14, 2019, 9:26pm

Hello,

I don’t know what is happening in your system, but by all means it should just work.
Every event published by an aggregate gets stored in the event store, unless you use an in-memory embedded event bus, so you might have a configuration conflict.

In my distributed system everything works just fine. Sagas have tracking processors for the events they need to process. Each time a processor handles an event, it saves its position in the event stream, so whatever happens to the microservice, any event it missed will be processed once it comes back up.
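To make the tracking idea concrete, here is a minimal sketch in plain Java. It is a simplified model, not the Axon API: `TrackingSketch`, `TokenStore` and `poll` are hypothetical names, and the "token" is just an index into a list that stands in for the event store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a tracking processor; hypothetical names, NOT the Axon API.
class TrackingSketch {

    /** Stand-in for a durable token store: processor name -> index of the next event to read. */
    static final class TokenStore {
        private final Map<String, Integer> tokens = new HashMap<>();

        int fetch(String processor) {
            return tokens.getOrDefault(processor, 0);
        }

        void store(String processor, int position) {
            tokens.put(processor, position);
        }
    }

    /** Handles every event at or after the stored position and advances the token after each one. */
    static List<String> poll(String processor, List<String> eventStore, TokenStore tokenStore) {
        List<String> handled = new ArrayList<>();
        int position = tokenStore.fetch(processor);
        while (position < eventStore.size()) {
            handled.add(eventStore.get(position));
            position++;
            tokenStore.store(processor, position);
        }
        return handled;
    }

    public static void main(String[] args) {
        List<String> eventStore = new ArrayList<>();
        TokenStore tokenStore = new TokenStore();

        eventStore.add("eventXXX");
        System.out.println(poll("saga", eventStore, tokenStore));

        // The processor is "down" while another service appends eventYYY.
        eventStore.add("eventYYY");

        // After the restart, the stored token makes the processor resume at eventYYY.
        System.out.println(poll("saga", eventStore, tokenStore));
    }
}
```

The point of the sketch is that delivery does not depend on the consumer being subscribed at publish time: the event sits in the store, and the consumer pulls from its last saved position.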

uhm_dunno · June 15, 2019, 7:22am

Hi Robert,

yes, it is true, the saga remembers which events it already processed. But here is my issue:

The saga in microservice 1 reacts to eventXXX and issues a command to microservice 2. Then it goes down. Microservice 2 produces an eventYYY and saves it to the event store. Now whenever I restart microservice 1, which contains the saga, it replays from eventXXX and sends the command again (even though that command was already processed and already caused an aggregate to be event sourced), instead of continuing with the eventYYY that was published after it went down. This creates a second eventYYY, while the first one now causes inconsistent data in the read models, because it also event sourced an aggregate.

Polish_Civil · June 15, 2019, 7:47am

Well, it should not replay from eventXXX, since it was tracked that the saga has already processed it.

However, what you have here is more of a design issue, which I stumbled upon recently: https://groups.google.com/forum/#!msg/axonframework/CifsMbJzYk8/0leRHAJxCAAJ

My solution was to store all the state that drives command dispatching in an aggregate, meaning that if I’m communicating with an external microservice, I try to store the state of that communication in some other aggregate.
This fits most of my aggregates quite well. When the situation is trivial, for example when I just send a few commands and want to ensure that I send them only once, I put a boolean flag on the aggregate class itself, so that when the saga is reloaded I can use it to decide whether I should send the command or not.
I quite like this approach, though, because I would rather not use any external storage just for sagas. My system is event sourced, so I tend to use aggregate state to drive my sagas instead of persisting saga state. This way I rely purely on the events in my event store, and the only thing my sagas do is dispatch the proper commands when they receive an event.

Polish_Civil · June 15, 2019, 8:00am

I just realized that the aggregate thing is a bad tip, here’s why: https://stackoverflow.com/questions/33429626/eventsourced-saga-implementation

So I’m guessing that the state of a saga is the only proper way to go.

Polish_Civil · June 15, 2019, 8:08am

I’m really sorry for the spam, but I’d like to point out another thing.

Consider two events on two services, S1: Started and S2: Created, and a saga like this:

Start -> S1: Started

sendCreateCommandToS2
End -> S2

If we do it like this, and suppose the S1: Started event is replayed, then even if we persist the fact that we have sent the command, the persisted saga state will, if I’m not mistaken, be pruned when the saga ends. On each replay we would therefore be sending the command again and again.
The create command in this scenario is just an init command for some aggregate in an external service. If we do not make the aggregate identifier idempotent, based on some value of the S1: Started event, we would end up creating a bunch of aggregates on S2. (This is pretty much exactly why I use an aggregate to hold the state for my saga things: it creates additional commands and events, but I ensure that the id is idempotent by generating the GUID only once.)
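One way to get a repeatable identifier without storing a generated GUID is to derive it from data in the triggering event. The sketch below uses the JDK's name-based UUIDs. It is only an illustration: `DeterministicIds`, `targetAggregateId`, and the example values `started-42` and `create-on-S2` are made-up names, and it assumes the S1: Started event carries some stable identifier.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper; not part of Axon.
class DeterministicIds {

    /**
     * Derives a name-based (version 3) UUID from the source event's identifier,
     * so replaying the same event always yields the same target aggregate id.
     */
    static UUID targetAggregateId(String sourceEventId, String purpose) {
        String name = purpose + ":" + sourceEventId;
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID first = targetAggregateId("started-42", "create-on-S2");
        UUID replayed = targetAggregateId("started-42", "create-on-S2");
        // Both calls produce the same id, so a replayed create command targets the same aggregate.
        System.out.println(first.equals(replayed));
    }
}
```

With an id like this, a create command sent twice addresses the same aggregate both times, so the receiving side can reject or ignore the duplicate instead of creating a new aggregate.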

So I also have a follow-up question, more of a DDD and design one: how do we ensure that our saga behavior is idempotent? This is so tricky.

uhm_dunno · June 15, 2019, 8:46am

Ok, so I found the mistake here… I’m an idiot… I was using an H2 database and forgot to configure it to persist to a file, which caused the token for the saga to be deleted/reset with every start of the JVM.
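For anyone hitting the same thing: H2 JDBC URLs that start with `jdbc:h2:mem:` select the in-memory mode, while `jdbc:h2:file:` URLs persist to disk. A small startup guard could catch this early. This is just a sketch: `H2UrlCheck`, `isInMemoryH2`, and the database names and paths in the example URLs are made up for illustration.

```java
// Hypothetical startup guard; the example URLs use made-up database names and paths.
class H2UrlCheck {

    /** True when the JDBC URL selects H2's in-memory mode, whose data does not survive a JVM restart. */
    static boolean isInMemoryH2(String jdbcUrl) {
        return jdbcUrl != null && jdbcUrl.startsWith("jdbc:h2:mem:");
    }

    public static void main(String[] args) {
        String inMemory = "jdbc:h2:mem:axon";           // content is gone after the JVM exits
        String fileBacked = "jdbc:h2:file:./data/axon"; // content is written to disk

        System.out.println(isInMemoryH2(inMemory));
        System.out.println(isInMemoryH2(fileBacked));
    }
}
```

If the tokens live in an in-memory database, every restart looks like a first start to the processor, which matches the replay from eventXXX described earlier in the thread.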