Questions re: CQRS Performance Tuning

I was parsing through this blog post by Frans and have two questions:

When using batched processing, you need to be aware of the following when using a non-transactional read model: when processing fails mid-batch, the tracking token won’t be updated and all the events in the batch will be processed again in the next attempt, including the ones that were already processed. It’s advisable to make all projection methods idempotent to deal with this correctly.

Is this second attempt considered a replay? Is it safe to use handlers marked with @DisallowReplay along with batching or will they be adversely affected by multiple attempts?

The downside of these optimizations is that these are not just configuration changes. This requires specific coding in our event handlers and is domain-dependent. The good thing here is that Axon does offer the APIs needed to implement this cleanly. For each batch, there will be a single UnitOfWork. This object can be injected into our event handler methods, by simply adding it as a parameter. It has a ConcurrentHashMap-style ‘getOrComputeResource’ method that allows us to attach other resources to the UnitOfWork. Using this approach, we can do processing at the end of the batch to implement the above-mentioned optimizations.

Does sample code exist for this concept? This seems like it could drastically reduce the number of database calls, which is a major bottleneck in event processing.

Thanks,
Joel

Hi @feijoel,

I’m confident I can give some insight into your questions here, so let’s get to it.

The second attempt Frans is referring to is not necessarily a replay. If handling a batch fails with an exception, the following steps occur:

  1. The transaction is rolled back
  2. The claim on the token is released
  3. The thread will enter an incremental back-off

Now, if the failing thread is the only thread for the TrackingEventProcessor (TEP), it will retry as part of step 3, after the given back-off. If, on the other hand, there are idle threads for the TEP, one of them will take over the released claim. Either of these paths is regarded as “the second attempt”.
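Since either path re-handles the whole batch from its start, here is a minimal sketch of why the idempotency advice from the blog matters for a non-transactional read model. It is a plain-Java simulation, not Axon code: `OrderPlaced`, `OrderTotals`, and `processWithRetry` are hypothetical names made up for illustration, and tracking applied event identifiers in a set is just one possible way to make a projection idempotent.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java simulation; none of these types are Axon classes.
final class BatchRetryDemo {

    // Hypothetical event carrying a unique identifier.
    static final class OrderPlaced {
        final String eventId;
        final String orderId;
        final int amount;

        OrderPlaced(String eventId, String orderId, int amount) {
            this.eventId = eventId;
            this.orderId = orderId;
            this.amount = amount;
        }
    }

    // Non-transactional read model: writes from a failed attempt are NOT rolled back.
    static final class OrderTotals {
        final Map<String, Integer> totals = new HashMap<>();
        private final Set<String> appliedEventIds = new HashSet<>();

        void on(OrderPlaced event) {
            // Idempotency guard: skip events that were already applied.
            if (!appliedEventIds.add(event.eventId)) {
                return;
            }
            totals.merge(event.orderId, event.amount, Integer::sum);
        }
    }

    static List<OrderPlaced> sampleBatch() {
        return Arrays.asList(
                new OrderPlaced("e1", "order-1", 10),
                new OrderPlaced("e2", "order-1", 5),
                new OrderPlaced("e3", "order-2", 7));
    }

    // Handles the batch, failing once at the given index; returns the number of attempts.
    static int processWithRetry(List<OrderPlaced> batch, OrderTotals model, int failOnceAtIndex) {
        boolean failedOnce = false;
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                for (int i = 0; i < batch.size(); i++) {
                    if (!failedOnce && i == failOnceAtIndex) {
                        failedOnce = true;
                        throw new IllegalStateException("simulated failure");
                    }
                    model.on(batch.get(i));
                }
                return attempts; // only now would the tracking token advance
            } catch (IllegalStateException e) {
                // Token not advanced: the next attempt starts at the beginning of the batch.
            }
        }
    }

    public static void main(String[] args) {
        OrderTotals model = new OrderTotals();
        int attempts = processWithRetry(sampleBatch(), model, 2);
        System.out.println("attempts=" + attempts
                + " order-1=" + model.totals.get("order-1")
                + " order-2=" + model.totals.get("order-2"));
        // prints: attempts=2 order-1=15 order-2=7
    }
}
```

Without the guard in `on`, the first two events would be applied twice and the total for order-1 would come out as 30 instead of 15.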

Secondly, I had to search for the sample code for this blog. There is indeed a repository on Frans’ personal GitHub, although it’s not overly complete with samples if you ask me. I believe the projector he built to test this is the ProjectorNoop class you can find here.

So, sadly, I feel the answer is a little vague here. Yes, he’s doing stuff with the UnitOfWork, but I think the sample can be enhanced. As it stands, we have actually been thinking of updating the blog you’re referencing to include a more thorough sample. However, I haven’t got a clue in what time frame that will happen. Thus you can either stay tuned or try to proceed with what Frans has drafted up there.
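In the meantime, here is a rough sketch of the idea as the blog describes it: buffer changes in a resource attached to the batch’s unit of work and write them out once at the end of the batch. This is not Frans’ code and not Axon code. `FakeUnitOfWork` is a hypothetical stand-in whose `getOrComputeResource` mimics the method named in the blog, its `onPrepareCommit` and `commit` hooks are assumptions about how an end-of-batch callback could be wired, and `FakeStore` is an invented read-model store that counts write calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java simulation of a batch-scoped buffer; not Axon's UnitOfWork.
final class BatchScopedBufferDemo {

    // Hypothetical stand-in for the single unit of work that spans one batch.
    static final class FakeUnitOfWork {
        private final Map<String, Object> resources = new HashMap<>();
        private final List<Runnable> prepareCommitHandlers = new ArrayList<>();

        @SuppressWarnings("unchecked")
        <R> R getOrComputeResource(String key, Function<String, R> factory) {
            return (R) resources.computeIfAbsent(key, factory);
        }

        void onPrepareCommit(Runnable handler) {
            prepareCommitHandlers.add(handler);
        }

        // Called once, after the last event of the batch has been handled.
        void commit() {
            prepareCommitHandlers.forEach(Runnable::run);
        }
    }

    // Hypothetical read-model store that counts how often it is written to.
    static final class FakeStore {
        final Map<String, Integer> rows = new HashMap<>();
        int writeCalls = 0;

        void saveAll(Map<String, Integer> deltas) {
            writeCalls++;
            deltas.forEach((key, delta) -> rows.merge(key, delta, Integer::sum));
        }
    }

    static final class Projector {
        private final FakeStore store;

        Projector(FakeStore store) {
            this.store = store;
        }

        // Event handler: the unit of work arrives as an extra parameter.
        void on(String orderId, int amount, FakeUnitOfWork unitOfWork) {
            Map<String, Integer> pending = unitOfWork.getOrComputeResource("pendingTotals", key -> {
                Map<String, Integer> buffer = new HashMap<>();
                // Registered only once per batch, when the buffer is first created.
                unitOfWork.onPrepareCommit(() -> store.saveAll(buffer));
                return buffer;
            });
            pending.merge(orderId, amount, Integer::sum);
        }
    }

    static FakeStore run() {
        FakeStore store = new FakeStore();
        Projector projector = new Projector(store);
        FakeUnitOfWork unitOfWork = new FakeUnitOfWork();
        projector.on("order-1", 10, unitOfWork);
        projector.on("order-1", 5, unitOfWork);
        projector.on("order-2", 7, unitOfWork);
        unitOfWork.commit();
        return store;
    }

    public static void main(String[] args) {
        FakeStore store = run();
        System.out.println("writeCalls=" + store.writeCalls); // prints: writeCalls=1
    }
}
```

Three events result in a single write call here, which is the kind of reduction in database calls the blog is hinting at. Which changes can safely be merged like this depends on the domain.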

Hoping this will help you further @feijoel!

Cheers,
Steven

@Steven_van_Beelen thanks for replying. If I understand you correctly, it is not safe to place a handler marked @DisallowReplay in a batched processor, since it could be executed multiple times if any event in the batch fails.

Hi @feijoel,

Indeed, @DisallowReplay doesn’t react to such a reattempt.
The reset/replay-specific annotations only work if the TrackingToken is actually of type ReplayToken.
So, only when you’ve consciously invoked the TrackingEventProcessor#resetTokens would a @DisallowReplay annotation be taken into account, since that’s when Axon adjusts a TrackingToken to type ReplayToken.

Thus, if you have an Event Handling Component that both updates a Query Model and sends emails when handling an event, the second attempt would send the emails again. This is one of the reasons why it’s better to segregate the concerns of event handling operations into distinct components, as the non-functional requirements for handling failures differ between query models and email-sending handlers.
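To make the difference between a retry and a replay concrete, here is a small plain-Java simulation. It does not use Axon: `PlainToken` and `ReplayMarkerToken` are hypothetical stand-ins for a regular TrackingToken and a ReplayToken, and the `instanceof` check is only an assumed simplification of how a replay-aware handler gets skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation; none of these types are Axon classes.
final class RetryVersusReplayDemo {

    interface Token {}

    // Stand-in for a regular tracking token (also what a retry sees).
    static final class PlainToken implements Token {}

    // Stand-in for the token type used after an explicit reset.
    static final class ReplayMarkerToken implements Token {}

    // One component doing both jobs, as in the email example above.
    static final class MixedHandler {
        final List<String> projectionUpdates = new ArrayList<>();
        final List<String> sentEmails = new ArrayList<>();

        void handle(String event, Token token) {
            projectionUpdates.add(event);
            // The "disallow replay" part is skipped only for the replay token type.
            boolean replaying = token instanceof ReplayMarkerToken;
            if (!replaying) {
                sentEmails.add(event);
            }
        }
    }

    // Handles the same event 'attempts' times with the given token; returns emails sent.
    static int emailsSent(Token token, int attempts) {
        MixedHandler handler = new MixedHandler();
        for (int i = 0; i < attempts; i++) {
            handler.handle("OrderPlaced#1", token);
        }
        return handler.sentEmails.size();
    }

    public static void main(String[] args) {
        // A failed batch that is retried still carries a plain token.
        System.out.println("retry:  " + emailsSent(new PlainToken(), 2));        // prints: retry:  2
        // After an explicit reset the token type changes, so the email part is skipped.
        System.out.println("replay: " + emailsSent(new ReplayMarkerToken(), 2)); // prints: replay: 0
    }
}
```

Moving the email part into its own component, with its own processor and failure handling, keeps a failure in the projection from triggering the emails a second time.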

Hope this clarifies things further for you @feijoel!

Cheers,
Steven