Tracking event processor batch optimization

Thom · October 1, 2024, 9:13am

Hi,

I’m trying to optimize the performance of our event handlers, which index documents in Elasticsearch through tracking event processors.

To do that, I’m trying to use the batching capabilities offered by Axon: I build my objects in memory within the context of one batch and call the bulk index operation in Elasticsearch before the unit of work commits.

However, I have a small issue: sometimes I get conflicts when calling the bulk index because other threads have modified part of the data I’m trying to index.
I can then rethrow the exception from the event handler so that the batch is retried. However, I would like to retry only the failed documents, not the documents that were indexed properly.
I currently cannot find a way to do that. Could someone offer advice?

Thanks!

Thomas

Steven_van_Beelen · October 16, 2024, 10:17am

First and foremost, as I believe this to be your first post here, welcome to the forum, @Thom!

Now, concerning your request. What I think you are asking is how to adjust the TrackingEventProcessor to selectively replay events from a failed bulk update to Elasticsearch.

Neither the TrackingEventProcessor nor the PooledStreamingEventProcessor provides a means to filter out, during a retry, events that you know you don’t need for the update of a model. Furthermore, I am not sure how to construct this support for Axon Framework, as out of the box it has no way of knowing which events in the batch succeeded. It only knows whether the batch failed.

But perhaps the exception that’s thrown while updating the documents clearly states which events did successfully update a document, so that you could filter them out? If that’s the case, you should be able to add an @ExceptionHandler annotated method to the Projector (read: the class handling the events and updating your Elasticsearch instance). This exception handler can set some state in the Projector to clarify which events should be ignored in the upcoming batch.

For this to work, there are two important pointers to take into consideration:

1. Does the exception provide you the necessary details to differentiate between failed and successful document updates?
2. If so, be sure to use the filter ONLY for the upcoming batch of events, as the upcoming batch is the replay. Thus, clear out this store/cache/map/whatever of information once the batch of events was successfully committed.
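
To make the two pointers above a bit more concrete, here is a minimal, framework-free sketch of the state such a Projector could keep. Everything in it is hypothetical: PartialBulkFailureException and RetryFilter are names made up for illustration, not Axon Framework or Elasticsearch classes, and how you map a bulk response back to event identifiers depends on your application. It also assumes a single thread processes the batch; with several segments you would likely want one filter per segment.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical exception: assumes you can work out, from the bulk response,
// which events already resulted in a successfully indexed document.
class PartialBulkFailureException extends RuntimeException {
    private final Set<String> succeededEventIds;

    PartialBulkFailureException(Collection<String> succeededEventIds) {
        super("Bulk index partially failed");
        this.succeededEventIds = new HashSet<>(succeededEventIds);
    }

    Set<String> succeededEventIds() {
        return succeededEventIds;
    }
}

// Hypothetical helper holding the "ignore these on replay" state of the Projector.
class RetryFilter {
    private final Set<String> alreadyIndexed = new HashSet<>();

    // Would be invoked from the @ExceptionHandler annotated method in the Projector.
    void onBulkFailure(PartialBulkFailureException e) {
        alreadyIndexed.addAll(e.succeededEventIds());
    }

    // Would be checked at the start of each event handler during the replayed batch.
    boolean shouldSkip(String eventId) {
        return alreadyIndexed.contains(eventId);
    }

    // Would be invoked once the replayed batch was committed successfully,
    // so the filter only ever applies to that one batch.
    void onBatchCommitted() {
        alreadyIndexed.clear();
    }
}
```

In the Projector, the event handlers would return early when shouldSkip(...) is true for the incoming event, and onBatchCommitted() would be wired to whatever hook you already use to detect a successful commit of the batch.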

Let me know if this helps you out, @Thom!