Batching the EventStore write operations and Event Handlers

Oleg_Iavorskyi · March 30, 2015, 9:41pm

Hi,

During performance testing of our Axon-based application, we found that the more events are generated, the lower the end-to-end throughput becomes. We use Event Sourcing with the JPA-based EventStore, and it looks like the execution time of INSERT statements increases over time. Would it be possible to batch the writes of events from multiple aggregates? Or are there any best practices for optimizing IO performance?

I have a similar question about Event Handlers. If a specific event handler performs IO, would it be possible to batch multiple events so that they are handled together by that handler?

Thank you.

Oleg.

Allard · March 31, 2015, 12:49pm

Hi Oleg,

Insert statements require the database to update both the indexes and the tables themselves. As an index grows, the database needs to traverse more pages to get to the relevant positions. How big is your event store? And how big is the decrease? There is probably a lot of tweaking you can do at the database level to improve performance here.
Another solution would be to move “old” events to another storage, perhaps with fewer indexes.

You can batch the work of Event Handlers by using an Asynchronous Cluster. The normal clusters process events in the thread dispatching them.

Hope this helps.
Cheers,

Allard

Oleg_Iavorskyi · March 31, 2015, 6:09pm

Allard,

Thank you for the prompt response. We have collected additional statistics for the database. The most important observation is that performance degrades over time even if the size of the dataset does not change much. For example, at the beginning of the test we have about 200k events in the store and 50k saga instances in the saga repository. After 15-20 minutes of load, the storage grows to about 260k events and 60k saga instances, with a 2-4x decrease in performance. If we restart the instance without cleaning anything from the store, performance returns to the original level for another 15-20 minutes before dropping again.

Closer to the end of the test we start receiving slow-query reports for INSERT into DomainEventEntry and SELECT from SagaEntry. However, according to AWR statistics, the actual elapsed time for both queries is close to 60ms on average. The top wait events according to AWR are:

The row lock contention is for queries related to the Quartz scheduler, which we use in Sagas to schedule events. I don’t think there is much we can do about it, as coordinating the scheduling between multiple instances is part of the Quartz design.

The “log file sync” issue is caused by lots of frequent COMMITs, and there are two ways to address it: either optimize redo log write performance or issue fewer COMMITs. The latter option was the reason I asked about JDBC batching support in Axon. By design, our application has a lot of simple aggregates with relatively short lifecycles. As a result, many domain events are generated frequently, and since each of them is persisted right away, we see a lot of commits.
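The "fewer COMMITs" idea can be sketched in plain Java. This is only an illustration under stated assumptions, not something Axon provides: `GroupCommitBuffer` and its `flushAction` callback are hypothetical names, and the callback stands in for whatever would write a group of events in one transaction (with JDBC, typically `PreparedStatement.addBatch()` per event, then `executeBatch()` and a single `Connection.commit()`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Axon): buffers events from any number of
 * aggregates and hands them to a flush action in groups, so that one
 * transaction/commit covers many events instead of one.
 */
final class GroupCommitBuffer<E> {
    private final int maxBatchSize;
    private final Consumer<List<E>> flushAction; // assumed to write + commit one group
    private final List<E> pending = new ArrayList<>();
    private int flushCount = 0;

    GroupCommitBuffer(int maxBatchSize, Consumer<List<E>> flushAction) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        this.maxBatchSize = maxBatchSize;
        this.flushAction = flushAction;
    }

    /** Adds an event; flushes automatically once the buffer is full. */
    synchronized void append(E event) {
        pending.add(event);
        if (pending.size() >= maxBatchSize) {
            flush();
        }
    }

    /** Writes all pending events as one group; does nothing if the buffer is empty. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flushAction.accept(new ArrayList<>(pending)); // copy so the callback owns its list
        pending.clear();
        flushCount++;
    }

    /** Number of groups written so far, i.e. the number of commits in this sketch. */
    synchronized int flushCount() {
        return flushCount;
    }
}
```

The trade-off is that buffered events are not durable until the next flush, so a command could not be reported as persisted before then, and any conflict detection done by the store at insert time would be delayed as well. A real implementation would probably also flush on a timer so that a partially filled buffer does not wait indefinitely.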

Nevertheless, it does not explain why performance recovers after an application restart even though the same set of data is present in the store.

Regarding my last question, I think I was not really clear. My goal is to collect multiple events of the same type, but from different aggregates, over some kind of sliding window (or just a simple buffer) and then use one thread to process all of them in a batch. Right now Axon calls each handler with one event at a time, but I wonder if there is a way to specify that I want the handler to receive a list of all events that have not yet been processed, similar to how Disruptor uses the endOfBatch flag.
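To make the request concrete, here is a minimal sketch of the behaviour described above, written with plain `java.util.concurrent` rather than any Axon API. `BatchingEventQueue`, `BatchHandler` and `processNextBatch` are hypothetical names chosen for illustration; the `endOfBatch` parameter mimics the flag Disruptor passes to its handlers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Hypothetical sketch (not an Axon class): publishers enqueue events, and a
 * single consumer thread drains whatever has accumulated, up to a maximum
 * batch size, telling the handler which event is the last one of the batch.
 */
final class BatchingEventQueue<E> {

    /** Handler contract modelled after Disruptor's endOfBatch flag. */
    interface BatchHandler<T> {
        void onEvent(T event, boolean endOfBatch);
    }

    private final BlockingQueue<E> queue = new LinkedBlockingQueue<>();
    private final int maxBatchSize;
    private final BatchHandler<E> handler;

    BatchingEventQueue(int maxBatchSize, BatchHandler<E> handler) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        this.maxBatchSize = maxBatchSize;
        this.handler = handler;
    }

    /** Called by any publishing thread. */
    void publish(E event) {
        queue.add(event);
    }

    /**
     * Drains up to maxBatchSize queued events without blocking and passes them
     * to the handler in order. Meant to be called from one consumer thread.
     * Returns the number of events handled.
     */
    int processNextBatch() {
        List<E> batch = new ArrayList<>(maxBatchSize);
        queue.drainTo(batch, maxBatchSize);
        for (int i = 0; i < batch.size(); i++) {
            handler.onEvent(batch.get(i), i == batch.size() - 1);
        }
        return batch.size();
    }
}
```

A handler that does IO could accumulate its work while `endOfBatch` is false and perform a single write (or commit) when it is true, which is the effect being asked for.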

Thank you!

Oleg

Oleg_Iavorskyi · March 31, 2015, 6:12pm

It doesn’t look like the image was attached properly. Here is the link: http://pasteboard.co/2dLUcCDl.png

Allard · April 2, 2015, 12:11pm

Hi Oleg,

I have no clue as to why appending events slows down. I am not aware of any places where data is kept in Axon, so it might have something to do with Hibernate. It’s an interesting observation though, and I’ll see if I can run some tests myself to reproduce (and ultimately, locate) it.

As to the batching, you can use an Asynchronous Cluster. It uses queues from which events are processed in batches. If you use a sequential concurrency policy, you’ll automatically end up with a single thread executing events in batches. You can also configure the batch size.

Cheers,

Allard