Hello again everyone,
I have been writing some messages lately because we are struggling a lot with the performance of our Axon-based application under heavy load. Here’s another (more general) question I posted last week (for reference): https://groups.google.com/forum/#!topic/axonframework/nDz5_fM6ueU
This time, however, I have a more specific question… While running load tests on our application, we see big lags in the overall response times of certain requests. I managed to find an example of such a lag in the Event Store:
Event1  2020-06-18T13:07:40.107093Z  2294
Event2  2020-06-18T13:07:45.010161Z  2295
There are three values in each row: event name, timestamp, and sequence number. As can be seen above, there is a difference of ~5 seconds between the two persisted events. However, the logic that generates those events is something like:
@CommandHandler
fun handle(command: Command1) {
    // Perform some checks with the aggregate data (super cheap operation)
    AggregateLifecycle.apply(Event1())
    // Make an HTTP call to a different service. This call takes milliseconds, 200 ms at most, but normally far less.
    // I have checked that in this particular case the request took < 200 ms.
    AggregateLifecycle.apply(Event2())
}
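One way to double-check where the time goes inside a handler like the one above is to wrap each step in a small timing helper and log the result. This is only a minimal sketch: `timed` is a hypothetical helper (not part of Axon), and `Thread.sleep` stands in for the real HTTP call.

```kotlin
// Hypothetical helper (not an Axon API): runs a block, prints how long it
// took, and returns the block's result together with the elapsed milliseconds.
fun <T> timed(label: String, block: () -> T): Pair<T, Long> {
    val start = System.nanoTime()
    val result = block()
    val elapsedMs = (System.nanoTime() - start) / 1_000_000
    println("$label took $elapsedMs ms")
    return result to elapsedMs
}

fun main() {
    // Stand-in for the HTTP call made between the two apply() calls;
    // the sleep simulates roughly 50 ms of latency.
    val (response, elapsedMs) = timed("http-call") {
        Thread.sleep(50)
        "ok"
    }
    println("response=$response elapsedMs=$elapsedMs")
}
```

Wrapping the HTTP call and each apply() call separately would show whether the ~5 seconds are spent inside the handler at all, or somewhere outside of it.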
Considering all the above information, I have a few questions. When does Axon persist the events in the database? Does it try to persist all events applied during the processing of a command in a batch once the command processing is completed? Or does it persist each event right when it is applied (AggregateLifecycle.apply())? How does Axon work internally in that respect?
Also, given that both events are applied in the same command handler and there are no expensive operations between the two apply calls… what may be causing that huge difference in the timestamps? Again, this only happens during high-load periods… There are no error logs at all, and the logic ends up executing properly: the aggregate ends up in the correct state.
One of my wild guesses is that the database is not able to cope with that many inserts and is blocked somehow. Or maybe it’s not the database but the DB connection pool configuration in our service? I don’t really know. In any case, any help, ideas, or advice would be really appreciated.
Our Event Store is in a MySQL database, we use snapshotting, we use caching… Please do let me know if there’s any other information that would be useful to understand this issue.
Thanks,
Armando.