Axon performance observations around submitting commands

sanjayc1981 · September 13, 2023, 10:21am

Hi,
I am seeing significant latency around command submission: delays between a command completing and the join on its result returning, and delays before callbacks are invoked.
The following are some use cases:

Submit, say, 5 commands, leading to 5 events on two aggregates. The events generated by the commands complete in about 1 ms. However, the join on command completion takes around 25 ms, which is far too long.
Similarly, in another case: submit a command with one callback. The callback takes another 8 to 10 ms to be called, although the aggregate should have completed in 1 ms. This latency delays the firing of subsequent commands.

I wouldn’t expect the command threads to wait for the event bus threads to complete, including event persistence costs.
We are using Axon 3.1.
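
To narrow down where the 25 ms goes, it can help to time each command from dispatch to completion of its future, separately from the final join. The following is only a minimal sketch in plain Java: `timeCommands` and `fakeSend` are hypothetical names, and `fakeSend` stands in for whatever actually dispatches the command (for example a gateway method that returns a `CompletableFuture`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class CommandLatencyProbe {

    /** Returns, per command, the nanoseconds from dispatch until its future completed. */
    static <C, R> List<Long> timeCommands(List<C> commands,
                                          Function<C, CompletableFuture<R>> send) {
        List<CompletableFuture<Long>> timings = new ArrayList<>();
        for (C command : commands) {
            long start = System.nanoTime();
            // The elapsed time is captured when the future completes,
            // whether it completed normally or with an error.
            CompletableFuture<Long> timing =
                    send.apply(command).handle((result, error) -> System.nanoTime() - start);
            timings.add(timing);
        }
        List<Long> elapsed = new ArrayList<>();
        for (CompletableFuture<Long> timing : timings) {
            elapsed.add(timing.join()); // wait for every command before reporting
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real dispatch call.
        Function<String, CompletableFuture<String>> fakeSend =
                command -> CompletableFuture.supplyAsync(() -> command + "-done");
        List<Long> nanos = timeCommands(Arrays.asList("c1", "c2", "c3", "c4", "c5"), fakeSend);
        for (long n : nanos) {
            System.out.println("completed after " + (n / 1_000_000.0) + " ms");
        }
    }
}
```

If the per-command times are already close to the total, the cost sits inside command handling (including whatever is committed before the result is reported) rather than in the join itself.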

Gerard · September 13, 2023, 10:45am

Hi Sanjay

You do want to wait, in most cases, as it’s the only way to know the commands are truly successful. There might be a concurrent request for the same aggregate, meaning one will fail.
To get faster, the first step would be to upgrade to the latest version of AF, 4.8. Another thing that might make command handling a lot faster is using Axon Server for the routing.

sanjayc1981 · September 14, 2023, 9:38am

Hi Gerard
Thanks so much for responding so promptly!
I suspect we also have some inefficiencies in our custom event and snapshot persistence solution.
I would like to get clarity on one more point though, regarding our custom implementation of AbstractEventStorageEngine.

If, for a given aggregate id, we have already persisted events with sequence numbers 1-10 and a snapshot has then been taken, is there any value in keeping events 1-10 long term? Implementation-wise it may seem mandatory, but I am wondering whether Axon ever needs events older than the snapshot.

From a storage point of view, I was wondering whether it is OK to delete old persisted events once we have a snapshot and further events for that aggregate.
E.g. events 1-10, snapshot, events 11-15: is it OK to purge events 1-10? I would assume Axon recreates the aggregate object by loading the snapshot and then applying events 11-15.

Perhaps the only risk would be if the aggregate itself got corrupted and Axon had some fallback to recover it from events 1-15.
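
The assumption in the question can be illustrated with a toy model. This is only a sketch in plain Java, not Axon's loading code: the aggregate state is a single number, each event is a delta, and all names are hypothetical. It shows that loading from a snapshot plus events 11-15 gives the same state as replaying events 1-15.

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotReplaySketch {

    /** Applies each event (here just a signed delta) to the running state. */
    static long applyEvents(long state, List<Long> deltas) {
        for (long delta : deltas) {
            state += delta;
        }
        return state;
    }

    /** Full replay: start from the empty state and apply every event. */
    static long loadFromAllEvents(List<Long> allEvents) {
        return applyEvents(0L, allEvents);
    }

    /** Snapshot load: start from the snapshot state and apply only later events. */
    static long loadFromSnapshot(long snapshotState, List<Long> eventsAfterSnapshot) {
        return applyEvents(snapshotState, eventsAfterSnapshot);
    }

    public static void main(String[] args) {
        List<Long> events = new ArrayList<>();
        for (long seq = 1; seq <= 15; seq++) {
            events.add(seq); // the event with sequence number n carries delta n
        }
        long snapshot = loadFromAllEvents(events.subList(0, 10)); // state after events 1-10
        long viaSnapshot = loadFromSnapshot(snapshot, events.subList(10, 15));
        long viaFullReplay = loadFromAllEvents(events);
        System.out.println(viaSnapshot == viaFullReplay); // prints true
    }
}
```

The equivalence only holds while the snapshot stays readable and compatible with the current aggregate code, and it says nothing about other consumers of events 1-10.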

Regards
Sanjay Gopalakrishnan

Gerard · September 14, 2023, 1:54pm

Hi Sanjay,

There are a couple of reasons to keep events ‘forever’. First and foremost, any projection should be able to be rebuilt from scratch. This means the event processors should be able to reset and start from the beginning, which keeps your application easily adaptable. You might also need different projections for different use cases at some point, and those will require all events to be consistent. The only way ‘around’ this would be if you have short-lived aggregates whose events lose their value after some time.
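
A toy illustration of the first point, in plain Java with hypothetical names (it does not use Axon's event processor API): a projection added later is rebuilt by replaying the log from the start, so a purged log produces a different answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ProjectionRebuildSketch {

    /** Hypothetical projection: how many events each aggregate has produced. */
    static Map<String, Integer> rebuild(List<String> aggregateIdPerEvent) {
        Map<String, Integer> counts = new HashMap<>();
        for (String aggregateId : aggregateIdPerEvent) {
            counts.merge(aggregateId, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> fullLog = new ArrayList<>();
        for (int seq = 1; seq <= 15; seq++) {
            fullLog.add("order-1"); // one entry per event, keyed by aggregate id
        }
        List<String> purgedLog = fullLog.subList(10, 15); // events 1-10 deleted

        System.out.println(rebuild(fullLog));   // {order-1=15}
        System.out.println(rebuild(purgedLog)); // {order-1=5}
    }
}
```

The snapshot restores the aggregate on the command side, but it cannot supply the ten missing events to a projection that is rebuilt afterwards.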

The second reason is that your command model might change at some point, in which case you might not be able to use the snapshot anymore, as it might no longer be compatible. Again, if there are only short-lived aggregates, you might at some point clean them up, once you are sure there will be no more commands for such an aggregate.

Thirdly, you might want to leverage the events to create an AI model at some point. In this case, the more data you have, the better, even if it’s a few years old. Since storing data is relatively cheap compared to the value it might bring, there should be hardly any reason for wanting to remove events.

I hope this was helpful.