Multiple commands, single transaction (all-or-nothing approach)

Hi everyone. I'm very excited to try Axon as our main billing platform, replacing the current one.
I do have some concerns, though, due to project specifics.

At a high level, there are lots of users that are charged at millisecond intervals. Publishing an Axon command for each such event is, I believe, overkill. But batching them for
too long is not acceptable either (the current balance would lag). So I've tried to find a middle ground here by using Kafka
and storing all those micro-charges there. Every minute I retrieve everything from it, batch it, and only then publish to Axon.
The root aggregate in my case is a user, so once everything is consumed I have lots of charge commands, one for each user.
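Roughly, the batching step I have in mind looks like this (ChargeUserCommand, the consumer setup and the BigDecimal payload are just placeholders, and auto-commit is disabled on the consumer):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.axonframework.commandhandling.gateway.CommandGateway;

import java.math.BigDecimal;
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Sketch: drain the micro-charges buffered in Kafka, sum them per user, and send a
// single charge command per user. Offsets are committed only after every command has
// been accepted, so a failure leaves them uncommitted and the batch is re-read later.
public class ChargeBatcher {

    private final KafkaConsumer<String, BigDecimal> consumer; // key = userId, value = micro-charge
    private final CommandGateway commandGateway;

    public ChargeBatcher(KafkaConsumer<String, BigDecimal> consumer, CommandGateway commandGateway) {
        this.consumer = consumer;
        this.commandGateway = commandGateway;
    }

    public void drainAndDispatch() {
        ConsumerRecords<String, BigDecimal> records = consumer.poll(Duration.ofSeconds(5));

        // Sum all micro-charges per user for this batch.
        Map<String, BigDecimal> totalPerUser = new HashMap<>();
        for (ConsumerRecord<String, BigDecimal> rec : records) {
            totalPerUser.merge(rec.key(), rec.value(), BigDecimal::add);
        }

        // One command per user; sendAndWait propagates any failure as an exception.
        totalPerUser.forEach((userId, amount) ->
                commandGateway.sendAndWait(new ChargeUserCommand(userId, amount)));

        // Only reached when every command was dispatched successfully.
        consumer.commitSync();
    }
}
```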
What concerns me is that not all of the dispatched commands might be handled successfully by the projection handlers.
I also believe there might be external problems (e.g. network issues) saving those events to Axon. In such cases I don't want
to commit the Kafka offsets; I want to roll everything back.
I found a thread with a similar problem: https://groups.google.com/forum/#!topic/axonframework/M8SlGXsUsjM

The way this was addressed there was to use a BatchCommand and the SpringTransactionManager.
I'm planning to try this out and see how it goes, but what bothers me is that this solution looks hacky :)

Probably I'm missing something and there is a more elegant way to achieve what I'm after?
I'd be glad to hear any advice / thoughts. Thanks!

Hi Стас,

Let's discuss your requirements before we jump to any conclusions here.

From your explanation I can deduce that there is a User aggregate.
How many commands are you expecting for a specific User per second? We want to make sure that you can scale your command side while sending "an Axon command for each such event". Maybe it is not overkill at all; in my opinion it may well be the best design.

The query side can be deployed and scaled independently. With tracking event processors you should be able to scale your projections very well. Making your event handlers idempotent will make your life much easier (see the sketch below).
This is a nice blog post that covers some performance-tuning practices you can apply in your project (query side): https://axoniq.io/blog-overview/cqrs-replay-performance-tuning
You can also read about error handlers: https://docs.axoniq.io/reference-guide/configuring-infrastructure-components/event-processing/event-processors#error-handling in order to gain more control over errors on your query side.
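As a rough illustration of an idempotent handler (the event, entity and repository types are placeholders, not part of your model): tracking the sequence number of the last applied event per user means that redelivered or replayed events leave the read model unchanged.

```java
import org.axonframework.config.ProcessingGroup;
import org.axonframework.eventhandling.EventHandler;
import org.axonframework.eventhandling.SequenceNumber;

// Sketch of an idempotent projection: the aggregate sequence number of the last applied
// event is stored next to the balance, so applying the same event twice has no effect.
@ProcessingGroup("user-balance")
public class UserBalanceProjection {

    private final UserBalanceRepository repository; // placeholder read-model repository

    public UserBalanceProjection(UserBalanceRepository repository) {
        this.repository = repository;
    }

    @EventHandler
    public void on(UserChargedEvent event, @SequenceNumber long sequenceNumber) {
        UserBalanceEntry entry = repository.findById(event.getUserId())
                .orElseGet(() -> new UserBalanceEntry(event.getUserId()));
        if (sequenceNumber <= entry.getLastAppliedSequence()) {
            return; // already applied: processing the event again must not change anything
        }
        entry.addToBalance(event.getAmount());
        entry.setLastAppliedSequence(sequenceNumber);
        repository.save(entry);
    }
}
```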

Hope to hear from you.

Best,
Ivan

Hi Ivan.

"How many commands are you expecting for a specific User per second?"
Let's say on average 100 commands per user per second, and there will be lots of users, let's say 100 again, so roughly 10,000 commands per second in total.
This number changes and depends on many factors: a user may or may not be active, or we could decide to scale out so that more nodes are sending
these commands. Ideally it would be great to isolate those factors from Axon, so that it receives a constant amount of load.

That's why I've decided to use a buffer (Kafka). I've already implemented some tests using a batch command and a batch-command-handler aggregate,
but it looks too clunky to me :slight_smile:

Thanks a lot for the replay performance tuning article.
Lots of useful tips, especially the batch optimization paragraph. Are there any examples of how to explicitly specify a unit of work for related events? It's not clear to me how to achieve the batching behavior using the unit of work. (My first assumption is to aggregate the changes from the events in the UnitOfWork resources map and then add a UnitOfWork.onCommit() handler that makes a single update from the accumulated results, roughly like the sketch below.)
Also, is it possible to use one unit of work for multiple commands without a batch command? I couldn't find a way to pass a unit of work to the command bus or command gateway.
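Roughly, this is what I have in mind (only the UnitOfWork calls are real Axon API; the event, the repository and its addToBalance method are made up). I register the flush at prepare-commit so it runs inside the processor's transaction:

```java
import org.axonframework.eventhandling.EventHandler;
import org.axonframework.messaging.unitofwork.CurrentUnitOfWork;
import org.axonframework.messaging.unitofwork.UnitOfWork;

import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Sketch: accumulate the per-user deltas of a whole batch in a unit-of-work resource
// and flush them with one update per user when the unit of work commits. With a tracking
// processor, a single (batching) unit of work spans the whole batch of events.
public class BatchingBalanceProjection {

    private static final String PENDING_CHARGES = "pendingCharges";

    private final UserBalanceRepository repository; // placeholder read-model repository

    public BatchingBalanceProjection(UserBalanceRepository repository) {
        this.repository = repository;
    }

    @EventHandler
    public void on(UserChargedEvent event) {
        UnitOfWork<?> unitOfWork = CurrentUnitOfWork.get();
        Map<String, BigDecimal> pending = unitOfWork.getOrComputeResource(PENDING_CHARGES, key -> {
            Map<String, BigDecimal> charges = new HashMap<>();
            // Register the flush only once, when the resource is first created.
            unitOfWork.onPrepareCommit(uow -> flush(charges));
            return charges;
        });
        pending.merge(event.getUserId(), event.getAmount(), BigDecimal::add);
    }

    private void flush(Map<String, BigDecimal> charges) {
        // One update per user for the whole batch instead of one per event.
        charges.forEach(repository::addToBalance);
    }
}
```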

No problem at all. The author's open-source GitHub repo is available here: https://github.com/fransvanbuul/cqrs-projection-performance. Please note that this optimization relates to the query side/projections.

OK, I understand your concern. You need some kind of back-pressure mechanism or queue to control the flow of the commands you plan to send.

Do you use Axon Server as a command bus? Axon Server will only send messages/commands when the application allows it. You should be able to configure this (https://docs.axoniq.io/reference-guide/operations-guide/setting-up-axon-server/tuning#flow-control) and take control of the flow, for example with properties like the ones below.
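As an illustration only (the property names come from that flow-control page; the values are arbitrary and would need tuning for your load):

```properties
# Permits initially granted to Axon Server when the connection is opened
axon.axonserver.initial-nr-of-permits=1000
# Remaining-permit level at which the client grants new permits
axon.axonserver.new-permits-threshold=500
# Number of additional permits granted once that threshold is reached
axon.axonserver.nr-of-new-permits=500
```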

Additionally, you can consider the Disruptor as a local command bus: https://docs.axoniq.io/reference-guide/configuring-infrastructure-components/command-processing/command-dispatching#disruptorcommandbus. The DisruptorCommandBus takes a different approach to multi-threaded processing and can increase the performance of your application. There are some limitations, so please read the documentation carefully.
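A minimal configuration sketch, assuming Axon 4 with the axon-disruptor module on the classpath and your User aggregate (the exact wiring of the aggregate's command handlers is described in the reference guide):

```java
import org.axonframework.disruptor.commandhandling.DisruptorCommandBus;
import org.axonframework.eventsourcing.GenericAggregateFactory;
import org.axonframework.eventsourcing.eventstore.EventStore;
import org.axonframework.modelling.command.Repository;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Sketch: replace the default local command bus with a DisruptorCommandBus. It pre-loads
// aggregates and publishes events on separate threads, which can raise command throughput.
@Configuration
public class DisruptorCommandBusConfig {

    @Bean
    public DisruptorCommandBus commandBus() {
        return DisruptorCommandBus.builder().build();
    }

    @Bean
    public Repository<User> userRepository(DisruptorCommandBus commandBus, EventStore eventStore) {
        // The DisruptorCommandBus creates the aggregate repository itself; the aggregate's
        // command handlers must use this repository rather than a standard EventSourcingRepository.
        return commandBus.createRepository(eventStore, new GenericAggregateFactory<>(User.class));
    }
}
```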

Best,
Ivan

I've thought about it a little and decided to stick with your first reply: scale the query side separately.
If something bad happens on the projection side, I'll always be able to rebuild the projections; the replay tunings really help a lot.
If something bad happens with Axon itself, well, there is not much to be done in that case until the problem is resolved.

Configuring flow control on the Axon Server bus or using the Disruptor bus looks promising, but I want to keep the first iteration simple and see how it goes.
Thanks a lot for the help, Ivan!

No problem. I simply wanted to point out which features are available to you on Axon Server, so you can simplify your design in the future (if needed).

Let us know if you run into some issues.

Best,
Ivan