Load balancing Kafka consumers with separate micro service instances without Axon server, fails without custom segment identifier

Hi Dharani,

First, I want to point out that the Axon Kafka Extension has not received a final release yet.
I would thus be hesitant to use it for any production environments just yet.

Then, for the issue at hand.
Axon ensures that a given Tracking Event Processor will handle each and every event once.
It does so by keeping track of the events it has handled through the Tracking Tokens.
Additionally, to be able to perform any work, a given Tracking Event Processor thread is required to have a claim on the Tracking Token.

Thus, simply scaling out your applications to have a given Tracking Event Processor on each node does not automatically scale the work.
It simply increases the number of threads, not the number of segments for a given Tracking Token.
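
To make that concrete, here is a toy model in plain Java. It is not Axon code: `ClaimTableSketch`, `tryClaim` and `countActiveWorkers` are hypothetical names, and the real token store does far more (persistence, claim timeouts, and so on). The only thing it illustrates is that the number of workers doing anything is capped by the number of segments available to claim.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, much-simplified stand-in for a token store:
// each segment has at most one owner at a time.
class ClaimTableSketch {
    private final Map<Integer, String> ownerBySegment = new TreeMap<>();

    ClaimTableSketch(int segmentCount) {
        for (int segment = 0; segment < segmentCount; segment++) {
            ownerBySegment.put(segment, null); // null means "unclaimed"
        }
    }

    // Claims the first unclaimed segment for the worker.
    // Returns the segment id, or -1 if no segment is left to claim.
    synchronized int tryClaim(String worker) {
        for (Map.Entry<Integer, String> entry : ownerBySegment.entrySet()) {
            if (entry.getValue() == null) {
                entry.setValue(worker);
                return entry.getKey();
            }
        }
        return -1;
    }

    // Counts how many workers end up holding a claim and can therefore do work.
    static int countActiveWorkers(int segmentCount, int workerCount) {
        ClaimTableSketch table = new ClaimTableSketch(segmentCount);
        int active = 0;
        for (int i = 1; i <= workerCount; i++) {
            if (table.tryClaim("node-" + i) >= 0) {
                active++;
            }
        }
        return active;
    }

    public static void main(String[] args) {
        System.out.println(countActiveWorkers(1, 3)); // one segment, three nodes: prints 1
        System.out.println(countActiveWorkers(4, 3)); // four segments, three nodes: prints 3
    }
}
```

With a single segment, the second and third node find nothing to claim and sit idle, however many threads they start.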

To be able to parallelize the work within a given Tracking Event Processor, you thus need to split the Tracking Token into several segments.
Upon start-up of an application, you can configure a given Tracking Event Processor to split its token upon creation (that is, only if no token is present yet).

You can do this by using the TrackingEventProcessorConfiguration.forParallelProcessing(int).

The provided int sets the number of threads as well as the number of segments of its Tracking Token.
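
As a rough illustration of what having several segments buys you, the sketch below spreads events over segments by hashing a routing key. This is a simplified model and not the framework's implementation: `SegmentRoutingSketch` and its methods are made-up names, the routing key (think of an aggregate identifier) is an assumption, and the segment count is restricted to a power of two so that a bit mask is enough.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: every routing key belongs to exactly one segment,
// so each segment can be processed by a different thread or node.
class SegmentRoutingSketch {

    // Assumption for this sketch: segmentCount is a power of two,
    // which lets a simple bit mask pick the segment.
    static int segmentFor(String routingKey, int segmentCount) {
        if (segmentCount < 1 || Integer.bitCount(segmentCount) != 1) {
            throw new IllegalArgumentException("segmentCount must be a power of two");
        }
        return routingKey.hashCode() & (segmentCount - 1);
    }

    // Groups the routing keys per segment; index i holds the keys of segment i.
    static List<List<String>> distribute(List<String> routingKeys, int segmentCount) {
        List<List<String>> perSegment = new ArrayList<>();
        for (int i = 0; i < segmentCount; i++) {
            perSegment.add(new ArrayList<>());
        }
        for (String key : routingKeys) {
            perSegment.get(segmentFor(key, segmentCount)).add(key);
        }
        return perSegment;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("order-1", "order-2", "order-3", "order-4", "order-5");
        List<List<String>> perSegment = distribute(keys, 4);
        for (int i = 0; i < perSegment.size(); i++) {
            System.out.println("segment " + i + " -> " + perSegment.get(i));
        }
    }
}
```

The point to take away is that a given key always lands in the same segment, so ordering per key is preserved while different segments can be worked on in parallel.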

For a live system or an application for which the Tracking Tokens are already stored, you will have to use the API provided on the TrackingEventProcessor to split and merge tokens.

Granted, the API is provided for you, but the TrackingEventProcessor does not provide any delegation between your nodes to correctly split and merge.
Additionally, note that the split/merge operations are an Axon Framework 4.1 feature.
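
For intuition on what splitting and merging mean, here is a small stand-alone model of a segment as an id plus a bit mask. Treat it as a sketch under assumptions: `SegmentSketch` is a hypothetical class, not the framework's own, and it leaves out exactly the hard part mentioned above, namely coordinating between nodes which one holds the claim while a split or merge happens.

```java
// Hypothetical model of a segment: it owns every hash whose masked bits equal its id.
class SegmentSketch {
    final int id;
    final int mask;

    SegmentSketch(int id, int mask) {
        this.id = id;
        this.mask = mask;
    }

    // A mask of 0 matches every hash, which models a single, unsplit token.
    boolean matches(int hash) {
        return (hash & mask) == id;
    }

    // Splits this segment into two halves by looking at one more bit of the hash.
    // No overflow guard: this sketch assumes the mask still has room to grow.
    SegmentSketch[] split() {
        int newMask = (mask << 1) | 1;
        int highBit = newMask ^ mask;
        return new SegmentSketch[] {
            new SegmentSketch(id, newMask),
            new SegmentSketch(id | highBit, newMask)
        };
    }

    // Merges two sibling halves back into the segment they were split from.
    SegmentSketch mergeWith(SegmentSketch other) {
        int highBit = mask ^ (mask >>> 1);
        if (mask == 0 || mask != other.mask || (id ^ other.id) != highBit) {
            throw new IllegalArgumentException("segments are not sibling halves");
        }
        return new SegmentSketch(id & (mask >>> 1), mask >>> 1);
    }

    public static void main(String[] args) {
        SegmentSketch root = new SegmentSketch(0, 0);
        SegmentSketch[] halves = root.split();
        System.out.println(halves[0].id + "/" + halves[0].mask); // prints 0/1
        System.out.println(halves[1].id + "/" + halves[1].mask); // prints 1/1
        SegmentSketch merged = halves[0].mergeWith(halves[1]);
        System.out.println(merged.id + "/" + merged.mask);       // prints 0/0
    }
}
```

After a split, every hash the original segment matched is matched by exactly one of the two halves, which is what allows a second thread or node to pick up part of the work.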

If you require this functionality out of the box, I do recommend using Axon Server.
Axon Server’s UI provides operations to split and merge the tokens of a given Tracking Event Processor, and delegates the right operations between any number of application nodes running the given TEP.

Concluding, I wouldn’t take the custom route you have described, as you are reintroducing the need to correctly build up the delegation of messages between threads, which Axon Framework gives you out of the box.
Configuring for several segments can additionally be done up front or on a live system through the split/merge API.
The latter is greatly simplified through Axon Server, thus omitting the need to build this load balancing work yourself.
As such, if you are looking for simple load balancing on the event handling side of things, I again recommend taking a look at Axon Server.

Hope this sheds some light on the situation.

Cheers,
Steven