Scaling Event Processors

Good day all, I have a few questions. We are using Kafka as the underlying message source for a microservice that uses Axon’s Saga support, and we are seeing less-than-desirable throughput in our performance tests when using a SubscribingEventProcessor (SEP).

  1. When using subscribing mode, is there any way to scale event processing aside from deploying multiple instances, increasing the number of topic partitions, and setting SubscribableKafkaMessageSource.Builder.consumerCount to match the number of partitions?

  2. As I understand it, using tracking mode while configuring an initial-segment-count/thread-count is a way to scale within an instance itself. However, I came across this post and wondered whether Axon Server is a prerequisite for that scaling to work. With a JPA backend, I see the token_entry table with multiple entries, one for each segment. I just want to verify this functionality is available without Axon Server in v4.1+.

  3. When using a TEP, can you scale out your instances horizontally without worry? I’m pretty sure this is a yes, but I want to verify.

  4. Is it advisable to use only SEPs or only TEPs for both producing and consuming, or can you comfortably have both within the same JVM?

  5. I’m wondering if any of the following settings can help improve performance as well.

  • Using a TRANSACTIONAL confirmation-mode and setting DefaultProducerFactory.Builder.producerCacheSize(). If the confirmation-mode is NONE, will setting the cache size have any effect?
  • Using a FullConcurrencyPolicy where multiple threads may be spun up.
  6. Lastly, we’re developing a custom CassandraTokenStore but are facing issues with the TEP, and we’re noticing duplicate entries in the token_entry table. Surely the implementation needs review, but from an Axon perspective, is there an inherent reason the TEP API wouldn’t work well with a Cassandra-based token store (e.g. locking, ACID vs BASE)?

Thanks to all participating in advance!

Hi Ryan, let me go over your questions one by one.

  1. You have summed up your options nicely when it comes to SubscribingEventProcessors. From the framework’s perspective there is actually little use for the SubscribingEventProcessor; it’s purely there for connecting with other messaging solutions, like AMQP or Kafka. There’s a short configuration sketch for this below the list.

  2. Axon Server is not a prerequisite for using a StreamingEventProcessor implementation like the TrackingEventProcessor. The actual requirement is that it uses a StreamableMessageSource implementation; Axon Server can be such a source, but Kafka and an RDBMS can be as well. For simplified scaling you will, however, need the split and merge functionality, which is implemented as of Axon 4.1. This makes split and merge problematic with the Kafka extension, which is still based on Axon 4.0. A segment/thread configuration sketch follows below the list.

  3. Short answer, yes.

  4. A mix of both SEP and TEP for producing and consuming events should be fine. The choice depends on the non-functional requirements of the component in question.

  5. The FullConcurrencyPolicy removes all ordering between event handling: it will increase throughput, but it can introduce issues when updating your models, since events for the same aggregate are no longer guaranteed to be handled in sequence. The cache size will indeed have no impact if you are not using the TRANSACTIONAL confirmation-mode; in that case the DefaultProducerFactory uses a single ShareableProducer (custom to the framework) instance. Both settings are shown in a sketch below the list.

  6. I wouldn’t expect any specific issues with using Cassandra as the token store, honestly; a skeleton of the claim mechanics follows at the end of this list. What I would be interested in, granted you/your team are up for it, is a PR contributing this Cassandra Token Store back to the framework. I’m pretty sure more people would benefit from it.
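
To make the SubscribingEventProcessor answer concrete, here is roughly what that set-up looks like. Treat it as a sketch against the 4.x Kafka extension: the topic, group id, partition count of 4 and processing group name are made up, the consumer factory and fetcher are assumed to be configured elsewhere, and the exact builder requirements may differ slightly per extension version.

```java
import java.util.List;

import org.axonframework.config.EventProcessingConfigurer;
import org.axonframework.eventhandling.EventMessage;
import org.axonframework.extensions.kafka.eventhandling.consumer.ConsumerFactory;
import org.axonframework.extensions.kafka.eventhandling.consumer.Fetcher;
import org.axonframework.extensions.kafka.eventhandling.consumer.subscribable.SubscribableKafkaMessageSource;

public class SubscribingSetup {

    // One SubscribableKafkaMessageSource whose consumerCount matches the topic's partition count.
    public SubscribableKafkaMessageSource<String, byte[]> kafkaMessageSource(
            ConsumerFactory<String, byte[]> consumerFactory,
            Fetcher<String, byte[], EventMessage<?>> fetcher) {
        return SubscribableKafkaMessageSource.<String, byte[]>builder()
                .topics(List.of("my-topic"))      // topic with, say, 4 partitions
                .groupId("my-service")            // consumer group shared by the service's instances
                .consumerFactory(consumerFactory)
                .fetcher(fetcher)
                .consumerCount(4)                 // one consumer thread per partition
                .build();
    }

    public void configure(EventProcessingConfigurer configurer,
                          SubscribableKafkaMessageSource<String, byte[]> source) {
        // Attach the source to a SubscribingEventProcessor for the given processing group.
        // Depending on the extension version, the source should also be registered with the
        // extension's KafkaMessageSourceConfigurer so it is started/stopped with the application.
        configurer.registerSubscribingEventProcessor("my-processing-group", c -> source);
    }
}
```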
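
For question 2, splitting a TrackingEventProcessor over segments and threads without Axon Server comes down to something like this minimal sketch. The processing group name and the counts are made up; with Spring Boot, the initial-segment-count/thread-count properties you mention map onto the same configuration.

```java
import org.axonframework.config.EventProcessingConfigurer;
import org.axonframework.eventhandling.TrackingEventProcessorConfiguration;

public class TrackingSetup {

    public void configure(EventProcessingConfigurer configurer) {
        // Four segments are created on first start (four rows in token_entry),
        // and this instance runs four threads, each able to claim one segment.
        configurer.registerTrackingEventProcessorConfiguration(
                "my-processing-group",
                c -> TrackingEventProcessorConfiguration.forParallelProcessing(4)
                                                        .andInitialSegmentsCount(4));
    }
}
```

From there, scaling out further is a matter of splitting segments (the split/merge API from 4.1) and running enough instances and threads to claim them; none of that needs Axon Server.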
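
For question 5, both knobs live roughly here. Again a sketch rather than a drop-in configuration: the processing group name, transactional id prefix, cache size and the Kafka properties map are all assumptions.

```java
import java.util.Map;

import org.axonframework.config.EventProcessingConfigurer;
import org.axonframework.eventhandling.async.FullConcurrencyPolicy;
import org.axonframework.extensions.kafka.eventhandling.producer.ConfirmationMode;
import org.axonframework.extensions.kafka.eventhandling.producer.DefaultProducerFactory;
import org.axonframework.extensions.kafka.eventhandling.producer.ProducerFactory;

public class ThroughputTuning {

    public void configure(EventProcessingConfigurer configurer) {
        // Drop the default per-aggregate sequencing for this group: handlers may run fully concurrently.
        configurer.registerSequencingPolicy("my-processing-group", c -> new FullConcurrencyPolicy());
    }

    public ProducerFactory<String, byte[]> producerFactory(Map<String, Object> kafkaProperties) {
        return DefaultProducerFactory.<String, byte[]>builder()
                .configuration(kafkaProperties)
                .confirmationMode(ConfirmationMode.TRANSACTIONAL)
                .transactionalIdPrefix("my-service-") // required when using TRANSACTIONAL mode
                .producerCacheSize(16)                // pool of transactional producers; ignored for NONE
                .build();
    }
}
```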
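
Finally, for question 6: the one guarantee a TokenStore really has to provide is that claiming a segment's token is an atomic compare-and-set, which is what a relational store gets from row locks and unique keys. On Cassandra, lightweight transactions (the IF ... clauses) can play that role. Below is a hypothetical, heavily simplified skeleton illustrating only those claim mechanics; the table layout, column names and node id are assumptions, and claim timeouts and segment initialization are omitted entirely. Duplicate rows in token_entry typically point at writes that are not conditional on ownership.

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.Row;
import org.axonframework.eventhandling.TrackingToken;
import org.axonframework.eventhandling.tokenstore.TokenStore;
import org.axonframework.eventhandling.tokenstore.UnableToClaimTokenException;
import org.axonframework.serialization.SerializedObject;
import org.axonframework.serialization.Serializer;
import org.axonframework.serialization.SimpleSerializedObject;

import java.nio.ByteBuffer;
import java.time.Instant;

// Assumed table:
// CREATE TABLE token_entry (processor_name text, segment int, token blob, token_type text,
//                           owner text, timestamp text, PRIMARY KEY (processor_name, segment));
public class CassandraTokenStore implements TokenStore {

    private final CqlSession session;
    private final Serializer serializer;
    private final String nodeId; // unique per instance, e.g. hostname + process id

    public CassandraTokenStore(CqlSession session, Serializer serializer, String nodeId) {
        this.session = session;
        this.serializer = serializer;
        this.nodeId = nodeId;
    }

    @Override
    public TrackingToken fetchToken(String processorName, int segment) {
        if (!tryClaim(processorName, segment)) {
            throw new UnableToClaimTokenException(
                    "Segment " + segment + " of [" + processorName + "] is claimed by another node");
        }
        Row row = session.execute(
                "SELECT token, token_type FROM token_entry WHERE processor_name = ? AND segment = ?",
                processorName, segment).one();
        if (row == null || row.getByteBuffer("token") == null) {
            return null; // claimed, but no progress stored yet
        }
        ByteBuffer buffer = row.getByteBuffer("token");
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return serializer.deserialize(
                new SimpleSerializedObject<>(bytes, byte[].class, row.getString("token_type"), null));
    }

    @Override
    public void storeToken(TrackingToken token, String processorName, int segment) {
        // Only the current owner may update the token. An unconditional write here is exactly
        // the kind of thing that produces competing/duplicate entries under concurrency.
        SerializedObject<byte[]> serialized = serializer.serialize(token, byte[].class);
        boolean applied = session.execute(
                "UPDATE token_entry SET token = ?, token_type = ?, timestamp = ? "
                        + "WHERE processor_name = ? AND segment = ? IF owner = ?",
                ByteBuffer.wrap(serialized.getData()), serialized.getType().getName(),
                Instant.now().toString(), processorName, segment, nodeId).wasApplied();
        if (!applied) {
            throw new UnableToClaimTokenException(
                    "Lost the claim on segment " + segment + " of [" + processorName + "]");
        }
    }

    @Override
    public void releaseClaim(String processorName, int segment) {
        // Conditional release: only the owner may clear the owner column.
        session.execute(
                "UPDATE token_entry SET owner = null WHERE processor_name = ? AND segment = ? IF owner = ?",
                processorName, segment, nodeId);
    }

    @Override
    public int[] fetchSegments(String processorName) {
        return session.execute("SELECT segment FROM token_entry WHERE processor_name = ?", processorName)
                      .all().stream().mapToInt(r -> r.getInt("segment")).toArray();
    }

    // Lightweight transactions provide the compare-and-set a token claim needs:
    // re-claim a segment we already own, otherwise claim it only if it is unowned.
    private boolean tryClaim(String processorName, int segment) {
        String reclaim = "UPDATE token_entry SET owner = ?, timestamp = ? "
                + "WHERE processor_name = ? AND segment = ? IF owner = ?";
        if (session.execute(reclaim, nodeId, Instant.now().toString(), processorName, segment, nodeId)
                   .wasApplied()) {
            return true;
        }
        return session.execute(
                "UPDATE token_entry SET owner = ?, timestamp = ? "
                        + "WHERE processor_name = ? AND segment = ? IF owner = null",
                nodeId, Instant.now().toString(), processorName, segment).wasApplied();
    }
}
```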

As a side note, I want to point out that we have recently introduced a new type of Event Processor, called the PooledStreamingEventProcessor, or PSEP for short.
The PSEP has some advantages over the TEP, which are:

  • A single stream is opened per processor instead of a stream per thread.
  • It defaults to 16 segments instead of 1.
  • It does not have the requirement of “thread count >= segment count”, since it can claim any number of tokens.
  • It uses thread pools that PSEP instances can share to improve thread use.
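
If you want to give it a try, switching over is a small configuration change. A sketch with made-up names and counts, assuming Axon 4.5+ for the registration method and builder options shown:

```java
import org.axonframework.config.Configuration;
import org.axonframework.config.EventProcessingConfigurer;

public class PooledSetup {

    public void configure(EventProcessingConfigurer configurer) {
        // Option 1: let every processing group default to a PooledStreamingEventProcessor.
        configurer.usingPooledStreamingEventProcessors();

        // Option 2: register one explicitly and tune it through its builder.
        configurer.registerPooledStreamingEventProcessor(
                "my-processing-group",
                Configuration::eventStore,                            // any StreamableMessageSource fits here
                (config, builder) -> builder.initialSegmentCount(32)  // defaults to 16
                                            .maxClaimedSegments(8));  // cap the tokens this node claims
    }
}
```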

Lastly, as you are looking into load balancing, I want to point out that Axon Server includes an automatic load balancer for Event Processors. You could definitely build that yourself, but it would obviously save developer time if you simply used Axon Server for this.