Axon or Kafka to support CQRS/ES

KDW · November 12, 2020, 12:16pm

Consider the simple use case in which I want to store product ratings as events in an event store.

I could use two different approaches:

Using Axon: A Rating aggregate is responsible for handling the CreateRatingCommand and publishing the RatingCreatedEvent. Publishing the event would cause the rating to be stored in the event store. Other event handlers can replay the event stream when connecting to the Axon server instance and do whatever is needed with the ratings. In this case, the event handler is used as a stream processor.
Using Kafka: A KafkaProducer will be used to store a Rating POJO (after proper serialization) in a Kafka topic. Setting the topic’s retention time to indefinite would cause no events to get lost in time. Kafka Streams would in this case be used to do the actual rating processing logic.
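
What both options share is an append-only log that any consumer can replay from the beginning. Here is a minimal plain-Java sketch of that idea. It uses neither the Axon nor the Kafka API, and `RatingLog` and the event fields are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical shape of the RatingCreatedEvent from the question.
final class RatingCreatedEvent {
    final String userId;
    final String productId;
    final int stars;

    RatingCreatedEvent(String userId, String productId, int stars) {
        this.userId = userId;
        this.productId = productId;
        this.stars = stars;
    }
}

// In-memory stand-in for an event store or an indefinitely retained topic.
final class RatingLog {
    private final List<RatingCreatedEvent> events = new ArrayList<>();

    // Appending is the only write operation; stored events are never modified.
    void append(RatingCreatedEvent event) {
        events.add(event);
    }

    // Feeds every stored event, oldest first, to the given handler.
    void replay(Consumer<RatingCreatedEvent> handler) {
        for (RatingCreatedEvent event : events) {
            handler.accept(event);
        }
    }
}

final class RatingLogDemo {
    public static void main(String[] args) {
        RatingLog log = new RatingLog();
        log.append(new RatingCreatedEvent("u1", "p1", 5));
        log.append(new RatingCreatedEvent("u2", "p1", 3));

        // A handler that is added later still sees the full history.
        int[] total = {0};
        log.replay(e -> total[0] += e.stars);
        System.out.println(total[0]); // prints 8
    }
}
```

Each handler replays independently, so adding a new processor later does not require the producer to resend anything.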

Some architectural questions appear to me for both approaches:

When using Axon:

Is there any added value to use Axon (or similar solutions) if there is no real state to be maintained or altered within the aggregate? The aggregate just serves as a “dumb” placeholder for the data, but does not provide any state changing logic.
How does Axon handle multiple event handlers of the same event type? Will they all handle the same event (same aggregate id) in parallel, or is the same event only handled once by one of the handlers?
Are events stored in the Axon event store kept until the end of time?

When using Kafka:

Kafka stores events/messages with the same key in the same partition. How does one select the best value for a key in the use case of user-product ratings? UserId, ProductId, or a separate topic for each, with every event published to both topics?
Would it be wise to use a separate topic for each user and each product resulting in a massive amount of topics on the cluster? (Approximately <5k products and >10k users).
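
To make the key question concrete, here is a small sketch of how a key maps to a partition. It is a simplified stand-in: Kafka's default partitioner hashes the serialized key bytes with murmur2, whereas this sketch uses `String.hashCode()` to stay self-contained. The class name and the partition count of 6 are assumptions.

```java
final class KeyPartitioning {
    // Simplified stand-in for Kafka's default partitioner (which uses murmur2
    // on the serialized key). The same key always yields the same partition.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 6; // assumed partition count

        // Keyed by productId: every rating of p42 lands in the same partition,
        // so a single consumer sees all ratings of that product in order.
        System.out.println(partitionFor("p42", partitions) == partitionFor("p42", partitions)); // prints true

        // Keyed by userId: ratings of p42 by different users may be spread
        // over different partitions.
        System.out.println(partitionFor("u1", partitions) + " vs " + partitionFor("u2", partitions));
    }
}
```

The sketch only shows the mechanics: whichever id is used as the key is the one whose events stay together and ordered within a single topic.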

I don’t know if SO is the preferred forum for this kind of question… I was just wondering what you (would) recommend as best practise in this particular use case. Feel free to point out other considerations I missed in the previous questions. Thanks for your feedback.

milendyankov · November 12, 2020, 3:31pm

This question was first posted on StackOverflow and I think @Steven_van_Beelen addressed all the questions in his answer there. I’m linking the answer for reference, but as this is a somewhat opinion-based question it’s indeed out of scope for StackOverflow and I expect it to be closed there. It’s perfectly fine to have the discussion continue here, though.

I wanted to add to the first question

Is there any added value to use Axon (or similar solutions) if there is no real state to be maintained or altered within the aggregate?

One may argue that using an Aggregate (or using DDD in general) in this particular simple use case is only overcomplicating the matter. Assuming however it is part of a more complex system, the way I see it is there are at least two “states” to be maintained

the list of ratings any given product has (and how they have changed over time)
the list of ratings any given user has given to products over time

Depending on where you draw the boundaries, what entities (in DDD sense) you have and what decisions you have to make before you can change the state (for example how you determine if given user can indeed provide given rating to given product?, how many times?, in what time interval?, …) there are different options:

ratings can be an entity in a product aggregate
ratings can be an entity in a user aggregate
ratings can be its own aggregate referring to other aggregates
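
As a rough illustration of the first option, here is a plain-Java sketch of a product aggregate that decides whether a rating is allowed before emitting an event. The "one rating per user" rule, the 1 to 5 star range, and all class names are assumptions made for the example. No Axon annotations are used.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical event emitted when a rating is accepted.
final class ProductRated {
    final String productId;
    final String userId;
    final int stars;

    ProductRated(String productId, String userId, int stars) {
        this.productId = productId;
        this.userId = userId;
        this.stars = stars;
    }
}

final class ProductAggregate {
    private final String productId;
    // Only the state needed for decisions is kept: who has already rated.
    private final Set<String> raters = new HashSet<>();

    // Rebuilds the decision state from the aggregate's past events.
    ProductAggregate(String productId, List<ProductRated> history) {
        this.productId = productId;
        for (ProductRated event : history) {
            apply(event);
        }
    }

    // Command handling: validate against current state, then emit an event.
    ProductRated rate(String userId, int stars) {
        if (stars < 1 || stars > 5) {
            throw new IllegalArgumentException("stars must be between 1 and 5");
        }
        if (raters.contains(userId)) {
            throw new IllegalStateException("user already rated this product");
        }
        ProductRated event = new ProductRated(productId, userId, stars);
        apply(event);
        return event;
    }

    private void apply(ProductRated event) {
        raters.add(event.userId);
    }
}
```

If no such rule exists, the aggregate has nothing to decide, which is the situation the original question describes.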

So IMHO in a wider system perspective you may see a lot of added value from designing it according to DDD principles. Trying to apply DDD concepts for the sole purpose of using convenient technical concepts (annotations, event handlers, …) can give you the exact opposite results.

KDW · November 15, 2020, 11:32am

Assuming however it is part of a more complex system, the way I see it is there are at least two “states” to be maintained

It is indeed part of a more complex system. I decided to separate the rating system from the back-office (used for product management, etc…) and build it as a CQRS/ES microservice for two reasons:

The back-office is an existing monolith which will be rather difficult (read: very expensive) to migrate to a microservice based architecture. At present there is little reason to redesign it as well…
I expect the system load on the rating subsystem to be much higher compared to the back-office. So a CQRS/ES microservice design allows me to be more flexible and scalable.

So IMHO in a wider system perspective you may see a lot of added value from designing it according to DDD principles.

You mention some very interesting points. Until now I focussed on a RatingAggregate instead of product and/or user aggregates in the command service. Mainly because the command service’s only task is (currently) storing ratings as events in the event store. Hence the question whether or not Kafka could or might be a better approach for this purpose.

The real “intelligence” lies in the query service. This is where the rating events are translated into something useful for the end-user (e.g. product ratings). This “state” is then stored in memory in a service specific data store as a ProductRatingEntity. Every new event needs to perform one or more updates to the state.
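
A minimal sketch of such a query-side projection in plain Java follows. The fields of `ProductRatingEntity` (a count and a sum) and the `RatingEvent` class are assumptions; the real entity may hold more.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical rating event as seen by the query service.
final class RatingEvent {
    final String productId;
    final int stars;

    RatingEvent(String productId, int stars) {
        this.productId = productId;
        this.stars = stars;
    }
}

// Assumed shape of the ProductRatingEntity: enough to derive an average.
final class ProductRatingEntity {
    int count;
    long sum;

    double average() {
        return count == 0 ? 0.0 : (double) sum / count;
    }
}

final class ProductRatingProjection {
    private final Map<String, ProductRatingEntity> byProduct = new HashMap<>();

    // Called once per event, for live events and during a replay alike.
    void on(RatingEvent event) {
        ProductRatingEntity entity =
                byProduct.computeIfAbsent(event.productId, id -> new ProductRatingEntity());
        entity.count++;
        entity.sum += event.stars;
    }

    // Returns 0.0 for products that have no ratings yet.
    double averageFor(String productId) {
        ProductRatingEntity entity = byProduct.get(productId);
        return entity == null ? 0.0 : entity.average();
    }

    public static void main(String[] args) {
        ProductRatingProjection projection = new ProductRatingProjection();
        projection.on(new RatingEvent("p1", 5));
        projection.on(new RatingEvent("p1", 3));
        System.out.println(projection.averageFor("p1")); // prints 4.0
    }
}
```

Because the state is derived purely from events, changing the processing logic means starting from an empty projection and feeding it the full event stream again.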

Design Question #1
Is it considered bad DDD design to implement business logic in a query service? Or should all business logic de-facto reside in command services, even if it is only used for representational (query) purposes?

Using Axon, it becomes very easy to replay all rating events and do some magic with them using EventHandlers in dedicated classes. One aspect isn’t quite clear for me at this point. What is the most convenient way to do (complex) event processing logic using this approach?
One way is to let Axon send all events to Kafka as well, do some stream processing there and send the results to a Kafka topic. Kafka Connect could then be used to store the processed results in a dedicated data store for the query service. This might seem a valid approach, but I see some drawbacks/points of attention

Who should we consider the source of truth?
What about replaying events originating from the source of truth when business logic to process ratings changes?
A lot of overkill in additional services (Zookeeper, Kafka, Kafka Connect) to be set up, configured and maintained for a relatively simple use case.

Design Question #2
How should a “Kafka Streams”-like event processing logic be used using Axon event handler classes?

Design Question #3
Assuming you would integrate all business logic in the command service. How does one implement business logic requiring (complex) event processing in the command service?

milendyankov · November 17, 2020, 2:51pm

I hope more experienced people will provide more details.
I’m not an expert in Kafka so I cannot answer the Kafka-related questions.

But what strikes me in questions like this is the term “business logic”. It often tends to be ambiguous. For example, grouping and counting ratings by the geo location of the user can be seen as business logic. There may be a good business reason to know how different regions rate different products. It may require a dedicated storage implementation compatible with other business needs and/or use specific querying techniques. It may be subject to configurable business policies (which user belongs to which region, are overlapping regions allowed, …). From all those perspectives it is indeed business logic. However, at the end of the day it’s merely a different view of the same data. The way the data is analysed or presented does not alter the state of the system.
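
The regional example could be sketched like this, as a pure function over the events. It assumes, for simplicity, that the region is already present on each event; in practice it would come from the configurable policy mentioned above. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical rating event that already carries the user's region.
final class RegionalRating {
    final String productId;
    final String region;
    final int stars;

    RegionalRating(String productId, String region, int stars) {
        this.productId = productId;
        this.region = region;
        this.stars = stars;
    }
}

final class RegionView {
    // Counts ratings per region. It only reads the events and produces a new
    // map, so it cannot alter the state of the system.
    static Map<String, Long> countByRegion(List<RegionalRating> events) {
        return events.stream()
                .collect(Collectors.groupingBy(e -> e.region, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<RegionalRating> events = Arrays.asList(
                new RegionalRating("p1", "EU", 5),
                new RegionalRating("p2", "EU", 3),
                new RegionalRating("p1", "US", 4));
        System.out.println(countByRegion(events)); // prints {EU=2, US=1}
    }
}
```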

I personally don’t see a good reason to move such logic to the command side. The command side is where you ask the system to do something. It may reject the request (when it detects invalid input for example) or perform one or more actions. The outcome of those actions is events changing some state. If the command is merely “give me the data” and no modifications are made to the system, then it’s not a command (in CQRS sense) - it’s a query. Both commands and queries can vary from simple passthrough service to complex components having complicated business logic.

That said, one needs to keep in mind that the query side is typically eventually consistent. Therefore it’s generally bad practice to have the command side rely on the query side while it makes decisions that result in state changes. That has nothing to do with the placement of the business logic per se. It’s about ensuring that the part of the business logic that makes decisions and changes does so based on reliable data.
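
A small simulation of that hazard, as a sketch only: the query-side view is updated by a separate `catchUp()` step, so a decision based on it can accept a duplicate that the command side's own state would reject. The class, the method names and the "no duplicate rating" rule are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StaleReadDemo {
    // Command-side state, updated synchronously when a rating is accepted.
    private final Set<String> ownState = new HashSet<>();
    // Query-side view, updated only when catchUp() runs.
    private final Set<String> view = new HashSet<>();
    private final List<String> pending = new ArrayList<>();

    private static String key(String userId, String productId) {
        return userId + "|" + productId;
    }

    // Risky: decides from the eventually consistent view.
    boolean acceptUsingView(String userId, String productId) {
        String k = key(userId, productId);
        if (view.contains(k)) {
            return false;
        }
        record(k);
        return true;
    }

    // Safer: decides from state the command side maintains itself.
    boolean acceptUsingOwnState(String userId, String productId) {
        String k = key(userId, productId);
        if (ownState.contains(k)) {
            return false;
        }
        record(k);
        return true;
    }

    private void record(String k) {
        ownState.add(k);
        pending.add(k);
    }

    // Simulates the query-side event processor catching up with the events.
    void catchUp() {
        view.addAll(pending);
        pending.clear();
    }

    public static void main(String[] args) {
        StaleReadDemo demo = new StaleReadDemo();
        System.out.println(demo.acceptUsingView("u1", "p1"));     // prints true
        System.out.println(demo.acceptUsingView("u1", "p1"));     // prints true (duplicate slips through)
        System.out.println(demo.acceptUsingOwnState("u1", "p1")); // prints false
    }
}
```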

Finally, given that you have an existing system that already has some “source of truth”, it highly depends on how you would want to evolve it. I personally don’t think Kafka itself can reliably play the role of a source of truth keeper. An event store (DB or AxonServer based) could, but that would require changes that (to use your words) will be rather difficult. It is also possible to have the existing system as the source of truth and give up on event sourcing in favour of state-stored aggregates. There are many options on the table. I’m afraid only a person who is familiar with the actual system and its constraints and evolution plans is in a position to make that call.