Tracking Event Processor for Event Replay Reading from Event Store.

Ajinkya_Virkar · January 18, 2018, 4:37pm

Hello,

I have a successfull implementation of the events replay using the tracking event processor.
The PROBLEM is, it seems like the tracking event processor is reading events directly from the events store rather than the event bus.

So, any time the command service pushes a new event (To Event Store and Event Bus) the event handler on query side is invoked and it consumes the event from the store rather then the bus.

My Application Config

App: Spring Boot
Event Bus: Rabbit MQ
Event Store: MongoDB

I want to ask is this the correct design and behavior for a tracking processor?

Simon_Zambrovski · January 18, 2018, 5:06pm

Hi Ajinkya,

in Axon 3 the EventBus is the EventStore. You should not have two of them but only one. (If you look on the code, you will see that the EventStore interface extends EventBus interface.
So usually you will only have one of it - an EventStore is an “advanced” EventBus capable to store and load events.

Does it help?

Kind regards,

Simon

Steven_van_Beelen · January 19, 2018, 9:40am

Hi Ajinkya, Simon,

What Simon says is correct, the EventStore is just a more specified implementation of the EventBus.

Added, you’re saying you’re using a TrackingEventProcessor to be able to replay (nice job on getting that to work!).

That a TrackingEventProcessor is ‘pulling’ events from the EventStore is however intended behavior.

A TrackingEventProcessor keeps track of where it is in handling the event it self and retrieving them, rather than offloading that concern to the EventBus like with a SubscribingEventProcessor.
This makes it so that you can tell a TrackingEventProcessor to for example start a new (a replay), as it is an autonomous process.
If it would have been tied to the EventBus directly (a push mechanism), that would not be easily achieved.

So, this is not a ‘problem’, but by design.

Hope this clarifies it.

Cheers,

Steven

Ajinkya_Virkar · January 19, 2018, 8:56pm

Hello Simon / Steven,

Thanks for the answers, I have some questions and concerns

Questions:

So is the tracking processor polling the MongoDB (Event Store) to keep a track of all new events that are posted to the store.
My understanding is that a Subscribing Event Processor is based on RabbitMQ (or Any queue mechanism) where as Tracking Processor is based on MongoDB (or any Event Store Mechanism). Is that correct?
Why do you think this approach is better than using a queue for message exchanges between command and query. And just use the event store for replays

Concerns:

The following are some concerns about the architecture approach you just explained.

Using MongoDB (Event Store) as a method of sending events from Command to Query seems very tightly coupled.

An application using RabbitMQ (Event Bus) and MongoDB (Event Store) enables me to implement an async event based system. Where the command and query are decoupled and can use RabbitMQ (Event Bus) capabiliites such as guarenteed delivery, deadletter queue and etc. This enables the command service to publish the message to an exchange and spread it over to N consumers listening to diffrerent queues.

My ideal behavior would be to have the tracking processor read events from RabbitMQ (Event Bus) in a Business As Usual (BAU) scenario and then use MongoDB (Event Store) only to replay events.

Steven_van_Beelen · January 22, 2018, 9:09am

Hi Ajinkya,

Answers:

A TrackingEventProcessor does a combination of peeking the domain_event_entry table for the last events it’s global index and will match that with the index contained in the token column of the token_entry table. If those tables are in JPA, JDBC, MongoDB or any custom database, is not particularly relevant to the approach.
A SubscribingEventProcessor is never forced through a queuing mechanism. That’s all up to the user of the framework if you want to use a queue or not. If you start monolith-first (which we typically advocate because of lesser complexity) with nothing specified around your Event Handling Components / Event Listeners, Axon will by default put those beans in SubscribingEventProcessor which get subscribed directly to the EventBus. If you plan to use queuing for you SubscribingEventProcessors, you (as you’re probably already doing) can use the axon-amqp dependency to use the MessageSource and Publisher set up provided. The answer to the TrackingEventProcessor is also more intricate than you’re depicting right now. A TrackingEventProcessor requires a StreamableMesageSource. The EventBus implementation is a specification of a StreamableMesageSource, so if you’re using the defaults, a TrackingEventProcessor will receive the EventBus as it’s source of events (so in your example, your MongoDB EventStore. You can however implement your own StreamableMesageSource` if you’d like, which just might use the queues you’ve mentioned earlier.
I don’t typically think it’s better, but I’m making the assumption that you’re still in the development stages of your application. As I gave a short sneak peak on in answer 2, we typically tell people to start simple, thus probably containing all your services in the same application, and break up when necessary. Taking that stance, it’s not necessary to think of queues yet at all, thus letting you focus more on the business value of you application rather then the infrastructure bits you’re focusing on right now.

Concern:

So, you’re concern is that using MongoDB as the mode of sending/receiving events is more tightly coupled then using a queue for sending/receiving. Why is a database making your more tightly coupled then a queue? Your events are the coupling between your applications in this sense, not how they get from A to B. So it doesn’t make you more coupled if your use the database compared to a queue, but it does give you the benefit of a more secure location of event storage than a queue does. You describe the queues benefits as guaranteed delivery/deadletter queue/etc, but I’d argue your store is a far more reliable source of your events, as it is its sole purpose to contain all the events which have happened.

I hope I explained my self correctly here, please feel free to further the discussion on your concerns as I hope I can help you with them!

Cheers,

Steven

Ajinkya_Virkar · January 24, 2018, 4:39am

Hello Steven,

Thanks for a detailed explanation. So from your comment can I make the following assumptions.

For a CQRS / ES application using Axon 3+ I can have message exchange via a Event Store (MongoDB) and that is as per the design. It is not mandatory to use Event Bus (Rabbit MQ) for message exchange
A tracking processor can be used for regular event exchange as well as event replay within the same query component. So the event exchange will happen via Event Store.
From your answer 2 and 3, Can you provide some pointers, examples. If I have to implement my custom behavior (Stremable Message Source) event bus (Rabbit) for regular message exchange and event store (Mongo) in case of replays.

Question.

Will there be a performance impact if there are multiple query services polling the same Event Store for events?
From 2 and 3 if I implemnent my own stremable message source, is it possible to have a combo (event bus for regular event exchange and event store for replays ). Or I can only use either of it.
Extending to 2 If I implement my own stremable message source to use rabbit instead of mongo where will the events be stored?

Steven_van_Beelen · January 24, 2018, 12:11pm

Hi Ajinkya,

No problem, I’m here to help!

So, let’s proceed with your follow up assumptions:

Yep, that’s true, it’s fine to leverage the store in that form. I’m currently on a project where we’re doing exactly that. That, indeed, does not force somebody to use RabbitMQ to exchange message, although your are definitely allowed to keep using it if it makes more sense in your architecture. For this matter, it all depends on what your needs are.
A TrackingEventProcessor is designed to work as a solo unit in regard to events on the store. The way it keeps track of things is with the TrackingToken and through that token you’ve got the option to replay (by removing it for example). Event exchanging will indeed happen through the store, as the TrackingEventProcessor pulls from the store itself.
Don’t think it’s the cleanest approach (as that’s just using a TrackingEventProcessor), but if you would implement your own StreamableMessageSource which is fed by a Queue, you’d probably have to introduce another store on the receiving end. If you don’t do something like that, you’ll not be able to replay easily enough, as a replay requires you to read all the events from the store. Or maybe you could have some way of telling the sending party that you need all the events again through an API call…but still, I think this does not sound like a clean approach at all.

Now, for you question:

More connections always means more I/O, so some form of impact might be present. However events are not pulled row per row, but batches of events are retrieved. This should minimize the amount of queries on the store. If you run into difficulties with the default settings for querying, I think checking how to speed up on your chosen storage party (MongoDB in your case) makes sense. If you have really high performance requirements, you could think about leveraging the AxonDB, which should kill all your potential worries for ever on this part.
Not sure I fully comprehend what you mean here. In Axon, the technical implementation of the EventStore is just a more specified version of an EventBus. So using a store means you already have both, but it’s just a single unit rather than two separate things.
Like I shared on my response of your third assumption, I think you’ll have to introduce a second store on your receiving end to be able to allow replays solely based in that application. And, like I also shared, I don’t think this is an ideal situation to be in.

The way you’re focusing your questions/assumptions lets me believe you might be, or eventually end up in, a an application environment with a very high amount of event throughput. Like I shared in question.1, I think the AxonDB might be a nice thing to look at.

Hope this helps!

Cheers,

Steven

Ajinkya_Virkar · January 24, 2018, 2:22pm

Hello Steve,

All that was helpful information for me to proceed with the application. It is a microservices based application designed to scale every component individually and keep them very loosely coupled.

Once again it was a great help.

Steven_van_Beelen · January 24, 2018, 3:48pm

Hi Ajinkya,

Happy to have been of help to you!

If you have any follow up question, please do not hesitate to post them in the user group.

Cheers,
Steven

Ajinkya_Virkar · January 24, 2018, 8:39pm

Hello Steven,

So this is going to be our final architecture. Can you please provide your suggestions and comments.

Steven_van_Beelen · January 30, 2018, 11:03am

Hi Ajinkya,

Sorry for the late reply, this mail somehow fell through the cracks of my mind.

To give an answer to your question, I think that diagram is fine!

I’ll give summarization of what I think this diagram shows.
So you decide to separate certain query models in application ‘View 1’ and application ‘View 2’.

Both of these applications maintain different query models I presume?

Because of that, they’re also in charge of their own token store, which contains the progress of the TrackingEventProcessors for that specific ‘View-Application’.

Next to your query model applications, you’ve got a Command Model applications (likely containing your aggregates). You’ve modeled the EventStore which stores the events from those aggregates as being part of that Command Model application.

Is my summarization of it all correct?

If so, again, I think this is a nice starting point for you guys.

Hope this helps!

Cheers,

Steven

Ajinkya_Virkar · February 1, 2018, 8:18pm

Hello Steven,

Thanks for the summarization, Yes you are completely right about everything. Thanks for that.