Replaying Events from the Event Store

Carlus_Henry · November 16, 2011, 1:43am

Hey everyone,

I have heard that one of the major benefits of using CQRS is the ability to replay your events from the event store to the event bus. With this ability, you can drop all of your view caches and have them repopulated from the replayed events. Does anyone have any practical experience with doing this? Is there anything in Axon that could assist in this task? How would one go about doing something like this?

Along the same lines, if I was interested in adding a new event handler that would populate a new view cache, would I have to

1.) Drop all of the view caches
2.) Deploy the new handler
3.) Replay all of the events to update the new view cache as well as the pre-existing ones that were dropped…

…or…

Is there a way to deploy the new event handler and have it populate its view cache without dropping the other view caches?

Thanks
Carlus

Chad_Wilson · November 16, 2011, 1:52am

A quick and dirty solution is to write your view cache mechanism to
request the events from the event store (though I tend to use the
Aggregate for this since it's already baked in) when a cache entry
does not exist. An example would be you have XML documents on a file
server as your view cache. If the file isn't found, fall back to the
aggregate and see if it's there. If it is, generate the view cache
file, store, and return to caller.
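
A minimal Java sketch of this fall-back idea, assuming an in-memory map in place of the file server. `FallbackViewCache` and the `rebuildFromAggregate` function are hypothetical names chosen for illustration; the function stands for whatever loads the aggregate and renders a view from it.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Read-through view cache: on a miss, the view is rebuilt from the write side. */
final class FallbackViewCache<ID, V> {
    private final Map<ID, V> cache = new ConcurrentHashMap<>();
    // Hypothetical hook: loads the aggregate and renders a view, or returns empty if it does not exist.
    private final Function<ID, Optional<V>> rebuildFromAggregate;

    FallbackViewCache(Function<ID, Optional<V>> rebuildFromAggregate) {
        this.rebuildFromAggregate = rebuildFromAggregate;
    }

    Optional<V> find(ID id) {
        V cached = cache.get(id);
        if (cached != null) {
            return Optional.of(cached);
        }
        // Cache miss: fall back to the aggregate and store the result for next time.
        Optional<V> rebuilt = rebuildFromAggregate.apply(id);
        rebuilt.ifPresent(view -> cache.put(id, view));
        return rebuilt;
    }
}
```

With this shape the aggregate is only consulted on the first request for a given ID; later requests are served from the cache.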

I tend to view it not as the events getting pushed to the query layer,
but the query layer calling the events as a means of last resort. In
fact, there is no reason why you cannot use the event store /
aggregate as the query storage mechanism, CQRS just states that there
is a separate model, not necessarily a separate persistence mechanism.

Chad

Allard · November 16, 2011, 8:09am

Hi guys,

before responding to Chad’s solution, first a little thing about replaying.

There is no built-in method for replaying events yet. However, it’s very easy to build. One thing to take special care with is to never replay your events on the EventBus. You’re very likely to have handlers there that don’t support replaying, such as sagas. Replaying events to them would cause commands to be generated, changing your application’s state instead of rebuilding it.
In my design, a single Event Handler is responsible for updating one or more related tables. If I want to rebuild these tables, I clear them and replay all events from the event store on that single handler.
The JpaEventStore has a method “visitEvents”, which allows you to provide a callback that receives each event from the event store, ordered by their timestamp.
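
A rough sketch of this clear-and-replay approach in plain Java. `ReplaySource` and `RebuildableProjection` are hypothetical interfaces introduced here for illustration; `ReplaySource.visitEvents` only mimics the idea of a visitor callback and is not the actual JpaEventStore signature.

```java
import java.util.function.Consumer;

/** Hypothetical stand-in for an event store that can stream all stored events in order. */
interface ReplaySource {
    void visitEvents(Consumer<Object> visitor);
}

/** Hypothetical contract for a single handler that owns a set of related view tables. */
interface RebuildableProjection {
    void clear();               // drop the tables this handler owns
    void handle(Object event);  // the same logic that processes live events
}

final class ProjectionRebuilder {
    /** Clears one projection and feeds it every stored event directly, bypassing the event bus. */
    static void rebuild(ReplaySource store, RebuildableProjection projection) {
        projection.clear();
        store.visitEvents(projection::handle);
    }
}
```

Because the events go straight to the one handler, sagas and other handlers on the bus never see the replayed events.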

I have had a discussion with some colleagues about providing support for this in the framework. The idea is that you can implement the “Replayable” interface (and I’ll probably think of an annotation as well), which exposes a JMX operation that allows you to trigger a replay. It will be pluggable, so you can also have other management interfaces, such as a servlet.

Chad’s solution is a very different approach, comparable to a Pull method. In that case the query layer will regularly (either time-based or query-based) do a pull of any new events in the store. I don’t have any experience with this approach. I see too much advantage in the push method (decoupling, for one).

Cheers,

Allard

Michael_Schnell · November 16, 2011, 8:56pm

Hi!

Here is an example of how you can build something to replay events:
http://code.google.com/p/axon-auction-example/source/browse/#svn%2Fwww%2Ftmp

It uses reflection to find all queries and the events that a query
listens to.
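
The reflection step might look roughly like the following. This is an independent sketch, not code from the linked example: `@HandlesEvent` is a hypothetical marker annotation standing in for whatever annotation the handlers really use, and the scanner assumes each handler method takes the event as its single parameter.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical marker annotation, standing in for a framework's handler annotation. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface HandlesEvent {}

final class HandledEventScanner {
    /** Returns the event types accepted by the annotated single-argument methods of a class. */
    static Set<Class<?>> handledEventTypes(Class<?> handlerClass) {
        Set<Class<?>> types = new LinkedHashSet<>();
        for (Method method : handlerClass.getDeclaredMethods()) {
            if (method.isAnnotationPresent(HandlesEvent.class) && method.getParameterCount() == 1) {
                types.add(method.getParameterTypes()[0]);
            }
        }
        return types;
    }
}
```

A replay tool can use the resulting set to pass a query only the stored events it actually listens to.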

The naming is a bit outdated... In fact "QueryManager" is now named
"Denormalizer".

Cheers,
Michael

Chad_Wilson · November 16, 2011, 11:10pm

Actually my solution was just for the edge case where a query object
did not exist, a fall back to the aggregate if you will. The benefit
in using the aggregate to create a DTO is you don't need to worry
about the event bus, or replaying the events wrong. Typically I use
the aggregate to generate a DTO as the query object when I first start
creating the models, and then I work from there to build out the query
layer. Also, for light read scenarios, I've found the aggregate is
performant enough for regular use (YMMV).

Chad

Per_Wiklander · November 17, 2011, 6:27pm

Interesting! BTW, a little bit OT, but how is the Meta CQRS coming
along? It would be really nice to see something if you have anything
ready or even half ready.

Carlus_Henry · November 17, 2011, 6:50pm

Follow up question regarding replaying events…?

I have a brand new event handler in an existing system. I would like to bring that Event Handler up to a point where it has seen all of the events that have happened in the past, so that it has produced the most up-to-date information. Regardless of the approach to do this, I still want it to process newer events after it has been brought up to a good and known state.

I could send all of the events to this new event handler; however, it is entirely possible that new events would be sent there as well. Wouldn’t they get mixed up? How do I handle the situation so that I first load up all the historical events, then load in all of the new events (regardless of whether they were published to me before I had completed the historical events)?

Thoughts?
Carlus

Michael_Schnell · November 17, 2011, 6:57pm

Hi Per,

I wish I had more time... My current freelance job consumed most of my
time over the last few months.

At the moment I'm preparing a "plain DDD" Java model to start a
discussion about it.

I'll post a URL when the "DDD snapshot" is available.

Cheers,
Michael

Michael_Schnell · November 17, 2011, 7:01pm

Hi,

You'll need some kind of queue for new events arriving while you
rebuild the query.
After you have finished the "replay", you simply process all queued events.
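
One possible shape for such a queue, sketched in plain Java. `BufferingHandler` and its method names are hypothetical; the sketch assumes live events and the replay both reach the handler through this wrapper, and it leaves out any filtering of events that appear both in the store and in the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/** Wraps a handler so that live events are parked while a replay is in progress. */
final class BufferingHandler<E> {
    private final Consumer<E> delegate;
    private final Queue<E> pending = new ArrayDeque<>();
    private boolean replaying = true;

    BufferingHandler(Consumer<E> delegate) {
        this.delegate = delegate;
    }

    /** Called for every live event, for example from the method that receives events from the bus. */
    synchronized void onLiveEvent(E event) {
        if (replaying) {
            pending.add(event);      // park it until the replay is done
        } else {
            delegate.accept(event);  // normal operation
        }
    }

    /** Called by the replay for each historical event. */
    synchronized void onReplayedEvent(E event) {
        delegate.accept(event);
    }

    /** Drains the queue and switches to live mode while holding the lock. */
    synchronized void replayFinished() {
        E next;
        while ((next = pending.poll()) != null) {
            delegate.accept(next);
        }
        replaying = false;
    }
}
```

Since the drain and the switch to live mode happen under the same lock, an event that arrives during the drain waits briefly and is then handled directly, so the queue is only needed during the replay.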

Cheers,
Michael

Allard · November 17, 2011, 7:27pm

While you are replaying the events, you should remember the ID of the last event processed. Then you can discard any messages from the queue up to (and including) the message with that ID. That will prevent duplicate processing.
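
A small sketch of that discard step, under two assumptions that are not stated above: the queue holds events in the same order as the event store, and queueing started before the replay began reading. `ReplayDeduplicator` and the `idOf` function are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class ReplayDeduplicator {
    /** Returns the queued events that still need processing once the replay has finished. */
    static <E> List<E> remainingAfterReplay(List<E> queued,
                                            Function<E, String> idOf,
                                            String lastReplayedId) {
        for (int i = 0; i < queued.size(); i++) {
            if (idOf.apply(queued.get(i)).equals(lastReplayedId)) {
                // Everything up to and including this event was already replayed.
                return new ArrayList<>(queued.subList(i + 1, queued.size()));
            }
        }
        // The last replayed event is older than anything queued, so nothing is a duplicate.
        return new ArrayList<>(queued);
    }
}
```

The second branch matters when no new events arrived between the start of queueing and the end of the replay: the remembered ID is then absent from the queue and every queued event must be kept.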

Michael, I’ve been thinking about the Meta-CQRS lately. There seems to be a lot of demand for a way to generate dependency graphs based on code. I’ve played around with ASM (byte code analysis) and have the feeling that it should be doable to generate such a structure based on that.
I remember that you had some ideas about a DSL to describe these dependencies. Or am I confused with something else?

Something along the lines of
MyCommand -> MyAggregate -> {SomeEvent, SomeOtherEvent}

I thought it would be nice to match this with the Meta-CQRS ideas you had earlier. If you have any time to pick it up again, it would be interesting to have a chat about this.

Cheers,

Allard

Carlus_Henry · November 17, 2011, 7:29pm

Michael,

Thanks for the quick response. I think my confusion comes in since I am using the annotation @EventHandler on the methods that are processing events. This would mean that the newly created events could potentially be arriving while I am processing historical events. I guess I am curious about an elegant solution to handle this.

With what you suggested, I would allow those events to come in, and put them on some “temporary queue”, then after processing the historical events pull the messages from this “temporary queue”. What happens if while processing the “temporary queue”, more new events come in? The only solution that I can see is that the “temporary queue” just becomes the new queue to read from for eternity. So my annotated @EventHandler method now just moves messages to the temporary queue. I read from the temporary queue to do the actual work.

So now, you get to tell me I am off my rocker. I must be missing something, right?

Thanks
Carlus

Chad_Wilson · November 17, 2011, 7:33pm

Ahh ok then yes, you'd want to feed it all the events in the store.
So is this a situation where you have an existing application in
production, and you want to add another event handler to it? Can you
take production down for maintenance, or does the transition need to
be seamless? If you can stop or pause the flow of "new" events to the
new event handler, you can simply replay the events from the event
store to the new event handler directly. Once done, you can resume/
restart prod as normal and you won't have any missed events between.

The trickier scenario is how to do it hot (e.g. while everything is
still running). Assuming a clustered environment, you could deploy
the new code to one server, have the new event handler keep all the
"new" events in memory until the replay is finished, replay, then
resume. Once done, you could redeploy to the remaining nodes, without
the bootstrapping (e.g. 1 node gets bootstrap code, the others get the
regular code, bootstrap node is replaced with regular code after the
others are brought up).

Am I following you, and does my rambling make sense?

Chad

Carlus_Henry · November 17, 2011, 7:38pm

Chad,

Yes…that is exactly the scenario I was looking for. And you are hitting it right on the nose.

I understand the first solution, and I am interested in how people have tackled the “hot” deploy of a new EventHandler. It is definitely “easier” to just bring the whole thing down, deploy the new event handler, seed it with all of the events and then bring it back up again. Hot deploys would be the more interesting scenario.

Right now, this is all just brain candy, but I am interested to know how it works in production-like environments. If simpler is best (just stop the flow of events until the new event handler is brought up to date), that is fine. But if anyone out there has had to deal with a “hot” deploy like the one you are mentioning, I would love to know how you did it.

Thanks
Carlus

Michael_Schnell · November 17, 2011, 9:46pm

Hi Allard,

Actually I have a working prototype that is able to generate a
complete CQRS project structure - It's not really stable and the UI is
very limited, but it's made with the Axon Framework:

http://www.metacqrs.org/metacqrs.swf.html

The basic idea is to model the CQRS structure with CQRS - I think this
is far better than having a simple "DSL" (my first thought) as this
approach allows us to use all DDD/CQRS/ES advantages. Just think about
tracking the changes on your CQRS apps or the possibility to react to
events when something important in the model changes.

Currently I'm in a state of refactoring the prototype because I got
the feeling the meta model is not correct.

I'm going to publish the current meta model as simple "plain Java" DDD
model. This way we can discuss the DDD design itself before I start
refactoring the prototype.

Cheers,
Michael