We recently needed to run an event replay across a large percentage of the events in our event store and ran into a problem. The migration took a long time to run (since it needed to process a lot of events), which was expected. What we didn’t fully realize was that BackloggingIncomingMessageHandler keeps an in-memory queue of all the events published since the start of the replay, so a sufficiently large replay on a busy system will eventually cause the system to run out of memory.
My plan is to modify the code so that it discards queued events from the backlog once they pass a certain age. The class already has some notion of “events older than X will probably show up in the replay so don’t bother keeping them in memory” but it is only applied at the start of the replay; I don’t see a good reason why the same logic wouldn’t apply for the duration of the replay.
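Roughly what I have in mind, as a standalone sketch rather than a patch to the actual Axon class. The class name AgeBoundedBacklog, its methods, and the maxAge parameter are all made up for illustration, and it assumes events arrive in approximately timestamp order so that the oldest entries sit at the head of the queue:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical backlog that drops queued events once they exceed a maximum age.
 * Not part of Axon; it only illustrates the eviction idea.
 */
public class AgeBoundedBacklog<T> {

    /** Pairs a queued event with the timestamp it was published at. */
    private static final class Entry<E> {
        final Instant timestamp;
        final E event;

        Entry(Instant timestamp, E event) {
            this.timestamp = timestamp;
            this.event = event;
        }
    }

    private final Deque<Entry<T>> queue = new ArrayDeque<>();
    private final Duration maxAge;

    public AgeBoundedBacklog(Duration maxAge) {
        this.maxAge = maxAge;
    }

    /**
     * Queues an event, first evicting anything older than (now - maxAge).
     * An event that is already too old is not queued at all, on the
     * assumption that the replay itself will still deliver it.
     */
    public synchronized void add(T event, Instant timestamp, Instant now) {
        Instant cutoff = now.minus(maxAge);
        evictOlderThan(cutoff);
        if (!timestamp.isBefore(cutoff)) {
            queue.addLast(new Entry<>(timestamp, event));
        }
    }

    /** Removes entries from the head of the queue while they predate the cutoff. */
    public synchronized int evictOlderThan(Instant cutoff) {
        int removed = 0;
        while (!queue.isEmpty() && queue.peekFirst().timestamp.isBefore(cutoff)) {
            queue.pollFirst();
            removed++;
        }
        return removed;
    }

    /** Returns the remaining events in arrival order and empties the backlog. */
    public synchronized List<T> drain() {
        List<T> events = new ArrayList<>(queue.size());
        for (Entry<T> entry : queue) {
            events.add(entry.event);
        }
        queue.clear();
        return events;
    }

    public synchronized int size() {
        return queue.size();
    }
}
```

The current time is passed in explicitly to keep the sketch easy to test. The whole idea rests on the assumption that anything evicted is still ahead of the replay in the event store, which is the part I am least sure about.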
Any gotchas I should be aware of with that approach?
Obviously Axon 3 doesn’t have this problem at all, but we’re not there yet.
-Steve