Recovering from losing saga state database

Hi, we are using Axon with an event store.

We have sagas that process events and issue commands, which in turn publish further events (and/or perform actions such as sending an email). Our issue is the idempotency of the commands issued by the saga: if for some reason we lose the database that stores saga state, how can we recover from this?

We cannot simply replay all events to rebuild the database, because the further events that have already been published would be published a second time (and emails etc. re-sent).

If we have a backup of the DB, it is extremely likely to be out of date, which causes the same issue as above for events that were issued after the backup was taken but before the DB was lost.

Has anyone encountered this scenario and, if so, how did you solve the problem?

Regards

Andrew

Hi Andrew,

I’m a newbie myself in CQRS/Axon, but I suppose there is no out-of-the-box way to solve this issue.
You have to implement your logic accordingly and somehow avoid unnecessary double work such as sending emails etc.
I think the framework should give some way to tell whether it is restoring aggregate state by replaying events or handling a new one.
Also, I would not expect such events to be sent to the saga again.

If you lose the current saga state, I think you have to live with it or have additional logic to restore it; I don’t think there is another way.
For example, if I have an e-commerce application and lose the saga state during Place Order, which involves the Cart, Payment and Order aggregates, the user will have to do Place Order again when the system comes back,
and your support team will have to deal with the previous attempt if, for example, the payment was made but the order wasn’t created.

Guys,
if you have more experience, please correct me.

Thanks,
Evgeny Kochnev

More thoughts: in the case I described before, we can generate a unique id and save it in the cart state, so that on place order you send it to your payment/order aggregate.
If it fails and you even lose your saga state, then when the user does place order again you send the same generated id, and a third-party payment system will usually support idempotency too.
So the payment is made again, or not if it has already happened, and the same approach works for order/post-order systems.
Whether it’s CQRS or REST, it would be the same approach to avoid a double charge or double order creation.
If you don’t lose saga state, it will be replayed automatically by restoring the saga.
If you still want the same behaviour after losing saga state, I think you will have to store an additional flag in your cart state saying that there was an attempt to place the order, and have a scheduler/job that retrieves such carts on application recovery and reinitializes sagas for them. But is it worth it?
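
To make the unique-id idea concrete, here is a minimal sketch in plain Java. All names (`Cart`, `PaymentService`, `placeOrderKey`) are made up for illustration and are not Axon API; the in-memory map only stands in for whatever the real payment provider uses to remember processed keys.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical classes for illustration only; none of this is Axon API.
class Cart {
    // Generated once and stored with the cart, so every "place order"
    // attempt for this cart carries the same key.
    final String placeOrderKey = UUID.randomUUID().toString();
}

class PaymentService {
    // Stand-in for the payment provider's record of already processed keys.
    private final Map<String, String> paymentsByKey = new HashMap<>();
    private int chargesMade = 0;

    // Charges at most once per key; a repeated call returns the first result.
    synchronized String pay(String idempotencyKey, long amountCents) {
        String existing = paymentsByKey.get(idempotencyKey);
        if (existing != null) {
            return existing; // already charged, do nothing
        }
        chargesMade++;
        String paymentId = "payment-" + chargesMade;
        paymentsByKey.put(idempotencyKey, paymentId);
        return paymentId;
    }

    synchronized int chargesMade() {
        return chargesMade;
    }
}
```

If the saga state is lost and the user places the order again, the second `pay` call with the same `placeOrderKey` returns the original payment instead of charging twice.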

Thanks,
Evgeny Kochnev

Hi Andrew,

First of all, losing a database is a major problem. I doubt any architectural style will come to the rescue in such a case.

However, when using event sourcing, the fact that you have the events does probably provide you with an option to recover your data. You could replay past events to your sagas, but then on a specially wired instance of the application that ignores all sent commands or invocations of external components. If your saga state didn’t depend on the replies of these external invocations, then you will be able to reconstruct its state.
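
A rough sketch of what such a "specially wired" instance could look like, in plain Java. The types here (`CommandSender`, `NoOpCommandSender`, `OrderSaga`, `SagaRecovery`) and the event names are hypothetical and only illustrate the idea; they are not Axon classes, and a real setup would swap the actual command gateway and external components in the application wiring.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interfaces for illustration; these are not Axon types.
interface CommandSender {
    void send(String command);
}

// Normal wiring: actually dispatches commands (here it just records them).
class RecordingCommandSender implements CommandSender {
    final List<String> sent = new ArrayList<>();

    public void send(String command) {
        sent.add(command);
    }
}

// Recovery wiring: swallows every command so nothing is repeated.
class NoOpCommandSender implements CommandSender {
    public void send(String command) {
        // intentionally ignored during replay
    }
}

class OrderSaga {
    private final CommandSender commands;
    private String status = "NEW";

    OrderSaga(CommandSender commands) {
        this.commands = commands;
    }

    // State changes depend only on the event, never on a command's reply.
    void on(String event) {
        switch (event) {
            case "OrderPlaced":
                status = "AWAITING_PAYMENT";
                commands.send("RequestPayment");
                break;
            case "PaymentReceived":
                status = "PAID";
                commands.send("SendConfirmationEmail");
                break;
            default:
                break; // events this saga does not care about
        }
    }

    String status() {
        return status;
    }
}

class SagaRecovery {
    // Rebuilds saga state from stored events without repeating side effects.
    static OrderSaga replay(List<String> storedEvents) {
        OrderSaga saga = new OrderSaga(new NoOpCommandSender());
        for (String event : storedEvents) {
            saga.on(event);
        }
        return saga;
    }
}
```

After the replay, the rebuilt state would be stored and the application restarted with its normal wiring. This only works as long as the saga's state transitions depend on the events alone, as noted above.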

If you lose this type of data (and I sure hope you’re not asking this question because you accidentally executed a TRUNCATE on production :wink: ), then you’re in the same ‘trouble’ as you would have been with any other type of architecture… Having an Event Store gets you a long way in recovering state, but not always all the way…

Cheers,

Allard