My use case is that a remote process (a JMS consumer) will receive an Axon-generated event (in reality a JMS message), which should trigger a Saga so that I can perform the business function robustly.
I’m planning on using AsyncSagaManager with JpaSagaRepository as I want it to be performant and fault tolerant. We cannot afford to lose messages, and the business function should be robust (sagas shouldn’t disappear).
A couple of questions related to failure scenarios:
- Looking at the AsyncAnnotatedSagaManager:handle implementation, it appears that we can lose a Saga (i.e. it is never persisted to the DB).
a. We have set the JMS consumer acknowledgement mode to transacted (or client) so that we control when an ack is sent back for a message.
b. The container event listener calls the above method; a new saga is created and put in the disruptor queue for async processing.
c. An ack is sent back because the event listener returned (the default behavior of transacted mode).
d. The machine crashes, and the saga (still in the queue) is lost since it wasn’t persisted.
e. Also, it appears that the Saga is persisted only after the first event (@StartSaga) is handled, which seems wrong (I would expect it to be persisted before the event handler is invoked).
How can the above scenario be made more resilient?
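To make the window in steps b to d concrete, here is a minimal plain-Java sketch of what I am worried about. It is not Axon or JMS code: `AckOrderingSketch`, `ackOnReturn`, `ackAfterPersist` and the in-memory queue are hypothetical stand-ins for the listener, the disruptor queue and the saga store, and the sketch assumes the crash happens after the ack but before the queued work has been persisted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Plain-Java simulation only; none of these names are Axon or JMS classes.
class AckOrderingSketch {

    /** Messages that were acknowledged to the broker but never reached the store. */
    static List<String> lostMessages(List<String> acked, List<String> persisted) {
        List<String> lost = new ArrayList<>(acked);
        lost.removeAll(persisted);
        return lost;
    }

    /**
     * Ack-on-return: the listener hands each message to an in-memory queue and
     * acknowledges immediately. A worker persists only persistedBeforeCrash of
     * them before the simulated crash discards whatever is still queued.
     */
    static List<String> ackOnReturn(List<String> messages, int persistedBeforeCrash) {
        Queue<String> inMemoryQueue = new ArrayDeque<>();
        List<String> acked = new ArrayList<>();
        List<String> persisted = new ArrayList<>();
        for (String m : messages) {
            inMemoryQueue.add(m); // step b: queued for async processing
            acked.add(m);         // step c: ack sent because the listener returned
        }
        for (int i = 0; i < persistedBeforeCrash && !inMemoryQueue.isEmpty(); i++) {
            persisted.add(inMemoryQueue.poll()); // worker got this far before the crash
        }
        inMemoryQueue.clear();    // step d: crash, queue contents are gone
        return lostMessages(acked, persisted);
    }

    /**
     * Ack-after-persist: a message is acknowledged only once it is in the store,
     * so anything not yet persisted at crash time stays unacked and the broker
     * can redeliver it.
     */
    static List<String> ackAfterPersist(List<String> messages, int persistedBeforeCrash) {
        List<String> acked = new ArrayList<>();
        List<String> persisted = new ArrayList<>();
        for (int i = 0; i < messages.size() && i < persistedBeforeCrash; i++) {
            persisted.add(messages.get(i)); // durable write first
            acked.add(messages.get(i));     // ack second
        }
        return lostMessages(acked, persisted);
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList("m1", "m2", "m3");
        System.out.println("ack-on-return lost:     " + ackOnReturn(messages, 1));
        System.out.println("ack-after-persist lost: " + ackAfterPersist(messages, 1));
    }
}
```

With three messages and a crash after one has been persisted, the first variant reports m2 and m3 as acknowledged but never stored, while the second reports nothing lost because the unpersisted messages were never acknowledged. What I am really asking is whether the async saga manager can be made to behave like the second variant.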
- It appears that the Saga is persisted only after an event is handled. Is there a way to persist the saga’s state incrementally within a single event handler?
- If the machine crashes while sagas are running, does the saga manager restart those sagas? If not, how is the application supposed to handle this?