Saga design

Hi all.

I’m back to my PoC, which by now looks quite OK. The “reading” (and “processing”) part is pretty much done using AggregateRoot and Event Sourcing. Now I’m focusing on the “writing” part, where I plan to use Sagas.

Client <----ES+AR-----> Axon Application <----Saga----->Server (DB/WS/Rest,…)

I did a first draft of a very simple Saga; basically it only sends data to a REST service and handles success/failure/timeout. Now I’m trying to “complicate” this example by invoking several services, tracking all their successes/failures/timeouts and coordinating the final actions (nothing if everything is OK, a bunch of Compensation Actions if any of them fails).

For this I have to keep some kind of state inside the Saga, know how to determine success/failure/timeout, and know how and when to invoke the compensation actions.

I have looked at all the examples I could find, but they are very simple and practically only use timeouts or other scheduled events. I see nothing about coordination, and even less about Compensation Actions, which are a fundamental part of Sagas as described in the Garcia-Molina/Salem paper: “(…) each saga transaction T1 should be provided with a compensation transaction C1.”
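
To make the quoted rule concrete, here is a minimal plain-Java sketch of it. It is not Axon API: `SagaStep` and `CompensatingRunner` are hypothetical names, and a thrown RuntimeException stands in for whatever signals a failed transaction. The steps T1..Tn run in order; if one fails, the compensations of the steps that already completed run in reverse order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** One saga step: a transaction T_i paired with its compensation C_i (hypothetical type). */
final class SagaStep {
    final String name;
    final Runnable transaction;   // T_i
    final Runnable compensation;  // C_i

    SagaStep(String name, Runnable transaction, Runnable compensation) {
        this.name = name;
        this.transaction = transaction;
        this.compensation = compensation;
    }
}

final class CompensatingRunner {
    /**
     * Runs the steps in order. If a step throws, compensates every step that
     * already completed, most recent first, and returns false.
     * The failing step itself is assumed to have had no effect.
     */
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.transaction.run();
                completed.push(step);
            } catch (RuntimeException failure) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

With three steps where T3 throws, the order of effects would be T1, T2, C2, C1. A real saga would also have to decide what to do when a compensation itself fails, which this sketch ignores.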

Moreover, browsing through the Axon code I don’t see any specific support for these concerns: success/failure (it does have support for timeouts), coordination, Compensation Actions. In my opinion (which again is that of someone with little experience in DDD et al.), at least Compensation Actions should be explicitly defined by the framework, maybe even with a “default” handling mechanism.

So are there any examples of “complicated” Sagas, or best practices for them in Axon?

Thanks for all.

After reading some more, including posts on the DDD/CQRS group (a good group, but hard for newbies), I’m trying to wrap my head around how to model an AbstractSaga or SuperSaga. Such a Saga would have the following responsibilities:

  • listen to a StartSaga event

  • send commands

  • for each command
    — listen to timeout events
    — listen to success events
    — listen to failure events
    — handle “Verification” Actions
    — handle Compensation Actions

  • coordinate all commands and end with a success/failure for the whole Saga.

It looks very verbose to me to have to create 5 events/actions for every single command the Saga sends, but I don’t see any other way to coordinate all the commands.

Let me try to clarify with a common example. My Saga will send 4 commands, all at the same time:

C1- commit local data
C2- invoke external WS (book a car)
C3- invoke external REST (book a hotel)
C4- invoke external REST (book a flight)

I do not have any control over the external services; I only know how to invoke them and receive responses, if there are any.
For C1 I do have control and I know it will throw a success/error event.

So I should have:

C1: is easy, it can be done in a pure ACID style so I just send and wait for it.

  • schedule a timeout event T1
  • listen to Success event S1
  • listen to Error event E1
  • send the command
  • call the coordinator COO

C2:

  • schedule a timeout event T2
  • listen to Success event S2
  • listen to Error event E2
  • send the command
  • call the coordinator COO

C3:

  • schedule a timeout event T3
  • listen to Success event S3
  • listen to Error event E3
  • set a Verification Action V3
  • send the command
  • call the coordinator COO

C4:

  • schedule a timeout event T4
  • listen to Success event S4
  • listen to Error event E4
  • set a Verification Action V4
  • send the command
  • call the coordinator COO

On T1 use a “retry 10 times” policy, throw E1 or S1 accordingly
On S1 set C1 = true
On E1 set C1 = false

On T2 throw E2
On S2 set C2 = true
On E2 set C2 = false

On T3 use a “retry another hotel” policy, after 10 times throw E3
On V3 throw E3 or S3 accordingly
On S3 set C3 = true
On E3 set C3 = false

On T4 throw E4
On V4 throw E4 or S4 accordingly
On S4 set C4 = true
On E4 set C4 = false

Each time COO is invoked it will check:

  • if C1 is “limbo” do nothing

  • if C2 is “limbo” do nothing

  • if C3 is “limbo” invoke V3

  • if C4 is “limbo” invoke V4

  • if all C1-4 are true end saga with Success

  • if all but C2 are true end saga with Success(“Sorry, please use public transportation”)

  • if any C1,2,4 is false:

      • if C1 is true invoke CA1

      • if C2 is true invoke CA2

      • if C3 is true invoke CA3

      • if C4 is true invoke CA4

      • end saga with Error
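
The COO rules above can be sketched as a small state table in plain Java. This is one possible reading, not Axon API, and every name (`BookingStep`, `StepStatus`, `Coordinator`) is hypothetical. Two assumptions are baked in because the list leaves them open: the coordinator takes no final decision while any step is still in limbo, and any failure other than "only C2 failed" (including a failure of C3 alone) leads to compensation.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

enum BookingStep { C1, C2, C3, C4 }

enum StepStatus { LIMBO, OK, FAILED }

final class Coordinator {
    private final Map<BookingStep, StepStatus> state = new EnumMap<>(BookingStep.class);

    Coordinator() {
        for (BookingStep step : BookingStep.values()) {
            state.put(step, StepStatus.LIMBO);
        }
    }

    /** Called from the S_i / E_i handlers: true for success, false for error. */
    void record(BookingStep step, boolean success) {
        state.put(step, success ? StepStatus.OK : StepStatus.FAILED);
    }

    /** Returns the names of the actions COO would trigger for the current state. */
    List<String> decide() {
        List<String> actions = new ArrayList<>();
        // Assumption: while anything is in limbo, only verifications are triggered.
        if (state.containsValue(StepStatus.LIMBO)) {
            if (state.get(BookingStep.C3) == StepStatus.LIMBO) actions.add("V3");
            if (state.get(BookingStep.C4) == StepStatus.LIMBO) actions.add("V4");
            return actions;
        }
        boolean onlyCarFailed = state.get(BookingStep.C2) == StepStatus.FAILED
                && state.get(BookingStep.C1) == StepStatus.OK
                && state.get(BookingStep.C3) == StepStatus.OK
                && state.get(BookingStep.C4) == StepStatus.OK;
        if (!state.containsValue(StepStatus.FAILED)) {
            actions.add("SUCCESS");
        } else if (onlyCarFailed) {
            actions.add("SUCCESS: Sorry, please use public transportation");
        } else {
            // Compensate every step that did succeed, then end with an error.
            for (BookingStep step : BookingStep.values()) {
                if (state.get(step) == StepStatus.OK) {
                    actions.add("CA" + (step.ordinal() + 1));
                }
            }
            actions.add("ERROR");
        }
        return actions;
    }
}
```

Returning action names instead of invoking them keeps the decision logic separate from the side effects, so the rules can be checked without any external service.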

Does this make sense at all? Note that I’m talking about a “general” or “abstract” Saga that has the boilerplate for this kind of scenario.

Thanks again.

Hi Antonio,

you don’t need an event for every single possible outcome. You can also have the saga send out a few requests, and use Java Futures to wait for the outcomes. In that case, you can handle all activity (error, timeout, etc.) in a single @SagaEventHandler method.
Typically, you’d use timeout events for things that last multiple minutes or even days, for example a payment that must be received within 30 days. Using them for short-lived timeouts like an external call would be a little overkill.
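
A minimal sketch of the Futures idea in plain Java, assuming Java 9+ for `CompletableFuture.orTimeout`. `BookingCalls` and `callAll` are hypothetical names, and each `Supplier` stands in for one external call; in a saga, something like this could be invoked from the single handler method that reacts to the starting event.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.stream.Collectors;

final class BookingCalls {
    /**
     * Starts every call in parallel, waits for all of them, and maps each one
     * to true (it returned a response) or false (it threw or timed out).
     * The result list is in the same order as the input list.
     */
    static List<Boolean> callAll(List<Supplier<String>> calls, long timeoutMillis) {
        List<CompletableFuture<Boolean>> outcomes = calls.stream()
                .map(call -> CompletableFuture.supplyAsync(call)
                        .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS) // Java 9+
                        .thenApply(response -> true)
                        .exceptionally(error -> false))
                .collect(Collectors.toList());
        // join() cannot throw here because exceptionally() already mapped failures to false.
        return outcomes.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}
```

Note that a timed-out call is only abandoned, not cancelled: the external service may still complete the booking, so a compensation or verification step may still be needed for it.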

Hope this helps you in your design decisions.
Cheers,

Allard

I don’t know about that idea of putting everything inside a single handler; would that mean having a single handler per Saga?! I don’t see that working well with coordination and orchestration of services (as per “saga is commonly used in discussions of CQRS to refer to a piece of code that coordinates and routes messages between bounded contexts and aggregates”). I’ll try to investigate further.

Note that my intention was to create a boilerplate abstract class that would do all coordination and orchestration in a “default” manner, so the coder will just have to write a bunch of handlers (probably annotated with some meta-stuff), that could obviously be extended/overridden at will. I actually think that would be a nice addition to Axon.

Regarding the timeout, my idea was to have a way for the client (UI) to be informed of the result synchronously or, if the response took more than 1 minute for instance, to return a “please check the result later” message, possibly with a link to the eventual result (returned by the “Verification” Actions).
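
The "answer synchronously if it is fast, otherwise point to a status link" idea could look roughly like this in plain Java. It is only a sketch: `ClientReply`, `await` and the status link are hypothetical, and the future is assumed to be completed by whatever component learns the saga's outcome.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ClientReply {
    /**
     * Waits up to maxWaitMillis for the saga result. Returns the result if it
     * arrives in time, a failure message if the saga failed, and otherwise a
     * "check later" message that carries the status link.
     */
    static String await(CompletableFuture<String> sagaResult, long maxWaitMillis, String statusLink) {
        try {
            return sagaResult.get(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException stillRunning) {
            return "Please check the result later: " + statusLink;
        } catch (ExecutionException failed) {
            return "Failed: " + failed.getCause().getMessage();
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return "Please check the result later: " + statusLink;
        }
    }
}
```

For the one-minute case described above, the call would be `ClientReply.await(result, 60_000, link)`.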

Hi,

you don’t put everything in a single handler, only everything that is triggered by the same event. Sagas react to things that happen in your domain by listening to events. Then they send commands to other components/systems. Some systems use command objects, while others use web services, method invocations, etc.

If you want to notify the UI of something, use an event handler just for that. It reacts to events and either sends information directly to the client, or stores it somewhere for the client to retrieve.

The idea of building an abstract, generic component sounds nice, but do realize that software that can do everything is really software that can do nothing. The power of DDD is in the fact that you build a model that focuses on a very specific problem at hand. Solving more than one problem will give you a complex and hard-to-maintain solution that is unlikely to solve the problem at all.

Cheers,

Allard

Hi Allard,

A command can trigger a saga, and one might be interested in the end result of the saga and not just the immediate result of a command. It does seem like one would need to listen to all possible end-events.

For rollbacks and cancellations, one would also need the saga to listen to failed cancellations and do retries. It’s very tedious to hand-code the retry logic. It’d be great if there were a “spring-retry” equivalent for retrying cancellation commands specific to the Saga’s logic. I know there’s a command retry that can be configured in the command gateway, but again, we might be interested in the end-event for the command.
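
For illustration, the hand-coded part could be reduced to a helper along these lines. This is plain Java, not spring-retry and not Axon API; `CancellationRetry` and `withRetries` are hypothetical names, and the sketch retries immediately without any backoff.

```java
import java.util.function.Supplier;

final class CancellationRetry {
    /**
     * Calls the action until it returns normally or maxAttempts is reached.
     * Rethrows the last failure if every attempt failed.
     */
    static <T> T withRetries(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException failure) {
                last = failure; // remember it and try again
            }
        }
        throw last;
    }
}
```

A blocking loop like this only fits calls that answer quickly. When the outcome of a cancellation arrives later as an event, the retry count would more likely live in the saga's state and be incremented in the handler for the failure event.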

Hi Sofia,

the “technically possible” things tend to get a bit mixed with the “conceptual things” here, but I’ll do my best to answer the question.

Conceptually, when sending a command, you want confirmation that your command was accepted. That doesn’t mean the end-result is already exactly as you expected it. For example: TransferMoneyCommand would confirm that the transfer was initiated, but doesn’t guarantee that the transfer has been executed. If I understand your concern correctly, you want to be able to be notified when the transaction is completed.

Conceptually, the best way to do this is via an event. But you say that you would have to listen to “all possible end-events”. A short answer would be: yes. But you make it sound like there are a lot of possible ways to end. In that case, let’s go back to the definition of an Event: “a notification that something important happened in the domain”. What’s important depends on many factors. In your case, the fact that it has completed is already important. In your event design, you should take that into account.

One way to do that is by defining a hierarchy of events. Have an abstract MoneyTransferCompletedEvent. Implementations could be MoneyTransferSuccessful and MoneyTransferFailed. Perhaps not the best example, but let’s take it as it is. Event handlers can then choose whether they listen to the abstract event, not caring how it finished, or listen to only a specific implementation, if they want to know more about the why. That way, a handler can choose what level of detail it is interested in, and will not explicitly need to listen to many different types.
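
A minimal sketch of such a hierarchy in plain Java. The event class names are adapted from the paragraph above; the fields and `TinyEventBus` are hypothetical stand-ins used only to show type-based dispatch and are not Axon API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

abstract class MoneyTransferCompletedEvent {
    final String transferId;

    MoneyTransferCompletedEvent(String transferId) {
        this.transferId = transferId;
    }
}

final class MoneyTransferSuccessful extends MoneyTransferCompletedEvent {
    MoneyTransferSuccessful(String transferId) {
        super(transferId);
    }
}

final class MoneyTransferFailed extends MoneyTransferCompletedEvent {
    final String reason;

    MoneyTransferFailed(String transferId, String reason) {
        super(transferId);
        this.reason = reason;
    }
}

/** Tiny dispatcher standing in for an event bus (hypothetical). */
final class TinyEventBus {
    private final List<Consumer<Object>> handlers = new ArrayList<>();

    /** Registers a handler that receives every event assignable to the given type. */
    <E> void subscribe(Class<E> type, Consumer<? super E> handler) {
        handlers.add(event -> {
            if (type.isInstance(event)) {
                handler.accept(type.cast(event));
            }
        });
    }

    void publish(Object event) {
        for (Consumer<Object> handler : handlers) {
            handler.accept(event);
        }
    }
}
```

A handler subscribed to `MoneyTransferCompletedEvent` sees both outcomes, while one subscribed to `MoneyTransferFailed` sees only failures and can read the `reason`.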

Lastly, you could use a view model to get the status. In Axon 3.2, we are currently working on Subscription queries. You could use that mechanism to receive updates on specific state changes in a view model. Until then, polling could be an alternative.

Hope this helps.
Cheers,

Allard