Aggregate vs Saga: design rationale

aldibella · January 9, 2016, 3:11pm

Hi,

I am evaluating the use of Axon as part of the implementation of a new application.
This application will eventually be one of the choreographed services required to automate a distributed collaborative process.
Having read some of the literature on DDD and the Axon documentation, I am struggling to understand the rationale that led to the separation of aggregates and sagas.
Hence, I am not sure on how to best use the two components.

To my understanding, it is the responsibility of the aggregate to maintain the consistency of the state of the entities contained within it. In my case, the root of the aggregate would be the process itself, and state would be held in the root and its child entities (@EventSourcedMember). Changes of state are caused by the execution of commands and are notified to the outside world via business events.

So far so good (I think), until I need to integrate with external services, which should be done using a saga.
A saga (I apologise for the oversimplification) is a selective listener. Its responsibility is to react to the events it has been associated with and to broadcast commands accordingly.
The problems I see with such approach are the following:

1. Loss of cohesion. Any non-trivial saga will have to be stateful, so that historical/state information can be used to decide what to do when an event is received (e.g. committing or rolling back an operation, splitting or aggregating events, etc.). Inevitably the saga state and the aggregate state will overlap at one or more points, leading to a loss of cohesion as responsibility is no longer segregated.
2. Added complexity. In order to handle external events, sagas have to be implemented to act as proxies to the aggregates.

Would all the above be simplified if events were to be delivered to the aggregate directly? Am I missing something fundamental in my reasoning?

Regards,

Alessandro

Allard · January 13, 2016, 9:17am

Hi Alessandro,

I think both points you mention are valid. It is very normal for Sagas to have state. In fact, sagas without state shouldn’t be implemented using the Saga mechanism, but just a standalone (singleton) event handler. Some state may indeed overlap with the aggregate, but I don’t consider that a problem at all. They both serve a very different purpose.

Regarding the complexity, it may look like Sagas add complexity that could be removed if aggregates handled each other's events directly. In practice, that's not the case. You will somehow need to make clear which event needs to be routed to which aggregate instance. That in itself is already complex. If you mix that with the other concerns an aggregate already has, you have a class that focuses on too many things at once.

Furthermore, the purpose of the Saga is to coordinate activities between components. So it doesn’t know about the details of these activities. For example, it doesn’t know how an order is confirmed. It just knows that it needs to happen. It’s up to the Order aggregate to decide if it may happen, and if so, what the consequences for that Order are. The Saga may then see what the consequences for other aggregates are when an Order is confirmed. For example, send commands to Shipping to prepare the shipment, etc. etc.
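
To make this division of labour concrete, here is a minimal sketch in plain Java. It deliberately uses no Axon API: every class name (Order, OrderFulfilmentSaga, RecordingCommandBus and the message types) is hypothetical and only stands in for the corresponding concept. The aggregate decides whether confirmation may happen; the saga only reacts to the outcome by sending the next command.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical message types for illustration; none of these come from Axon.
final class OrderConfirmed {
    final String orderId;
    OrderConfirmed(String orderId) { this.orderId = orderId; }
}

final class PrepareShipment {
    final String orderId;
    PrepareShipment(String orderId) { this.orderId = orderId; }
}

// Stand-in for a command bus: it only records what was dispatched.
class RecordingCommandBus {
    final List<Object> sent = new ArrayList<>();
    void dispatch(Object command) { sent.add(command); }
}

// The aggregate owns the rule that decides whether an order may be confirmed.
class Order {
    private final String id;
    private boolean paid;
    private boolean confirmed;

    Order(String id) { this.id = id; }

    void markPaid() { paid = true; }
    boolean isConfirmed() { return confirmed; }

    // Enforces the invariant, changes state, and reports the outcome as an event.
    OrderConfirmed confirm() {
        if (!paid) {
            throw new IllegalStateException("order " + id + " is not paid");
        }
        confirmed = true;
        return new OrderConfirmed(id);
    }
}

// The saga knows what has to happen next, not how an order gets confirmed.
class OrderFulfilmentSaga {
    private final RecordingCommandBus commandBus;

    OrderFulfilmentSaga(RecordingCommandBus commandBus) { this.commandBus = commandBus; }

    void on(OrderConfirmed event) {
        commandBus.dispatch(new PrepareShipment(event.orderId));
    }
}

class CoordinationDemo {
    public static void main(String[] args) {
        RecordingCommandBus bus = new RecordingCommandBus();
        OrderFulfilmentSaga saga = new OrderFulfilmentSaga(bus);
        Order order = new Order("o-1");
        order.markPaid();
        saga.on(order.confirm());
        System.out.println("commands sent: " + bus.sent.size());
    }
}
```

Note that the "is it paid?" check lives only in Order. If the rule changes, the saga does not need to be touched.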

Hope this clarifies things a bit.
Cheers,

Allard

aldibella · January 13, 2016, 11:44am

Hi Allard,

Thank you for your reply. I think some of my confusion stems from my preconceived understanding of sagas from the database world. Traditionally, sagas are seen more as Transaction Scripts (http://martinfowler.com/eaaCatalog/transactionScript.html) designed to handle long-lived transactions via compensation actions, rather than as process orchestrators. I now see that the semantics of a “Saga” in the Axon design are slightly different.

Please let me draw an analogy with a typical BPM solution so that I can better understand how to use Aggregates and Sagas.

Different BPM engines work in different ways, but from a bird’s-eye view the following steps are required:

1. Create the process definition that prescribes the flow (decision points, tasks, splits, joins, states, etc.)
2. Instantiate the process
3. Trigger a transition via an event, command, RPC, etc.
4. Any tasks between the waiting states are executed by the engine. Tasks take inputs from the process context and provide outputs to it.
5. The process waits in the next available state

I appreciate that Axon is not designed as a drop-in replacement for a BPM solution, but from my understanding a possible solution would work as follows:

1. Create a saga that prescribes the flow (decision points, tasks, splits, joins, states, etc.). A complex saga could potentially rely on the likes of http://projects.spring.io/spring-statemachine or http://activiti.org/ to aid the implementation
2. Instantiate the process by broadcasting the event that is handled by the saga method annotated with @StartSaga
3. One or more aggregates are created by the saga by broadcasting the appropriate commands
4. Trigger a transition via a command to the aggregate, which in turn will notify the saga via events. The aggregate takes inputs from the command and provides outputs via the event.
5. The saga waits until it receives an event to be handled by a method annotated with @EndSaga (or it is programmatically ended)
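
The lifecycle in the steps above can be imitated in a few lines of plain Java. This is only a hand-rolled sketch, not how Axon implements it: in Axon the start and end methods would carry the @StartSaga and @EndSaga annotations mentioned above, whereas here a hypothetical SagaRegistry creates, looks up and discards instances by a process id carried in each (equally hypothetical) event.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical events; each carries the id used to find the right saga instance.
final class ProcessRequested {
    final String processId;
    ProcessRequested(String processId) { this.processId = processId; }
}

final class TaskCompleted {
    final String processId;
    TaskCompleted(String processId) { this.processId = processId; }
}

final class ProcessCompleted {
    final String processId;
    ProcessCompleted(String processId) { this.processId = processId; }
}

// One instance per running process, holding only what is needed to drive the flow.
class ProcessSaga {
    int completedTasks;
}

// Hand-rolled stand-in for the infrastructure that manages saga instances.
class SagaRegistry {
    private final Map<String, ProcessSaga> active = new HashMap<>();

    // Start: a new saga instance is created for the process.
    void on(ProcessRequested event) {
        active.putIfAbsent(event.processId, new ProcessSaga());
    }

    // Transition: the event is routed to the instance with the matching id.
    void on(TaskCompleted event) {
        ProcessSaga saga = active.get(event.processId);
        if (saga != null) {
            saga.completedTasks++;
        }
    }

    // End: the instance is discarded and receives no further events.
    void on(ProcessCompleted event) {
        active.remove(event.processId);
    }

    boolean isActive(String processId) { return active.containsKey(processId); }

    int completedTasks(String processId) {
        ProcessSaga saga = active.get(processId);
        return saga == null ? 0 : saga.completedTasks;
    }
}

class LifecycleDemo {
    public static void main(String[] args) {
        SagaRegistry registry = new SagaRegistry();
        registry.on(new ProcessRequested("p-1"));
        registry.on(new TaskCompleted("p-1"));
        System.out.println("tasks completed: " + registry.completedTasks("p-1"));
        registry.on(new ProcessCompleted("p-1"));
        System.out.println("still active: " + registry.isActive("p-1"));
    }
}
```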

If the above is mostly correct, I would infer the following guidelines:

1. The saga should only contain the minimum amount of state data required to drive the flow. If no state is needed, event listeners should be used instead of sagas.
2. The saga is responsible for orchestrating the aggregates but not for executing business actions specific to a task (e.g. sending an e-mail, invoking external web services, etc.). That is because, in order to invoke the external service, data contained within the aggregate is most likely to be needed.
3. Aggregates should only keep the state of the task(s) they are designed to manage, not that of the overall process (otherwise there will be low cohesion).
4. Command handlers within an aggregate are responsible for interfacing with internal and external systems (RPC type of integration, IoC, etc.) to fulfill the necessary operations.
5. Compensation actions are triggered via the saga upon receiving an error event, but they are executed by the aggregates
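
As a rough illustration of guidelines 1 and 5, the following plain Java sketch shows a saga that keeps a single flag and, on an error event, only triggers the compensation, while the aggregate carries it out. No Axon API is used; PaymentReserved, ShipmentFailed, ReleasePayment, Payment and OrderProcessSaga are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical events and command for illustration only.
final class PaymentReserved {
    final String orderId;
    PaymentReserved(String orderId) { this.orderId = orderId; }
}

final class ShipmentFailed {
    final String orderId;
    ShipmentFailed(String orderId) { this.orderId = orderId; }
}

final class ReleasePayment {
    final String orderId;
    ReleasePayment(String orderId) { this.orderId = orderId; }
}

// Aggregate: executes the compensation and owns the payment state.
class Payment {
    private boolean reserved = true;

    void handle(ReleasePayment command) {
        reserved = false;
    }

    boolean isReserved() { return reserved; }
}

// Saga: keeps only the state it needs to decide whether to compensate.
class OrderProcessSaga {
    final List<Object> outbox = new ArrayList<>(); // commands the saga wants sent
    private boolean paymentReserved;

    void on(PaymentReserved event) {
        paymentReserved = true;
    }

    // Error event: trigger the compensation, but do not perform it here.
    void on(ShipmentFailed event) {
        if (paymentReserved) {
            outbox.add(new ReleasePayment(event.orderId));
            paymentReserved = false;
        }
    }
}

class CompensationDemo {
    public static void main(String[] args) {
        OrderProcessSaga saga = new OrderProcessSaga();
        Payment payment = new Payment();
        saga.on(new PaymentReserved("o-1"));
        saga.on(new ShipmentFailed("o-1"));
        // Deliver the saga's commands to the aggregate.
        for (Object command : saga.outbox) {
            payment.handle((ReleasePayment) command);
        }
        System.out.println("payment still reserved: " + payment.isReserved());
    }
}
```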

Would you agree with the above or am I still off track?

Thanks

Alessandro

Allard · January 13, 2016, 12:44pm

Hi Alessandro,

Your guidelines provide a good starting point for the use of sagas. In point 4, however, be sparing when using external systems in command handlers. It is best when aggregates have all the information they need, either as part of the aggregate state or delivered as part of the command. However, sometimes it is just more practical to do a query or call another component… Common sense over rules.
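
One way to read the advice on point 4, sketched in plain Java with hypothetical names (ReserveCredit, CreditReserved, CustomerAccount; no Axon API): whoever sends the command looks up the external value first and ships it inside the command, so the handler decides using only its own state plus the command.

```java
// Hypothetical command: the credit limit is looked up by the sender beforehand
// and delivered as part of the command.
final class ReserveCredit {
    final String accountId;
    final long amountCents;
    final long creditLimitCents;

    ReserveCredit(String accountId, long amountCents, long creditLimitCents) {
        this.accountId = accountId;
        this.amountCents = amountCents;
        this.creditLimitCents = creditLimitCents;
    }
}

// Hypothetical event describing the outcome.
final class CreditReserved {
    final String accountId;
    final long amountCents;

    CreditReserved(String accountId, long amountCents) {
        this.accountId = accountId;
        this.amountCents = amountCents;
    }
}

// Aggregate: the handler makes no external call while deciding.
class CustomerAccount {
    private final String accountId;
    private long reservedCents;

    CustomerAccount(String accountId) { this.accountId = accountId; }

    CreditReserved handle(ReserveCredit command) {
        if (reservedCents + command.amountCents > command.creditLimitCents) {
            throw new IllegalStateException("credit limit exceeded for " + accountId);
        }
        reservedCents += command.amountCents;
        return new CreditReserved(accountId, command.amountCents);
    }

    long reservedCents() { return reservedCents; }
}

class CommandDataDemo {
    public static void main(String[] args) {
        CustomerAccount account = new CustomerAccount("a-1");
        CreditReserved event = account.handle(new ReserveCredit("a-1", 4_000, 10_000));
        System.out.println("reserved: " + event.amountCents);
    }
}
```

The trade-off is that the value in the command may be slightly stale by the time it is handled, which is where the "common sense over rules" remark applies.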

Cheers,

Allard

aldibella · January 13, 2016, 4:57pm

Thank you, your input was very useful.