Modeling domain events and dealing with integration/milestone events

Troy_Hart1 · July 19, 2018, 4:13pm

As I sit down to write this question I find it hard not to go in a million different directions. I have many questions about how exactly my events should be modeled. That hasn’t stopped me from forging ahead, but I’m often questioning my decisions. The part of our domain that I have built so far is data-input intensive. It comprises two aggregates. One of them is more complex than the other, having a handful of entities and value objects that it manages. I have built my commands and events around the main creation and correction use cases for the aggregates and their related state. So far, the data of the commands and the events parallel one another.

For example, I have a command to create a new instance of a Package aggregate; I call this command CreateHeader. It looks something like this:

class CreateHeader { String id; enum type; enum subType; enum[] flags; ... }

When the Package aggregate handles this command successfully it applies the following domain event to the event stream:

class HeaderCreated { String id; enum type; enum subType; enum[] flags; ... }

Then I have a command to correct the Package header details that looks something like this:

class CorrectHeader { String id; enum type; enum subType; enum[] flags; ... }

So then you can probably guess what happens when the Package aggregate successfully processes the header correction command…it applies the following event to the stream:

class HeaderCorrected { String id; enum type; enum subType; enum[] flags; ... }

This basic approach has been followed with the handful of other domain concepts we have implemented so far. It has been fairly effective at addressing the simple use cases we have taken on to date, but I’m not super happy about it. I’m bothered by how closely the events parallel the commands. It just feels a bit naive and seems like it could easily lead to long-term maintenance issues. I guess I’m just a bit nervous because this is my first event-sourced application, and I understand that modeling events is the crux: once an event is persisted, it will always need to be supported in its published form (I do understand the role of upcasters, but would love to avoid them as much as possible). I know it’s hard to talk about modeling events outside of the specifics of the domain, but are there any best practices that can be applied here?

Switching gears a little, I’m also curious about the best way to handle milestone or integration events. I don’t know exactly what to call them, but what I’m talking about is a class of events that are not part of the domain state (they don’t belong in the aggregate’s event stream), they exist to communicate across bounded contexts. Considering the domain events described above, both the header created and header corrected events define an array of enum values called “flags”. One of the flag enum values is RESEARCHABLE. In our domain we have a research context and whenever a package’s RESEARCHABLE flag is toggled, this context wants to know about it. So I have created a ResearchabilityToggled event, but I’m not sure how I should raise this event. One option could be to raise this event from within the Package’s query side updater, which is an event handler service that listens for all of the Package aggregate domain events, in order to update the query side state. It seems like this service could be responsible for publishing integration events on the Event Bus, but I’m not sure if this is the right way to handle it.

Thanks for any input!

Troy

Frans_van_Buul · July 20, 2018, 10:32am

Hi Troy,

These are some very natural questions; I had them myself when getting started with event sourcing.

Regarding command/event parallelism:

In my experience, their strong similarity is often an artifact of an application still being simple, at an early stage of domain modelling. As the model becomes more mature, they tend to diverge. For this reason, I would recommend shying away from approaches like having the command and event inherit from a shared superclass or anything like that.
As you rightly point out, the (serialized form of) events are a long-term commitment, and if you find out later they’re “wrong”, you may have to use upcasters to compensate. The trick here is to 1) avoid leaking technical details such as package/class names in the serialized form (so prefer Jackson over XStream) and 2) ensure events have true business meaning. If an event represents something that actually happened in the business a year ago, that’s just a fact, and it won’t be changed by future changes in the business or the system. (And looking ahead, for this reason I find a ResearchabilityToggled event a bit suspect; it sounds technical/CRUD rather than business.) Commands have a different lifecycle: they may very well change as the business and the system change. This is another way in which commands and events may be similar initially but diverge later on, and therefore a reason not to be overly concerned about initial similarity.
If you find yourself repeating certain combinations of fields in commands and events, it might be a sign that there really is a value object you could factor out and use in both, thus avoiding some repetition.
Another design consideration regarding events you’ll encounter is: do you include the minimum amount of information in the event that’s needed to describe it, or do you include some more information about the (new) state of the aggregate where it took place? There’s no single right answer here for all cases, but in general I lean towards the side of less information, as it makes your events less brittle.
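The value-object suggestion above could look roughly like the sketch below. It is plain Java and only an illustration: `HeaderDetails`, `PackageType`, `PackageSubType`, `Flag` and every enum constant except RESEARCHABLE are made-up names, since the thread elides the real fields. The command and the event stay independent classes (no shared superclass) and simply both hold the same immutable value object.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical enums; only RESEARCHABLE is actually named in the thread.
enum PackageType { STANDARD, SPECIAL }
enum PackageSubType { NONE, PRIORITY }
enum Flag { RESEARCHABLE, ARCHIVED }

// Immutable value object that both commands and events can carry (assumed name).
final class HeaderDetails {
    final PackageType type;
    final PackageSubType subType;
    final Set<Flag> flags;

    HeaderDetails(PackageType type, PackageSubType subType, Set<Flag> flags) {
        this.type = Objects.requireNonNull(type);
        this.subType = Objects.requireNonNull(subType);
        // Defensive copy so later changes to the caller's set cannot leak in.
        this.flags = Collections.unmodifiableSet(
                flags.isEmpty() ? EnumSet.noneOf(Flag.class) : EnumSet.copyOf(flags));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof HeaderDetails)) return false;
        HeaderDetails other = (HeaderDetails) o;
        return type == other.type && subType == other.subType && flags.equals(other.flags);
    }

    @Override
    public int hashCode() {
        return Objects.hash(type, subType, flags);
    }
}

// Command and event remain separate classes; they only share the value object.
final class CreateHeader {
    final String id;
    final HeaderDetails details;
    CreateHeader(String id, HeaderDetails details) { this.id = id; this.details = details; }
}

final class HeaderCreated {
    final String id;
    final HeaderDetails details;
    HeaderCreated(String id, HeaderDetails details) { this.id = id; this.details = details; }
}
```

Because the two classes are not tied together by inheritance, either one can later gain or lose fields without dragging the other along.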

Regarding integration events crossing bounded contexts:

Yes, we do expect that to be a common pattern. For this reason, AxonHub and AxonDB have multi-context support. These contexts are logically separated, but can be bridged by these integration events.
I would generally recommend having separate components that listen to events from your domain and as a result raise these integration events (or the other way around). This may involve keeping some state. Logically in DDD terms, this acts like an Anti-Corruption Layer (ACL), ensuring that the bounded contexts don’t leak their concepts into one another. I would prefer an approach like this over putting this as an additional task in your query side updater.
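A separate translating component of this kind might be sketched as follows. This is plain Java with no Axon API in it: the `Consumer<Object>` is only a stand-in for whatever publishes to the event bus, and `ResearchIntegrationTranslator`, the reduced `HeaderCreated` shape and the `ARCHIVED` flag are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

enum Flag { RESEARCHABLE, ARCHIVED } // ARCHIVED is a made-up second value

// Domain event of the package context (fields reduced for the sketch).
final class HeaderCreated {
    final String id;
    final Set<Flag> flags;
    HeaderCreated(String id, Set<Flag> flags) { this.id = id; this.flags = flags; }
}

// Integration event: only simple types, so no domain classes cross the boundary.
final class ResearchabilityToggled {
    final String packageId;
    final boolean researchable;
    ResearchabilityToggled(String packageId, boolean researchable) {
        this.packageId = packageId;
        this.researchable = researchable;
    }
}

// The ACL component, kept separate from the query side updater.
final class ResearchIntegrationTranslator {
    private final Consumer<Object> integrationPublisher; // stand-in for an event bus

    ResearchIntegrationTranslator(Consumer<Object> integrationPublisher) {
        this.integrationPublisher = integrationPublisher;
    }

    // Called for each HeaderCreated domain event; publishes only when relevant.
    void on(HeaderCreated event) {
        if (event.flags.contains(Flag.RESEARCHABLE)) {
            integrationPublisher.accept(new ResearchabilityToggled(event.id, true));
        }
    }
}

class TranslatorDemo {
    public static void main(String[] args) {
        List<Object> published = new ArrayList<>();
        ResearchIntegrationTranslator acl = new ResearchIntegrationTranslator(published::add);
        acl.on(new HeaderCreated("p-1", EnumSet.of(Flag.RESEARCHABLE)));
        acl.on(new HeaderCreated("p-2", EnumSet.noneOf(Flag.class)));
        System.out.println(published.size() + " integration event(s) published");
    }
}
```

The translator is the only place that knows both vocabularies, which is what keeps the research context unaware of `Flag` and the header events.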
Hope this helps,

Troy_Hart1 · July 21, 2018, 2:47pm

Thanks for the feedback!

Less is more, oftentimes; that’s very good advice! BTW, I did create another value object!

As I understand it, taking care not to leak details is the key to loose coupling, and is required to enable the transition from a monolith to microservices. My application is presently being developed as a monolith. However, it is highly modular. The key for me will be to keep the modules of disparate bounded contexts free from interdependencies. Since integration events are units of interdependence (they are the vehicle of cross-context communication), we must be sure to depend only on their serialized forms. This would mean that bounded context A and bounded context B would both need to provide their own implementation of integration event E. Does this all sound reasonable?
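To make the idea of "each context provides its own implementation of integration event E" concrete, here is a minimal sketch in plain Java. The two class names and the `key=value;key=value` wire format are invented for the illustration (in practice the serialized form would more likely be JSON, as with the Jackson suggestion earlier in the thread); the point is only that the contexts share a serialized form, not a class.

```java
import java.util.HashMap;
import java.util.Map;

// Owned by the package data capture context: knows how to write the wire form.
final class PackageContextResearchabilityToggled {
    final String packageId;
    final boolean researchable;

    PackageContextResearchabilityToggled(String packageId, boolean researchable) {
        this.packageId = packageId;
        this.researchable = researchable;
    }

    // Hypothetical wire format; it does not escape ';' or '=' inside values.
    String toWire() {
        return "packageId=" + packageId + ";researchable=" + researchable;
    }
}

// Owned by the research context: its own class, reading the same wire form.
final class ResearchContextResearchabilityToggled {
    final String packageId;
    final boolean researchable;

    private ResearchContextResearchabilityToggled(String packageId, boolean researchable) {
        this.packageId = packageId;
        this.researchable = researchable;
    }

    static ResearchContextResearchabilityToggled fromWire(String wire) {
        Map<String, String> fields = new HashMap<>();
        for (String pair : wire.split(";")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                fields.put(kv[0], kv[1]);
            }
        }
        // Unknown keys are simply ignored, so the producer can add fields later.
        return new ResearchContextResearchabilityToggled(
                fields.get("packageId"),
                Boolean.parseBoolean(fields.get("researchable")));
    }
}
```

Neither class imports the other, so the two modules can be split into separate deployables later without a shared event library.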

As for the ResearchabilityToggled event, I agree: it would be more than suspect if it were included in the Package aggregate’s event stream. However, as an integration event, it is not suspect if you understand the domain. Specifically, we have a research context, and it is interested in researchable packages defined in the package data capture context.

I understand now that I will build a service in the package data capture context that will act like an ACL between its context and the research context. This service will respond to domain events from its context and will publish integration events on the Axon event bus. Components defined in the research context will subscribe to these integration events. Does this all sound correct?

I still have a question though. Given that my current deployment architecture is monolithic, are the disparate bounded contexts really isolated from one another with respect to the event bus? I guess the specifics of my configuration would dictate that. I could show my configuration, but it is very basic, almost purely default spring auto configuration.

I’m getting started now on some integration similar to the example I was using, so I will just start trying stuff out. Any feedback to keep me on a good course is greatly appreciated!!!

Thanks!

Troy

Frans_van_Buul · July 23, 2018, 8:17am

Hi Troy,

It all sounds reasonable.

Regarding your question “I still have a question though. Given that my current deployment architecture is monolithic, are the disparate bounded contexts really isolated from one another with respect to the event bus? I guess the specifics of my configuration would dictate that. I could show my configuration, but it is very basic, almost purely default spring auto configuration.”

If you set up Axon on Spring Boot with all the defaults, you will get a single CommandBus, QueryBus and EventBus/Store, so there is no isolation of contexts. It is of course possible to create a setup with a local EventStore for a single context, as well as an EventBus/Store for those integration events. I got a question about that on Twitter last week as well; if I have time, maybe I’ll create an example setup for this scenario.
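The difference between one shared bus and a local bus plus an integration bus can be pictured with the toy sketch below. `SimpleBus` is an in-memory stand-in written for this example, not an Axon type, and the string "events" are placeholders; how to configure the equivalent in Axon and Spring Boot is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal in-memory bus, only to illustrate isolation; not an Axon class.
final class SimpleBus {
    private final List<Consumer<Object>> subscribers = new ArrayList<>();

    void subscribe(Consumer<Object> handler) {
        subscribers.add(handler);
    }

    void publish(Object event) {
        subscribers.forEach(s -> s.accept(event));
    }
}

class IsolationDemo {
    public static void main(String[] args) {
        SimpleBus packageBus = new SimpleBus();     // local to the package context
        SimpleBus integrationBus = new SimpleBus(); // shared between contexts

        List<Object> seenByResearch = new ArrayList<>();
        // The research context subscribes only to the integration bus.
        integrationBus.subscribe(seenByResearch::add);

        // A bridge inside the package context decides what crosses the boundary.
        packageBus.subscribe(event -> {
            if ("HeaderCorrected:researchable".equals(event)) {
                integrationBus.publish("ResearchabilityToggled");
            }
        });

        packageBus.publish("HeaderCreated");                // stays internal
        packageBus.publish("HeaderCorrected:researchable"); // bridged
        System.out.println(seenByResearch);
    }
}
```

With a single shared bus, the research handler would have been subscribed to the same bus as the package events and would have seen all of them.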

Kind regards,

Troy_Hart1 · July 23, 2018, 10:55pm

So I’m confused now. It sounds like you are saying the integration events I am talking about need to be persisted in an event store, which means they need to be applied by some aggregate defined in the ACL I’m talking about building. I guess I was just thinking this ACL would simply define an event processor… I can’t think this all the way through and wouldn’t know if it should be a tracking processor or simply a subscribing processor. But it sounds like this thought is simply not the right way to look at it anyway…

So let me take a stab at what I think you are saying needs to happen. It sounds like you are saying that the ACL will need to have an event-sourced aggregate to persist all the integration events. Keeping with the example I’ve been using, maybe this aggregate would be called ResearchIntegrator? ResearchIntegrator will then also need an event handling service that listens for the package data capture domain events, responding by sending commands to the ResearchIntegrator when appropriate. Then, just like any other event-sourced aggregate, the ResearchIntegrator command handlers will apply events to the event store. Finally, the research context will provide an event handler to handle the events published by ResearchIntegrator.

I feel like I’m totally lost in complexity here, but I don’t feel like it should be so complex. It seems like the solution requires both aggregates and event handlers, but for the event handlers I’m totally unclear about when I should use a tracking processor vs. a subscribing processor vs. a saga. Based on the feedback I’ve gotten from the AxonIQ folks, it seems like sagas are not very popular but tracking processors are…

In an effort to be a little clearer, I would like to describe an abstract scenario via a sequence diagram (see attachment). The sequence is not perfectly precise; it merely attempts to represent a complete flow, from a user executing a command in one domain to a record in another domain being updated via integration events.

I’ve had to use some notation to help keep things straight, here’s the key:

A - bounded context A
B - bounded context B
ACL-A - an anti-corruption layer sub-context of bounded context A
Agg- - an aggregate root defined in the bounded context
EH- - an event handler defined in the bounded context
C- - a command defined in the bounded context
DE- - a domain event defined in bounded context
IE - an integration event defined independently in each participating context.

Troy_Hart1 · July 24, 2018, 2:49am

So after some time looking at the sequence, it seems like it works, but there are a lot of moving parts. I am most worried about the event handling services (EH-ACL-A & EH-B). I'm pretty sure they should not be tracking processors though, because they fire commands. Also, I'm very concerned about managing failure in this event driven world. It's just all so new and I find it hard to reason all the way through at times.

allardbz · July 30, 2018, 1:27pm

Hi Troy,

I don’t see why an ACL component couldn’t be backed by a tracking processor. Perhaps you wouldn’t want to do replays on them, but that’s not the only reason to use a tracking processor. In general, tracking processors cope much better with failure than their subscribing counterparts.

Also, the ACL-A, if I understand correctly, is a component that translates an internal event to an integration-level event. It should not be occupied with sending commands, IMO. It should just translate the event and publish that right away. That would mean the EH-ACL-A publishes an event, which is picked up by EH-B directly, which then sends a command.
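The simplified flow described here (EH-ACL-A translates and publishes, EH-B picks the event up and sends a command) might be sketched like this. It is plain Java with `Consumer` callbacks standing in for the event bus and the command gateway; `PackageMarkedResearchable` and `UpdateResearchCandidate` are hypothetical names for DE-A and C-B, and only `ResearchabilityToggled` comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// DE-A: a domain event inside context A (hypothetical shape).
final class PackageMarkedResearchable {
    final String packageId;
    PackageMarkedResearchable(String packageId) { this.packageId = packageId; }
}

// IE: the integration event that crosses the boundary.
final class ResearchabilityToggled {
    final String packageId;
    final boolean researchable;
    ResearchabilityToggled(String packageId, boolean researchable) {
        this.packageId = packageId;
        this.researchable = researchable;
    }
}

// C-B: a command that only exists inside context B (hypothetical).
final class UpdateResearchCandidate {
    final String packageId;
    final boolean researchable;
    UpdateResearchCandidate(String packageId, boolean researchable) {
        this.packageId = packageId;
        this.researchable = researchable;
    }
}

// EH-ACL-A: translates and publishes right away; it sends no commands.
final class EhAclA {
    private final Consumer<ResearchabilityToggled> integrationPublisher;
    EhAclA(Consumer<ResearchabilityToggled> integrationPublisher) {
        this.integrationPublisher = integrationPublisher;
    }
    void on(PackageMarkedResearchable event) {
        integrationPublisher.accept(new ResearchabilityToggled(event.packageId, true));
    }
}

// EH-B: reacts to the integration event by sending a command within context B.
final class EhB {
    private final Consumer<UpdateResearchCandidate> commandSender;
    EhB(Consumer<UpdateResearchCandidate> commandSender) {
        this.commandSender = commandSender;
    }
    void on(ResearchabilityToggled event) {
        commandSender.accept(new UpdateResearchCandidate(event.packageId, event.researchable));
    }
}

class FlowDemo {
    public static void main(String[] args) {
        List<UpdateResearchCandidate> sent = new ArrayList<>();
        EhB ehB = new EhB(sent::add);        // records commands instead of dispatching
        EhAclA ehAclA = new EhAclA(ehB::on); // direct hand-off stands in for the bus
        ehAclA.on(new PackageMarkedResearchable("p-1"));
        System.out.println("commands sent in B: " + sent.size());
    }
}
```

Compared with a variant where the ACL sends commands to its own aggregate, this removes one command, one aggregate and one event from the sequence.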

I understand there are more moving parts, overall, but each part has a very clearly defined responsibility and plays a role in making “it work” at scale. Complex systems (as defined in the Cynefin framework) should be built with simple elements, all easy to understand in isolation. Bounded Contexts are a good way to “stop caring about what’s out there”. It’s just the occasional architect that needs to draw lines across these boundaries…

Hope this helps.
Cheers,

Allard

Troy_Hart1 · July 30, 2018, 2:03pm

Thanks for your insights Allard.

In a scenario where the anti-corruption layer (ACL-A) needs to keep track of some state, it seems like the sequence I depicted is correct. Am I thinking about this the right way, or does it seem like I’m still missing some concepts?

Troy

allardbz · July 30, 2018, 2:46pm

Hi Troy,

When the ACL needs state, it can just maintain that somewhere. I don’t think you’d need to add another (event sourced) aggregate for that in the ACL component. That would just make it more complex than minimally necessary. The ACL component isn’t responsible for any type of validation. You could see its state as some internal query model it needs to maintain to ensure it’s able to emit the correct information in events later on. This also means that actually doing a query from that component is a possible alternative for enriching information.
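As a rough illustration of an ACL that "just maintains state somewhere", the sketch below remembers the last known researchable flag per package and reports only actual changes. It is plain Java under stated assumptions: `ResearchabilityTracker` is a hypothetical name, and the in-memory map stands in for whatever store the component would really use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Stateful ACL sketch: remembers the last known researchable flag per package.
final class ResearchabilityTracker {
    // A plain map as the "internal query model"; a real component would persist it.
    private final Map<String, Boolean> lastKnown = new HashMap<>();

    // Returns the new value if it changed, or empty if nothing needs to be emitted.
    // A package that has never been seen is treated as not researchable.
    Optional<Boolean> onHeaderState(String packageId, boolean researchableNow) {
        Boolean previous = lastKnown.put(packageId, researchableNow);
        boolean before = previous != null && previous;
        return before == researchableNow ? Optional.empty() : Optional.of(researchableNow);
    }
}
```

An event handler for the header created and header corrected events could call `onHeaderState` with whether the flags contain RESEARCHABLE, and publish a ResearchabilityToggled integration event only when the result is present. No aggregate, command or validation is involved.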

Hope this makes sense.
Cheers,

Allard

Troy_Hart1 · July 30, 2018, 3:25pm

Thanks! That helps a lot!