Question on Commands

Chad_Wilson · December 5, 2011, 8:52pm

This may be a question for the more general CQRS community: should a
command be handled by only one handler? That would obviously make it
more atomic, and therefore less likely to leave a messy cleanup.
Is needing two command handlers a good indicator that you need a saga?

For Axon, does the command handling logic allow you to subscribe
multiple handlers for the same command?

Carlus_Henry · December 6, 2011, 2:22am

Chad,

Axon doesn’t allow more than one active handler per command. You can register more than one, but I believe the last command handler registered wins. I think I remember reading about that in the section on using AnnotatedCommandHandlers vs. explicit registration.
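
A minimal sketch of the "last registered handler wins" behavior described above. The `LastHandlerWinsBus` class and `TouchFileCommand` are hypothetical names made up for illustration. This is not Axon's API, and whether Axon actually replaces the earlier handler should be checked against its documentation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified command bus. This is NOT Axon's API; it only
// illustrates "one handler per command type, last subscription wins".
class LastHandlerWinsBus {

    // Example command, used for illustration only.
    static final class TouchFileCommand {
        final String fileId;
        TouchFileCommand(String fileId) { this.fileId = fileId; }
    }

    // One slot per command type, so a second subscription overwrites the first.
    private final Map<Class<?>, Function<Object, String>> handlers = new HashMap<>();

    // Subscribing again for the same command type silently replaces the old handler.
    void subscribe(Class<?> commandType, Function<Object, String> handler) {
        handlers.put(commandType, handler);
    }

    // Routes the command to the single handler registered for its type.
    String dispatch(Object command) {
        Function<Object, String> handler = handlers.get(command.getClass());
        if (handler == null) {
            throw new IllegalStateException(
                    "No handler for " + command.getClass().getSimpleName());
        }
        return handler.apply(command);
    }

    public static void main(String[] args) {
        LastHandlerWinsBus bus = new LastHandlerWinsBus();
        bus.subscribe(TouchFileCommand.class, c -> "handled by first");
        bus.subscribe(TouchFileCommand.class, c -> "handled by second");
        // Only the second handler is invoked.
        System.out.println(bus.dispatch(new TouchFileCommand("file-1")));
    }
}
```

Keying the handlers by command type is what makes a command "owned" by exactly one handler in this sketch. Fan-out to several listeners is what events are for.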

Regarding the design question: unfortunately, like everything else in design, it depends. In this case, I believe it depends on where your context boundaries are. If you have two separate bounded contexts that you need to coordinate between, then a Saga would be a good choice.

While this is a viable solution, needing to update two aggregates with one command could also indicate that both aggregates are actually part of the same bounded context. So where you thought there were two bounded contexts, perhaps there is only one.

Hope that helps.

Thanks
Carlus

Chad_Wilson · December 6, 2011, 3:40am

I have a very concrete example to illustrate my concern. Take a file
system.  Each file in the filesystem is its own aggregate, since its
contents can be manipulated directly (and from the POV of the file, it
doesn't care where it lives, accessibility and discoverability are IMO
higher level concerns). Then I have a filesystem aggregate which
handles the paths, subpaths, and files therein. When updating a file,
I need to update the filesystem so that the metadata is up to date
(e.g. touch event). This way, the client can get the filesystem data
and immediately know what files it needs to synchronize.

There are some use cases (namely creation & deletion) where having a
single command for two aggregates is useful as each state will
potentially be modified.

Could I represent the entire filesystem as one bounded context? Of
course I could, but I split them up primarily for performance/
scalability concerns, as for clients with large filesystems it is not
reasonable to send & receive the entire contents.  (In an ideal world
client updates would be done via the commands/events themselves, but
that's not possible when connectivity is not guaranteed.)

I guess my question is a philosophical one: in CQRS, events are tied to
an aggregate; in Axon, commands are tied to an aggregate as well.
Should they be? And of course we get to the age old question of
defining just what bounded in "bounded context" means... for your
context!

In thinking about this, one potential solution would be to have an
event listener that consumes the file events, and then fires a
"touched file on filesystem" event. The only problem I would see with
this is the guarantee of the event storing properly (command ==
transactional boundary via UOW).

Either way, I've worked around these issues for my current
application; I was just curious to hear everyone's thoughts on this.

Allard · December 6, 2011, 9:26am

Hi all,

this is a very interesting discussion. First thing I noticed (on the CQRS mailing list as well) is that the term “Bounded Context” is wrongly used. It is a DDD term, that describes a boundary where a specific term has a certain meaning. A flight in one context might be a thing you sell, and has a price, but in the pilot’s context has a departure time, destination airport and a colleague.
If you’re talking about consistency, it’s the aggregate boundaries that play a role. Each aggregate consists of a number of entities, one of which is appointed the aggregate root (Axon’s AggregateRoot interface). That root represents the entire aggregate.

Then, there was the remark that in Axon a command is bound to an aggregate. That’s not entirely true. It is possible to do as many actions on as many aggregates as you like. But if scalability is a big concern, it’s good practice to have a 1 to 1 relation between command and aggregate. The reason for this is that the command handler must load all aggregates into the same machine. With scalability, that means you need to prevent other machines from changing those aggregates in the meantime… the problems are obvious.

So if you have a “composite command” (one that executes on multiple aggregates), nothing really stops you from doing so. The UoW logic bound to the transactions will make sure that either both actions succeed, or neither of them does. So Axon won’t be limiting you there…
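
To make the all-or-nothing point concrete, here is a minimal sketch of a composite command handled inside one unit of work, using the file/filesystem example from earlier in the thread. The `UnitOfWork` and `EventStore` classes below are simplified stand-ins written for this illustration, not Axon's classes, and the event names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class CompositeCommandSketch {

    // Stand-in for an event store: a plain list of event descriptions.
    static final class EventStore {
        final List<String> events = new ArrayList<>();
    }

    // Hypothetical unit of work: buffers events and stores them only on commit.
    static final class UnitOfWork {
        private final List<String> pending = new ArrayList<>();

        void register(String event) { pending.add(event); }

        void commit(EventStore store) {
            store.events.addAll(pending);
            pending.clear();
        }

        void rollback() { pending.clear(); }
    }

    // Handles one "delete file" command that changes two aggregates.
    // Returns true if both changes were committed, false if neither was.
    static boolean handleDeleteFile(String fileId, boolean fileSystemKnowsFile,
                                    EventStore store) {
        UnitOfWork uow = new UnitOfWork();
        try {
            // Change on the file aggregate.
            uow.register("FileDeleted:" + fileId);
            // Change on the filesystem aggregate; may be rejected.
            if (!fileSystemKnowsFile) {
                throw new IllegalStateException("File not listed: " + fileId);
            }
            uow.register("FileSystemEntryRemoved:" + fileId);
            uow.commit(store);
            return true;
        } catch (IllegalStateException e) {
            // The file aggregate's event is discarded together with the rest.
            uow.rollback();
            return false;
        }
    }
}
```

Because nothing reaches the store before `commit`, a failure in the second aggregate leaves no trace of the first aggregate's change.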

Cheers,

Allard

Chad_Wilson · December 6, 2011, 11:28am

You're correct in that I should have said that a command in Axon is
tied to a command handler. Technically you can load as many
aggregates in your handler (via multiple repositories) as you need.
This is what I do in some of my cases (though it does feel kind of
dirty to do so).

The example you provided in regards to flight is an excellent example
of that as well. In the filesystem example, a file is only a
reference (file id), and metadata. For the file aggregate itself,
it's the contents.

Now, on your statement "With scalability, that means you need to
prevent other machines from changing those aggregates in the
meantime", this is where I've been kind of fuzzy with Axon, and CQRS
best practices.  Aren't there times when multiple changes to the
aggregate are possible, or at least not hazardous?  If your events are
sequenced based upon time, then couldn't you have a basic policy that
states: whatever happened, happened (replay issues aside, of course)?
I have a use case where clients may be disconnected, yet their state
is still changing, and I want to record this state when connectivity
is resumed. While I could simply state that commands are dispatched
and applied when they arrive, since the client state has already been
recorded by the user, I want to capture their intentions when they
occurred.

My solution was to allow events to be stored at any point in the
aggregate's lifespan (barring before creation). This of course
complicates things on the query side, since you can no longer
guarantee that an incoming event should be applied directly (if it
happened in the past, it may not even need to be applied!). Of
course, barring any business logic constraints, the need to guarantee
that only one instance changes an aggregate at a time tends to dissolve
once you remove the requirement that events are sequential moving
forward only.
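
A rough sketch of the idea above, assuming each event carries the time at which it occurred on the client. `LateEventSketch` and `ContentChanged` are hypothetical names, and the "newest occurrence wins" rule is just one possible policy for the query side, not something prescribed by Axon or CQRS.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LateEventSketch {

    // Hypothetical event carrying the time it occurred on the client.
    static final class ContentChanged {
        final Instant occurredAt;
        final String content;

        ContentChanged(Instant occurredAt, String content) {
            this.occurredAt = occurredAt;
            this.content = content;
        }
    }

    private final List<ContentChanged> history = new ArrayList<>();
    private ContentChanged latest;

    // Stores the event at its place in occurrence order. Returns true only if
    // the read model's current content changed, i.e. the event is newer than
    // everything applied so far; a late event is kept but not applied.
    boolean record(ContentChanged event) {
        history.add(event);
        history.sort(Comparator.comparing((ContentChanged e) -> e.occurredAt));
        if (latest == null || event.occurredAt.isAfter(latest.occurredAt)) {
            latest = event;
            return true;
        }
        return false;
    }

    String currentContent() {
        return latest == null ? null : latest.content;
    }

    List<ContentChanged> history() {
        return history;
    }
}
```

The sketch ignores business rules entirely. As soon as an event's validity depends on the state at the time it was produced, inserting it into the past is no longer safe.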

Thoughts?

Allard · December 6, 2011, 12:13pm

Hi Chad,

let me put my scalability point in perspective. It’s absolutely no problem to invoke multiple actions on the same aggregate at all. This will have absolutely no impact on scalability.

The problem lies in the situation where you need to invoke actions on more than one aggregate. The reason is simple: to do so, these aggregates need to live on the machine that is currently processing the command. If an aggregate lives on one machine, it should not live on another, as you can get into concurrency problems. It’s perfectly possible to solve these problems just by retrying; your performance will drop tremendously if you do so, but it will work.

So if scalability is a big requirement, make sure your commands are all executable on a single aggregate (i.e. consistency boundary). Side effects should be taken care of asynchronously based on new commands (via a Saga, for example). These other commands may have to travel to another machine to be executed, but it’s not a problem anymore.

In other words, it is not so much an event sequencing problem. It is a data “Availability” / “Consistency” (see CAP theorem) problem.
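
A minimal sketch of the "one command, one aggregate, side effects via new commands" approach, using the file/filesystem example from earlier in the thread. Commands and events are plain strings here, the in-memory queue stands in for asynchronous delivery, and all names (`UpdateFile`, `TouchFileSystemEntry`, and so on) are assumptions for illustration, not Axon's Saga API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class FollowUpCommandSketch {

    // Commands waiting to be handled; stands in for an asynchronous command bus.
    final Queue<String> commandQueue = new ArrayDeque<>();
    // Every event published so far, in order.
    final List<String> eventLog = new ArrayList<>();

    void send(String command) {
        commandQueue.add(command);
    }

    // Each command changes exactly one aggregate and publishes one event.
    private void handle(String command) {
        String[] parts = command.split(":", 2);
        switch (parts[0]) {
            case "UpdateFile":            // file aggregate only
                publish("FileUpdated:" + parts[1]);
                break;
            case "TouchFileSystemEntry":  // filesystem aggregate only
                publish("FileSystemEntryTouched:" + parts[1]);
                break;
            default:
                throw new IllegalArgumentException("Unknown command: " + command);
        }
    }

    private void publish(String event) {
        eventLog.add(event);
        // Saga-like reaction: the side effect becomes a new command, which
        // could just as well be delivered to another machine.
        if (event.startsWith("FileUpdated:")) {
            send("TouchFileSystemEntry:" + event.substring("FileUpdated:".length()));
        }
    }

    // Drains the queue, including follow-up commands sent while handling.
    void processAll() {
        while (!commandQueue.isEmpty()) {
            handle(commandQueue.poll());
        }
    }
}
```

The price of this design is that the filesystem metadata lags briefly behind the file itself, which is the consistency side of the trade-off mentioned above.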

Cheers,

Allard

Carlus_Henry · December 6, 2011, 12:33pm

Allard,

First thing I noticed (on the CQRS mailing list as well) is that the term “Bounded Context” is wrongly used…If you’re talking about consistency, it’s the aggregate boundaries that play a role.

Yes…I was definitely referring to the Aggregate Boundary and not the Bounded Context. Thanks for taking the time to clarify.

Carlus

Chad_Wilson · December 6, 2011, 7:05pm

I guess I'm still missing the issue. Let's take your example of two
machines working on the same aggregate. Can you give an example of
what kind of concurrency problems might arise?

I guess from my current thought process, I tend to look at the
aggregates themselves as eventually consistent, and as such, given the
example of two machines working on two commands for the same
aggregate, the events spawned from the actions of the commands will
eventually become consistent when both stored, and then loaded from
the event store.

Allard · December 7, 2011, 7:41am

Hi Chad,

an aggregate is by (DDD) definition consistent. That means it will always be in a valid state, and when committed, it is either completely committed, or not at all. It will never be half-way. To guard that consistency in Axon, all events generated by an aggregate have a sequence number. The combination of aggregate and sequence number must be unique in the events in the event store.

Now, imagine two machines, A and B, that both load an aggregate Z. They both get a copy of the aggregate, based on past events in the event store. Let’s say they got Z@22 (last applied event was sequence 22).
Machine A is executing a “PlaceBuyOrderCommand” on Z. The result is OrderPlacedEvent@23 and TradeExecutedEvent@24 (because there was a matching sell-order).
Meanwhile, machine B is processing a CancelSellOrderCommand, which results in SellOrderCancelledEvent@23.

Now here is your concurrency problem: they both want to store their events in the Event Store, but only one of them (the happy first one) will succeed. In some cases, you could automatically “merge” the events by saying that the loser would resequence its numbers. But what if the cancelled sell order was the one that matched the new buy order on machine A? Those changes cannot be merged.

Therefore, the “losing” change will need to be discarded, the aggregate reloaded including the newly generated events and the command re-executed.
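
The scenario above can be sketched with a toy in-memory event store that enforces uniqueness of aggregate identifier plus sequence number. The `EventStore` and `ConcurrencyException` classes here are written for this illustration and are not Axon's implementation. Sequence numbers are assumed to start at 0, so an aggregate at Z@22 has 23 stored events.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SequenceConflictSketch {

    static final class ConcurrencyException extends RuntimeException {
        ConcurrencyException(String message) { super(message); }
    }

    // Toy event store: the index in each list is the event's sequence number.
    static final class EventStore {
        private final Map<String, List<String>> streams = new HashMap<>();

        // Sequence number of the last stored event, or -1 if there is none.
        synchronized long lastSequence(String aggregateId) {
            List<String> stream = streams.get(aggregateId);
            return stream == null ? -1 : stream.size() - 1;
        }

        // Appends events starting at firstSequence. Rejects the whole batch
        // if that sequence number is not the next free one for the aggregate.
        synchronized void append(String aggregateId, long firstSequence,
                                 List<String> events) {
            List<String> stream =
                    streams.computeIfAbsent(aggregateId, id -> new ArrayList<>());
            if (firstSequence != stream.size()) {
                throw new ConcurrencyException(aggregateId + "@" + firstSequence
                        + " rejected, next free sequence is " + stream.size());
            }
            stream.addAll(events);
        }
    }

    public static void main(String[] args) {
        EventStore store = new EventStore();
        List<String> past = new ArrayList<>();
        for (int i = 0; i <= 22; i++) {
            past.add("PastEvent@" + i);
        }
        store.append("Z", 0, past); // both machines load Z@22

        // Machine A stores its events first and wins.
        store.append("Z", 23, Arrays.asList("OrderPlacedEvent", "TradeExecutedEvent"));
        try {
            // Machine B still believes 23 is free and loses.
            store.append("Z", 23, Arrays.asList("SellOrderCancelledEvent"));
        } catch (ConcurrencyException e) {
            // B has to reload Z (now at 24) and re-execute its command.
            System.out.println("Conflict, reload from Z@" + store.lastSequence("Z"));
        }
    }
}
```

Note that the losing machine does not simply resend the same events at a higher sequence number. It re-executes the command against the reloaded state, which may produce different events or a rejection.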

In a high-speed environment, you want to prevent this. The only way to do so, is by ensuring that a single aggregate instance only lives on a single machine (preferably in the cache). If all commands on a single aggregate are executed on that machine, you don’t have the concurrency problems.

And now comes the last step. If you have a command that executes on more than one aggregate, it is a lot harder to guarantee that each of the aggregates only “lives” on a single machine at any given time. That’s an advantage that sticking to a 1 command to 1 aggregate relation gives you.
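
One way to picture this is routing by aggregate identifier. The sketch below assumes a fixed list of machine names and a simple hash-modulo scheme. Both are illustrative assumptions, not how Axon distributes commands.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class AggregateRoutingSketch {

    // Maps an aggregate identifier to exactly one machine, so every command
    // for that aggregate ends up in the same place.
    static String nodeFor(String aggregateId, List<String> nodes) {
        return nodes.get(Math.floorMod(aggregateId.hashCode(), nodes.size()));
    }

    // Machines that own the aggregates a command touches. With one aggregate
    // per command this always has size 1; with several it may not.
    static Set<String> nodesFor(List<String> aggregateIds, List<String> nodes) {
        Set<String> owners = new LinkedHashSet<>();
        for (String id : aggregateIds) {
            owners.add(nodeFor(id, nodes));
        }
        return owners;
    }
}
```

A command on a single aggregate always has one owning machine. A command on two aggregates may have two owners, and then one of the aggregates has to be loaded away from its home machine, which reintroduces the concurrency problem described above. Plain modulo routing also reshuffles ownership whenever the number of machines changes, and a scheme such as consistent hashing would reduce that.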

Hope that clarifies it a bit.

Cheers,

Allard

Chad_Wilson · December 20, 2011, 3:31am

Sorry I haven't responded yet. I have thought on this a bit, and I
have a few more questions, but it will be a little while longer before
I can quantify all the garbage in my noggin. Thanks for the
clarification & response!