Schema constraints wrong or wrong conceptual model?

Carsten_H · March 6, 2017, 8:49am

Moin,

we took a look at the Axon framework and found it really impressive.
Then we looked at its SQL schema, and it conflicts with our conceptual model of aggregates and aggregate roots.

Please correct me where I am wrong with the following:

1. A single event store (event store table) can store aggregates from different aggregate roots.
2. Each aggregate root gets its own sequence number, which is incremented on each aggregate/aggregate root event.
3. The aggregate identifier identifies an aggregate inside a single aggregate root.
4. The aggregate identifier is unique inside a single aggregate root.
5. The application code is responsible for generating the aggregate identifier.

It is our understanding that the aggregate identifiers of different aggregate roots are independent of each other and that
each aggregate root opens a new “namespace” for identifiers.
In our opinion, that means that for each event the tuple (aggregate root identifier (type), aggregate identifier, sequence number)
is unique, but the tuple (aggregate identifier, sequence number) is not, because different aggregate roots may use identical ids and identical sequence numbers.

But the domain event table (public.domainevententry) has the following unique constraint:

"domainevententry_aggregateidentifier_sequencenumber_key" UNIQUE CONSTRAINT, btree (aggregateidentifier, sequencenumber)

And the table public.snapshotevententry has the primary key:

"snapshotevententry_pkey" PRIMARY KEY, btree (aggregateidentifier, sequencenumber)

We expect this to lead to constraint violations, because without the type the aggregate identifiers are not independent between aggregate roots. The type also does not seem to be used as a selector when querying the database.
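To make the concern concrete, here is a minimal sketch that reproduces the expected violation. It uses an in-memory SQLite table as a stand-in for PostgreSQL and keeps only the three columns under discussion; the aggregate types "Account" and "Billing", the id "42", and the helper names `make_store` and `append_event` are assumptions made for illustration, not part of Axon.

```python
import sqlite3


def make_store(unique_columns):
    """Create an in-memory table with a UNIQUE constraint over the given columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE domainevententry ("
        " type TEXT NOT NULL,"
        " aggregateidentifier TEXT NOT NULL,"
        " sequencenumber INTEGER NOT NULL,"
        f" UNIQUE ({unique_columns}))"
    )
    return conn


def append_event(conn, aggregate_type, aggregate_id, sequence_number):
    """Insert one event row; return False if the UNIQUE constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO domainevententry VALUES (?, ?, ?)",
                (aggregate_type, aggregate_id, sequence_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Constraint as quoted above: the type is not part of the key.
two_column = make_store("aggregateidentifier, sequencenumber")
# Hypothetical alternative that includes the type in the key.
three_column = make_store("type, aggregateidentifier, sequencenumber")

results = {}
for name, store in [("two_column", two_column), ("three_column", three_column)]:
    results[name] = [
        append_event(store, "Account", "42", 0),
        append_event(store, "Billing", "42", 0),  # same id and sequence number
    ]

print(results)
# {'two_column': [True, False], 'three_column': [True, True]}
```

With the two-column constraint the second insert is rejected even though it belongs to a different aggregate type; with the type in the key, both rows are accepted.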

What are we missing?

Greetings,
Cal

Allard · March 6, 2017, 9:56am

Hi Cal,

Your five statements are not entirely correct. It probably has to do with the concepts “aggregate” and “aggregate root”, so let me clarify those:
An “Aggregate” is a group of entities that are considered as one unit with regard to data changes. In other words, they are “atomic”. The “Aggregate Root” is the entity within that aggregate that acts as an entry point to the entire aggregate. When keeping a reference to the aggregate, you essentially reference the Aggregate Root.

About the 5 statements:

1. A single event store can store multiple aggregates of multiple types.
2. Each aggregate is given its own sequence number, which is incremented on each event it applies.
3. The aggregate identifier identifies an aggregate instance in your application.
4. The aggregate identifier is unique in your application.
5. Correct.

Aggregates of different types should not share an identifier. They are different aggregates (because of a different type) and should therefore also have a different identifier.

Hope this clarifies things.
Cheers,

Allard

Carsten_H · March 6, 2017, 10:12am

Hi Allard,

I didn’t mean that aggregates of different types should share the same identifier.
But IMHO they may do so unintentionally.
If I have a modular system with modules that are each responsible for a different aggregate type, they should IMHO
not depend on each other regarding the selection and value of their aggregate ids.

But in the Axon framework they actually have a dependency because of the constraint.

My point “4” is wrong as written; it should read:

The aggregate identifier is unique in your single application domain (aka aggregate type)

Does that make sense?

Cal

Carsten_H · March 6, 2017, 10:38am

Hi Allard,

I now see one misunderstanding of my own.
I equated “aggregate root” with “aggregate type”, and that’s wrong.

But I still don’t understand why an aggregate id must be unique among different aggregate types
if the aggregate ids are defined by my domain?

Why should I not use my “account ids”, “billing ids”, and “part ids” as aggregate ids for my
aggregate types “Account”, “Billing”, and “PartCatalogItem”, respectively?

Hoping to reach a better understanding,
Cal

Carsten_H · March 8, 2017, 2:20pm

Hello?

Can anyone from the Axon Team shed some light on why you force
aggregate ids to be unique across different aggregate types?

As it stands, two unrelated aggregate types may not use the same sequence of characters for
their aggregate ids.

Thank you very much,
Cal

Jorg_Heymans · March 8, 2017, 3:27pm

I guess it makes it easier to uniquely identify ARs, in that you only need one handle (id) to find the object rather than two (id + type)?

Steven_van_Beelen1 · March 9, 2017, 12:31pm

Hi Carsten,

We don’t necessarily force the aggregate identifiers to be unique among aggregate types; we force the aggregateId column in the database to be unique among aggregate types.
Not including an extra column is a performance improvement index-wise compared to the option you suggest: (aggregate root identifier (type), aggregate identifier, sequence number)
To get a similar solution without adding a column to the DomainEventEntry yourself, you could for example postfix the aggregate type to your aggregate identifier, and store that in the aggregateId.

I’ve seen that solution being used more than once, for example on the project I’m currently working on.

Hope this helps.

Cheers,

Steven

Carsten_H · March 9, 2017, 1:34pm

Hi Steven,

thanks for answering.

“We don’t necessarily force the aggregate identifiers to be unique among aggregate types; we force the aggregateId column in the database to be unique among aggregate types.”

What’s the difference between the two in terms of consequences?
If I have two aggregate types, User and Friend, and both happen to have the same aggregate id x@y.com, the database will not let me store them and will raise a constraint violation.

“Not including an extra column is a performance upgrade index wise compared to the option you suggest: (aggregate root identifier (type), aggregate identifier, sequence number)”

Sorry, I don’t buy that. If you stick with btree indices and most aggregate ids are UUIDs, I expect the opposite because of increased locality, if it has a measurable impact at all when compared to adding the type yourself. If you care about performance at that level, I would first start making the event table smaller by extracting e.g. the type and class columns into some key tables.
(I started using PostgreSQL even before it got that name.)
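The “key table” idea mentioned above could look roughly like the following sketch. It is an illustration only: it uses in-memory SQLite rather than PostgreSQL, the table names `aggregate_type` and `event_entry` and the helpers `type_key` and `append_event` are hypothetical, and it says nothing about how the two layouts actually compare in performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Small lookup table: one row per aggregate type name.
    CREATE TABLE aggregate_type (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- Event rows store the compact integer key instead of the type name.
    CREATE TABLE event_entry (
        type_id             INTEGER NOT NULL REFERENCES aggregate_type(id),
        aggregateidentifier TEXT    NOT NULL,
        sequencenumber      INTEGER NOT NULL,
        UNIQUE (type_id, aggregateidentifier, sequencenumber)
    );
""")


def type_key(name):
    """Return the integer key for a type name, creating the row if needed."""
    conn.execute("INSERT OR IGNORE INTO aggregate_type (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT id FROM aggregate_type WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def append_event(type_name, aggregate_id, sequence_number):
    """Store one event row keyed by (type, aggregate id, sequence number)."""
    conn.execute(
        "INSERT INTO event_entry VALUES (?, ?, ?)",
        (type_key(type_name), aggregate_id, sequence_number),
    )


# Same id and sequence number under two different types: both rows are accepted.
append_event("User", "x@y.com", 0)
append_event("Friend", "x@y.com", 0)

stored = conn.execute("SELECT COUNT(*) FROM event_entry").fetchone()[0]
print(stored)  # 2
```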

Steven_van_Beelen1 · March 9, 2017, 3:53pm

Hi Carsten,

Yes, very strictly speaking, if you use for example a sensible default like generated UUIDs, there’s definitely a chance you’d get a clash sometime, although the chance is very slim.
Or, if you had two similar implementations, one per aggregate package, to generate aggregate ids, that might happen too.
But if you take the example I gave you of postfixing your aggregateId with the type, you’d get user-{aggregate-id} (in your example user-x@y.com) and friend-{aggregate-id}.
Thus, if the aggregate ids your implementation sets for that User and Friend aggregate happen to be the same, you won’t get the violation you’re pointing out.
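A minimal sketch of that type-qualified identifier scheme follows. The helper names `qualified_id` and `split_qualified_id` are hypothetical, the type is written in front of the id to match the user-x@y.com example, and the split assumes that type names never contain a hyphen.

```python
def qualified_id(aggregate_type, aggregate_id):
    """Build the value stored in the aggregate identifier column."""
    return f"{aggregate_type.lower()}-{aggregate_id}"


def split_qualified_id(value):
    """Recover (type, domain id); assumes the type itself contains no hyphen."""
    type_part, _, id_part = value.partition("-")
    return type_part, id_part


user_key = qualified_id("User", "x@y.com")
friend_key = qualified_id("Friend", "x@y.com")

print(user_key)    # user-x@y.com
print(friend_key)  # friend-x@y.com

# The same domain id no longer produces the same stored identifier,
# so (aggregate identifier, sequence number) stays unique across types.
print(user_key != friend_key)          # True
print(split_qualified_id(friend_key))  # ('friend', 'x@y.com')
```

The trade-off is that the stored identifier is no longer the raw domain id, so any code that looks up events has to apply the same convention.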
As for the index performance: fair enough, I’m not the one to tell you whether that’s correct, since that’s definitely not my specialty. Maybe I phrased it incorrectly, but in short it is the reasoning Allard just gave me for why it’s set up the way it is.
So maybe I should have given you my response regarding the why like this:
“Not including an extra column has been a performance upgrade decision index wise compared to the option you suggest: (aggregate root identifier (type), aggregate identifier, sequence number)”

Again, hoping to be helpful here.

Cheers,

Steven