How to do projections

bernhard_s · April 25, 2017, 4:38pm

Hello!

I’d love to hear your opinion on how to implement projections. I’m using Axon 3 and event sourcing with a TrackingEventProcessor. My read-models live in a relational database.
When it comes to building up the read-model via projections, everything is simple as long as the events contain all the data needed for the projection. But when a table in the read-model combines data from two or more aggregates, things get more involved. This can happen when denormalizing for performance reasons, for instance.
In my research so far, I found two approaches.

Approach A:

Enrich events from aggregate A with data from aggregate B. The event-listener that does the projection then has all the data it needs and can simply execute an update/insert statement.
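A minimal sketch of approach A, assuming a hypothetical `OrderPlaced` event that already carries the customer name copied from the other aggregate. All class and field names are invented for illustration, and a `Map` stands in for the read-model table. In Axon the `on` method would be annotated with `@EventHandler`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical enriched event: customerName is copied in from the Customer aggregate. */
final class OrderPlaced {
    final String orderId;
    final String customerId;
    final String customerName; // present only because the read model wants it

    OrderPlaced(String orderId, String customerId, String customerName) {
        this.orderId = orderId;
        this.customerId = customerId;
        this.customerName = customerName;
    }
}

/** Projection that needs no lookup because the event is self-sufficient. */
final class OrderOverviewProjection {
    // Stand-in for a denormalized table: orderId -> customerName.
    private final Map<String, String> orderOverview = new HashMap<>();

    void on(OrderPlaced e) {
        // A single insert/update, using only data from the event.
        orderOverview.put(e.orderId, e.customerName);
    }

    String customerNameForOrder(String orderId) {
        return orderOverview.get(orderId);
    }
}
```

The projection stays trivial, but the `customerName` field exists in the event purely for the benefit of the read model.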

Approach B:

To get data that is not in the event, the projection (event-listener) queries tables of the read-model. With that data and the data from the event, it can then execute the update/insert statement.
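A rough sketch of approach B, with a `Map` standing in for each read-model table. The events, the projection class, and the choice to throw when the lookup finds nothing are all assumptions made for illustration. In Axon the `on` methods would be annotated with `@EventHandler`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical event from the Customer aggregate. */
final class CustomerRegistered {
    final String customerId;
    final String name;

    CustomerRegistered(String customerId, String name) {
        this.customerId = customerId;
        this.name = name;
    }
}

/** Hypothetical event from the Order aggregate; it does not carry the customer name. */
final class OrderPlaced {
    final String orderId;
    final String customerId;

    OrderPlaced(String orderId, String customerId) {
        this.orderId = orderId;
        this.customerId = customerId;
    }
}

final class OrderOverviewProjection {
    private final Map<String, String> customerNames = new HashMap<>(); // customerId -> name
    private final Map<String, String> orderOverview = new HashMap<>(); // orderId -> customer name

    void on(CustomerRegistered e) {
        customerNames.put(e.customerId, e.name);
    }

    void on(OrderPlaced e) {
        // Query the read model for the data the event lacks.
        String name = customerNames.get(e.customerId);
        if (name == null) {
            // The customer event has not been projected yet.
            throw new IllegalStateException("customer " + e.customerId + " not projected yet");
        }
        orderOverview.put(e.orderId, name);
    }

    String customerNameForOrder(String orderId) {
        return orderOverview.get(orderId);
    }
}
```

If `OrderPlaced` is handled before the matching `CustomerRegistered`, the lookup comes back empty and the handler fails, so this variant depends on the order in which events are projected.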

What I don’t like about approach A is that events contain data that is tied to the read-model. I would need to change the events whenever I change the read-model. This kind of coupling seems wrong to me.
What I don’t like about approach B is that it makes projections more complex. I also think it requires the events to be projected strictly in order. Otherwise the queries against the read-model could assume data that is not there yet, and might even fail due to missing records or foreign key violations. Does Axon’s TrackingEventProcessor guarantee that events are processed in order?

Which approach would you use? Is there another, maybe better, one?

Thanks for your input!

Steven_Grimm · April 25, 2017, 4:56pm

To handle a similar situation, we have the event listener handle events from both aggregates. It either does partial writes to the read model for each event or, where that’s not possible because of the semantics of the data, keeps a local cache so that when the second of the two events arrives, it has the data from the first at hand.

The former approach is more resilient to the application restarting in between related events and is the one we usually choose, but it has lower performance since it needs to do more DB writes.
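For illustration, the partial-write variant might look roughly like the sketch below. It is not code from a real project: the two events, the row class, and the `Map` that stands in for the table are all hypothetical, and in Axon the `on` methods would be annotated with `@EventHandler`.

```java
import java.util.HashMap;
import java.util.Map;

/** One denormalized read-model row; a null column has not been written yet. */
final class AccountSummaryRow {
    String ownerName;   // written from the Account aggregate's event
    Long balanceCents;  // written from the Wallet aggregate's event
}

/** Hypothetical event from the Account aggregate. */
final class AccountOpened {
    final String accountId;
    final String ownerName;

    AccountOpened(String accountId, String ownerName) {
        this.accountId = accountId;
        this.ownerName = ownerName;
    }
}

/** Hypothetical event from the Wallet aggregate. */
final class WalletBalanceChanged {
    final String accountId;
    final long balanceCents;

    WalletBalanceChanged(String accountId, long balanceCents) {
        this.accountId = accountId;
        this.balanceCents = balanceCents;
    }
}

final class AccountSummaryProjection {
    private final Map<String, AccountSummaryRow> table = new HashMap<>();

    void on(AccountOpened e) {
        // Partial write: touch only the columns this event knows about.
        row(e.accountId).ownerName = e.ownerName;
    }

    void on(WalletBalanceChanged e) {
        row(e.accountId).balanceCents = e.balanceCents;
    }

    // Upsert: create the row on first contact, whichever event arrives first.
    private AccountSummaryRow row(String accountId) {
        return table.computeIfAbsent(accountId, id -> new AccountSummaryRow());
    }

    AccountSummaryRow find(String accountId) {
        return table.get(accountId);
    }
}
```

Because each handler creates the row if needed and writes only its own columns, the result is the same whichever of the two events is handled first. The price is one write per event rather than one per completed row.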

-Steve

bernhard_s · April 25, 2017, 5:41pm

Thank you Steve.

As you mentioned, the local cache is problematic when restarting the application. Hence my idea with querying the read-model.

Did you experience any problems with foreign key violations? Or don't you have any foreign keys on your read model? I think this relates to my question whether events are always projected in order.

Allard · April 27, 2017, 2:25pm

Hi Bernhard,

In query models, unless certain tables are very strongly related and updated by the same handler, I tend to stay away from foreign key relationships between tables. I let the application manage these values, instead of the database.

Cheers,

Allard

Steven_Grimm · April 27, 2017, 5:15pm

We avoid foreign key relationships between our Axon-populated tables not just because the insert order is undefined in the steady-state case, but because if we choose to rebuild a table using an event replay, when we clear out the existing rows we end up either having to disable the constraints (which means the replay has to know about all of them) or blow away all the child rows and rebuild those tables as well (which means the replay takes much longer).
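To illustrate the rebuild case, here is a sketch of replaying into a single table that no other table references through a foreign key. The `rebuild` method and the event list are hypothetical stand-ins, not Axon's replay mechanism, and a `Map` plays the role of the table.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical event. */
final class CustomerRenamed {
    final String customerId;
    final String newName;

    CustomerRenamed(String customerId, String newName) {
        this.customerId = customerId;
        this.newName = newName;
    }
}

final class CustomerNameProjection {
    // Stand-in for one read-model table: customerId -> current name.
    private final Map<String, String> customerNames = new HashMap<>();

    void on(CustomerRenamed e) {
        customerNames.put(e.customerId, e.newName);
    }

    /** Clears only this table, then re-applies the stored events in their original order. */
    void rebuild(List<CustomerRenamed> storedEvents) {
        // With no FK children pointing here, nothing else has to be disabled or dropped.
        customerNames.clear();
        for (CustomerRenamed e : storedEvents) {
            on(e);
        }
    }

    String nameOf(String customerId) {
        return customerNames.get(customerId);
    }
}
```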

This has so far not caused any problems for us, though it’s a slight shame that not having the foreign-key constraints means that database GUIs have less metadata to help our non-SQL-fluent business people construct their queries. We’ve considered adding and then disabling but not removing the FK constraints, but haven’t tried it yet.

We do, however, have foreign key relationships between Axon-populated tables and tables we manage using a more traditional CRUD approach. Since we never rebuild those CRUD tables and we control the insertion order, the issues above don’t come up.

-Steve