Data warehouse vs query store per service (within a single bounded context)?

Hi all,

I want to open a discussion on the trade-offs between two approaches to the query side (within a
single bounded context):

  1. Have a single data warehouse which contains all of the view models for the bounded context.
  2. Have each service within the bounded context maintain its own DB.

As always, the answer to what you should do is “it depends”, but I wondered
what the community’s opinions are on either option.

I am dealing with a situation where: (i) Axon is not currently being used, (ii) there is no
event sourcing (state is stored in tables), (iii) services exchange messages in a mix of
ways (HTTP calls and Kafka events).

I am leaning towards 1 (having a single data warehouse) for now just because it will be easier
to do with my current constraints. Also it will avoid some of the data integrity issues that
will undoubtedly occur because there is no event sourcing.

If Axon were being used and things were event sourced, would people recommend 2?

Regards,

Hi, I would opt for option 3 (unless I misunderstood your option 2)…

Traditionally, systems are very focused on data. This is why we build them on top of large relational data models that validate the data according to their constraints. When introducing an event-driven (or better: message-driven) approach to those systems, it is often infeasible to switch away from this relational paradigm immediately. Instead, you can embrace it and start modeling certain projections differently.

And this is where my “option 3” comes in. When designing projections, I prefer not to see them as part of a service. Instead, I see them as serving information to a specific target audience. It may be that this audience is clearly defined and overlaps with that of a certain service, but it may also be very different.

I like to see them as “isolated”, just listening to events in order to build a projection that is capable of answering the specific questions this target audience has. The deployment model of this projection can be chosen very independently of the other components that exist around it. It’s a question of trade-offs.
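To make the idea concrete, here is a minimal sketch of such an isolated projection. It is plain Python rather than Axon's API, and the event types `OrderPlaced` and `OrderShipped` as well as the `OpenOrdersProjection` class are hypothetical names invented for illustration only.

```python
from dataclasses import dataclass


# Hypothetical events, invented for this sketch.
@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    customer: str


@dataclass(frozen=True)
class OrderShipped:
    order_id: str


class OpenOrdersProjection:
    """Answers one question for one audience: which orders are still open per customer?"""

    def __init__(self):
        self._open = {}  # order_id -> customer

    def on(self, event):
        # The projection only reacts to events; it never calls the owning service.
        if isinstance(event, OrderPlaced):
            self._open[event.order_id] = event.customer
        elif isinstance(event, OrderShipped):
            self._open.pop(event.order_id, None)

    def open_orders_for(self, customer):
        return sorted(oid for oid, c in self._open.items() if c == customer)


projection = OpenOrdersProjection()
for event in [OrderPlaced("o-1", "alice"), OrderPlaced("o-2", "alice"), OrderShipped("o-1")]:
    projection.on(event)

print(projection.open_orders_for("alice"))
```

The projection keeps only the state its own query needs, so it could live inside a service or on its own without any change to this code.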

Hope this helps.

A few years later… I found myself googling for the same situation and came across this topic, which I started myself :upside_down_face:

“The deployment model of this projection can be chosen very independently of the other components that exist around it. It’s a question of trade-offs.”

@allardbz I wonder if you could please expand on this point.

What deployment models are there exactly? And what are the trade-offs?

I’m in a situation where a number of different services effectively all need the same materialized view/projection. I would rather not have to implement the logic in every service - a change would require mass coordination and stop independent releases. How would you best deal with this scenario?

What I meant by that comment is that a query model simply reacts to events to update its information, and shares that information when it processes a query.

It doesn’t really matter whether that component is deployed as a completely separate component, or if it’s part of the deployment package with other components. It still behaves exactly the same way.

The trade-offs are mainly around the non-functionals. If deployed separately, you’ll need to manage another deployment unit. If deployed as part of another component, then that other component becomes a bit bigger and each of its deployments will also include this query component, even if it didn’t change.

Whichever option you choose, it’s relatively easy to switch. After all, the interaction is done through messages, which don’t care if they need to cross deployment boundaries or not.
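A rough sketch of that last point, in plain Python rather than Axon's API: the same query model is wired once with direct in-process dispatch and once behind a simulated deployment boundary (a JSON round trip standing in for a broker or HTTP hop). `StockLevelQueryModel`, the message shapes, and both transport functions are assumptions made up for this example.

```python
import json


class StockLevelQueryModel:
    """Hypothetical query model: reacts to events, answers a query."""

    def __init__(self):
        self._levels = {}

    def handle_event(self, message):
        if message["type"] == "StockAdjusted":
            sku = message["sku"]
            self._levels[sku] = self._levels.get(sku, 0) + message["delta"]

    def handle_query(self, message):
        return {"sku": message["sku"], "level": self._levels.get(message["sku"], 0)}


def in_process(handler):
    # Same deployment unit: hand the message straight to the handler.
    return handler


def across_boundary(handler):
    # Separate deployment unit, simulated: the message is serialized and
    # deserialized, as it would be on a broker or over HTTP.
    def send(message):
        reply = handler(json.loads(json.dumps(message)))
        return None if reply is None else json.loads(json.dumps(reply))
    return send


def run(transport):
    model = StockLevelQueryModel()
    publish = transport(model.handle_event)
    query = transport(model.handle_query)
    publish({"type": "StockAdjusted", "sku": "A1", "delta": 5})
    publish({"type": "StockAdjusted", "sku": "A1", "delta": -2})
    return query({"type": "GetStockLevel", "sku": "A1"})


embedded = run(in_process)
separate = run(across_boundary)
print(embedded == separate)
```

The query model itself is untouched between the two runs; only the wiring around it changes, which is why moving it between deployment units later is cheap.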