We recently started discussing how we want to model the read side of our application. Right now we mostly use a normalized relational schema and apply views on top of it to return data in a format that a given client understands/expects.
Another approach might be to persist the data already in the correct format. This may lead to data duplication if you have to support multiple clients that all want (slightly) different data. However, it seems that we would “just” trade disk/memory space for performance and easier data access. So instead of writing into a normalized schema and applying views while reading the data back out, we could save the final/expected data structure into the database directly. Since we read data more often than we write it, we expect this to improve performance.
Right now, we are thinking about different ways to save this specialized data. The following ideas are floating around:
- Use something like a ‘ReadModelUpdatedEvent’ which contains the entire data structure of an aggregate. Each event handler is then in charge of selecting the correct subset of that data and persisting it into the database. Since we are mostly working with JSON data, we can use Jackson's @JsonView annotation for that (we could even apply those annotations to our ‘ReadModelUpdatedEvent’). The problem we see with this is that we are basically sending and persisting the current version of an aggregate into our event store every time we change a small part of it. Some of us fear that this will grow into a problem as our application continues to be used and we add more and more of those big events.
- The ‘ReadModelUpdatedEvent’ only contains the ID of an aggregate. Inside an event handler, we inject and use a Repository to load the current version of the aggregate. Again we have all the data, and the event handler can decide on its own what to persist and how. The problem with this is that the returned aggregate is actually not the current version: it is at the previous version, because handling of the current event has not been completed yet. We tried working around that by using the current unit of work, but it failed. In the end we need some way to access the latest version of an aggregate, including whatever changes the command currently being processed might bring in. We couldn’t find any, so we came up with the first idea above: writing the current state of an aggregate into an event and only working on that event using the normal event handling services.
- Instead of having a big and generic ‘ReadModelUpdatedEvent’, we could invest in more fine-grained events like ‘ReadModelOverviewUpdatedEvent’, which signals that just part of the data has changed and thus only carries part of the complete aggregate. It would therefore not be as big and would not trouble our event store as much. However, this might be problematic if we have a lot of clients which all require (slightly) different data. In that case we would basically send one event for each client, which might result in a lot of boilerplate/repeated code.
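
To make the first idea a bit more concrete, here is a minimal sketch in plain Java (records need Java 16 or later), without any framework or Jackson. It only illustrates the shape: one event carries the full aggregate state, and each client-specific handler selects the subset it needs and stores it. The field names, the `OverviewView`/`DetailView` types, and the `Projection` class are made up for illustration, and the in-memory map merely stands in for a database table.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class ReadModelSketch {

    // Hypothetical event carrying the complete aggregate state (first idea).
    public record ReadModelUpdatedEvent(String aggregateId, String title,
                                        String description, int itemCount) {}

    // Hypothetical client-specific read models: each one is a subset of the event.
    public record OverviewView(String title, int itemCount) {}
    public record DetailView(String title, String description, int itemCount) {}

    // One projection per client. The selector picks the fields that client
    // needs; the map stands in for a denormalized database table.
    public static final class Projection<V> {
        private final Map<String, V> store = new LinkedHashMap<>();
        private final Function<ReadModelUpdatedEvent, V> selector;

        public Projection(Function<ReadModelUpdatedEvent, V> selector) {
            this.selector = selector;
        }

        // Event handler: overwrite the stored view with the latest state.
        public void on(ReadModelUpdatedEvent event) {
            store.put(event.aggregateId(), selector.apply(event));
        }

        // Read side: a plain lookup by ID, no joins or views at read time.
        public V find(String aggregateId) {
            return store.get(aggregateId);
        }
    }

    public static void main(String[] args) {
        Projection<OverviewView> overview =
                new Projection<>(e -> new OverviewView(e.title(), e.itemCount()));
        Projection<DetailView> detail =
                new Projection<>(e -> new DetailView(e.title(), e.description(), e.itemCount()));

        ReadModelUpdatedEvent event =
                new ReadModelUpdatedEvent("order-1", "First order", "Two books", 2);
        overview.on(event);
        detail.on(event);

        System.out.println(overview.find("order-1"));
        System.out.println(detail.find("order-1"));
    }
}
```

With this shape, adding a client means adding one view type and one selector, but every change to the aggregate still publishes the full state, which is exactly the event store growth concern described above. The third idea would replace the single big event with several smaller ones, each feeding only the projections that care about it.
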
Have people on this list worked on something similar, or thought about similar problems/solutions?