I carefully read your blog post about Set Based Consistency Validation.
As far as I understand, Axon Framework ensures that commands targeting the same aggregate instance are not handled concurrently. Conversely, it is possible (for good reasons) that different aggregate instances handle commands in parallel.
The linked blog post suggests creating a look-up table which is updated by a subscribing event handler. Now think of two aggregate instances that check for a certain entry in this table at the very same time. At the moment they execute the query, the value is not yet contained in the lookup table…
When the first command handler finishes, its transaction is committed and the entry is put into the lookup table.
When the second command handler finishes, its transaction is committed as well and the entry is put into the lookup table. The result: either we have two entries (which is what we wanted to avoid) or we overwrite the existing entry (which is also not the desired behaviour).
We can mitigate this race condition by introducing a unique database constraint. With this in place, the second commit is rejected due to a constraint violation: some technical runtime exception is thrown and the command sender has to react to it.
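The race described above can be sketched in plain Java. This is a minimal stand-in, not Axon code: an in-memory map plays the role of the lookup table, and `putIfAbsent` plays the role of the unique constraint, where the insert itself is the check. All class and method names here are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for the lookup table; in reality this would be a
// database table with a UNIQUE constraint on the key column.
public class LookupTable {
    private final ConcurrentHashMap<String, String> entries = new ConcurrentHashMap<>();

    // Check-then-insert: two threads can both pass the containsKey check
    // before either inserts -- exactly the race described above.
    public boolean checkThenInsert(String key, String aggregateId) {
        if (entries.containsKey(key)) {
            return false;               // looks safe, but races with other threads
        }
        entries.put(key, aggregateId);  // the second writer silently overwrites
        return true;
    }

    // Atomic insert: the insert itself is the uniqueness check, analogous to
    // a UNIQUE constraint rejecting the second commit.
    public boolean tryInsert(String key, String aggregateId) {
        return entries.putIfAbsent(key, aggregateId) == null;
    }
}
```

With `tryInsert`, the second caller always learns it lost the race; with `checkThenInsert`, both callers can believe they won.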
What I wonder: does it really make sense to check for the existence of an entry in the lookup table within the command handler at all? Since the query alone is not sufficient, we need the unique constraint in the database anyway… and the command issuer has to handle that failure scenario in any case.
So what is the benefit of checking for the existence of a certain value with a query? The only safe approach is trial and error: insert the value and catch the constraint-violation exception.
Am I missing something in my reasoning?
This is indeed the way to handle it. Updating the blog and the code samples is still on my to-do list. Thanks for the heads-up!
We struggle with exactly the same problems. First of all, I would ask (as senior/lead developer) my product owner whether the uniqueness requirement is really that necessary. In our case, we are building a distributed (peer-to-peer-like) system, where checking for uniqueness at one peer node is not enough, because the same value might enter the system at a different peer node, entered by a different user at the same time. Once the peer nodes sync up, the query-model database would end up with a constraint violation. Our strategy is to allow such duplicates but provide a duplicate-detection service that notifies both users so they can remedy the situation. As an example, imagine vehicle plate numbers, which must be unique across all peer nodes. However, users make mistakes.
Another point on this topic: how would you check for uniqueness if you implemented the application in a non-DDD/non-Axon world? I can imagine two requests arriving at a controller method in two threads; the controller delegates to a service method (still two threads), which in turn writes to a database. The first request to successfully write to the database wins, and the second fails with an SQL constraint-violation exception. Basically, you offload the constraint check to the database engine. Otherwise you would need some kind of thread synchronisation at the service-method level.
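The "first writer wins" behaviour can be simulated without a database. This is a sketch under the assumption that an atomic map insert stands in for the database engine's constraint check; the names are illustrative, not from any framework.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Two concurrent "requests" try to insert the same value; exactly one wins,
// the other observes the equivalent of an SQL constraint violation.
public class FirstWriterWins {
    public static int race(String value) {
        ConcurrentHashMap<String, Boolean> table = new ConcurrentHashMap<>();
        AtomicInteger violations = new AtomicInteger();
        Runnable request = () -> {
            // putIfAbsent is atomic, like the engine enforcing UNIQUE on commit
            if (table.putIfAbsent(value, Boolean.TRUE) != null) {
                violations.incrementAndGet(); // stand-in for the SQL exception
            }
        };
        Thread a = new Thread(request);
        Thread b = new Thread(request);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return violations.get();
    }
}
```

However the two threads interleave, exactly one of them loses: the check is atomic, so there is no window in which both can believe the value is new.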
Let me suggest a very silly idea. Let’s have an aggregate representing a “global registry”, which is not event sourced but state based, and is a singleton. Command handling in aggregates provides locking and synchronisation for you. So in your command handler, use a command gateway to send a command that registers the supposed-to-be-unique value with the registry aggregate. If the command handler of the registry aggregate stores the value, your command handler may continue its logic and publish an event. If the registry call fails (exception on duplicate), your command handler probably fails as well, which bubbles up to the controller, which in turn sends a 4xx response to the client.
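The registry-aggregate idea can be sketched in plain Java. This is not Axon API: a `synchronized` method stands in for Axon's per-aggregate-instance serialisation of command handling, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the singleton "global registry" aggregate. Axon serialises
// commands per aggregate instance; here a synchronised method plays that role.
public class RegistryAggregate {
    public static class DuplicateValueException extends RuntimeException {
        public DuplicateValueException(String value) {
            super("Value already registered: " + value);
        }
    }

    private final Map<String, String> registered = new HashMap<>();

    // "Command handler": because handling is serialised, a plain
    // check-then-put is safe inside the aggregate -- no race window.
    public synchronized void handleRegisterValue(String value, String ownerId) {
        if (registered.containsKey(value)) {
            throw new DuplicateValueException(value); // bubbles up to the sender
        }
        registered.put(value, ownerId);
    }
}
```

The benefit over the database constraint is that the duplicate case surfaces as a domain exception at command-handling time, not as a persistence exception at commit time.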
That’s exactly what we do. The downside of this approach is that you get a very technical exception which indicates that a database constraint was violated. Unfortunately, this exception is thrown on commit, so you cannot catch it in your aggregate’s command handler but have to handle a persistence-specific exception in the command-sending component, which is likely to be a REST controller. Ugly.
I do not get why this is better than just inserting the key into a table with a unique constraint, either in the aggregate’s command handler or in a subscribing event handler.
You can always catch a generic SQL exception and wrap it in your own exception. All SQL engines provide a list of error codes from which you can deduce what went wrong and act accordingly.
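A minimal sketch of that translation, using only the JDK's `java.sql` types. The MySQL duplicate-key error code (1062) and the `23xxx` SQLState class for integrity-constraint violations are real, but the wrapper class and its name are hypothetical.

```java
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

// Translate a low-level SQL failure into a domain-level exception, so the
// caller does not need to know about JDBC details.
public class DuplicateEmailException extends RuntimeException {
    public DuplicateEmailException(Throwable cause) {
        super("Email address already in use", cause);
    }

    public static RuntimeException translate(SQLException e) {
        // SQLState class "23" covers integrity-constraint violations
        // across engines; the subclass check covers drivers that map
        // duplicate keys to a dedicated exception type.
        boolean duplicate = e instanceof SQLIntegrityConstraintViolationException
                || (e.getSQLState() != null && e.getSQLState().startsWith("23"));
        return duplicate ? new DuplicateEmailException(e) : new RuntimeException(e);
    }
}
```

The command-sending component can then catch `DuplicateEmailException` and map it to a 409 response, without leaking JDBC types into the web layer.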
Sure, but the exception is thrown when the transaction is committed… so you need to handle the exception in a component which is not part of the transaction…
Again, as you said, you need to ask: do I really need this strong consistency? Maybe it’s OK to have a duplicate email address for a moment, and better to compensate for it afterwards… that’s a business decision for sure.
The idea behind a separate aggregate for handling duplicates is that aggregate command handlers are synchronised by Axon Framework, so you don’t need to do anything on that part yourself.
However, the primary purpose of this aggregate is not to mimic database behaviour so that duplicates cannot happen. On the contrary, the aggregate is lenient by design: it allows duplicates but provides detection and correction mechanisms. The reason I suggested this idea in the first place is that we are building a sort of decentralised, distributed system where duplicates may happen on different nodes, so database constraints are not an option for us. We also have to check for duplicates in business attributes such as vehicle plate numbers. The issue we face here is that a user might enter a vehicle plate number with a mistake. The mistake goes unnoticed (it’s just a number) for some time, until another user tries to enter a new vehicle plate number which unfortunately matches the mistaken number entered previously. This would completely prevent the second user from entering correct data into our system, rendering it a bit unusable.
You might argue that typos in vehicle plate numbers do not happen, but imagine serial numbers carved into metallic products. These numbers become blurry and much less readable over time, and people tend to make mistakes.
In our product we face a tricky situation… do not allow duplicates, but do not prevent users from entering correct data… on a distributed system.