We have a situation where we are sending a query to a service A to get some data.
However, we are building another service B to replace it, and it will listen to the same queries as service A.
What we would like to know is if there is a way to turn off/turn on a query handler, like a feature flag.
For example, we are querying service B, but if we find a bug in it, we want to be able to turn it off and use service A without downtime. And after fixing the bug, we want to return to service B without downtime as well.
And we would also like to know if it is possible to have this kind of mechanism for event listeners and command handlers.
This seems more a deployment/infrastructure problem than a development problem to me.
I think disabling message listeners might not be such a great idea from a design point of view. You will certainly run into a moment when you forget to enable/disable certain listeners/handlers (especially dangerous for event listeners), with all kinds of undesired and strange behaviour as a result.
I think it is a safer and more reliable approach to deploy service B and take service A offline. If you discover a bug in service B, you can redeploy service A relatively quickly, rolling back to the previous logic. Since you mention it is a query service, I guess not much harm can be done. I assume your services are running as Docker containers, which keeps the downtime short when re-activating an older container to replace the new one.
If you really need no downtime at all, a suggestion worth evaluating could be to spin up service B and use an API gateway to route all traffic to service B before you take service A offline. If you discover a bug in service B, redeploy service A, reroute all traffic back to service A at the API gateway, and take the buggy service offline. It involves a certain risk though… Keep in mind e.g. for query handlers (from the docs):
In case multiple handlers are registered, it is up to the implementation of the Query Bus to decide which handler is actually invoked
Multiple query handlers for the same point-to-point query will very likely result in unpredictable behaviour.
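The gateway-level switch suggested above could be sketched roughly as follows. This is plain Java, not the API of any real gateway product; `QueryRouter`, `register`, `switchTo` and the backend names are hypothetical and only illustrate the idea that exactly one backend receives traffic at any time, so two handlers never compete for the same point-to-point query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical stand-in for a gateway route table; not a real gateway API.
final class QueryRouter {
    private final Map<String, Function<String, String>> backends = new ConcurrentHashMap<>();
    private final AtomicReference<String> active;

    QueryRouter(String initialBackend) {
        this.active = new AtomicReference<>(initialBackend);
    }

    // Make a backend (e.g. service A or service B) known to the router.
    void register(String name, Function<String, String> backend) {
        backends.put(name, backend);
    }

    // Atomically point all new requests at another registered backend.
    void switchTo(String name) {
        if (!backends.containsKey(name)) {
            throw new IllegalArgumentException("Unknown backend: " + name);
        }
        active.set(name);
    }

    // Forward the request to whichever single backend is active right now.
    String query(String request) {
        Function<String, String> backend = backends.get(active.get());
        if (backend == null) {
            throw new IllegalStateException("Active backend is not registered: " + active.get());
        }
        return backend.apply(request);
    }
}
```

Rolling back then amounts to calling `switchTo("A")` once service A is deployed again, and only afterwards taking service B offline.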
Thanks for your answer, and sorry for taking so long to reply.
Regarding your comment about disabling message listeners, I agree with you; it is too dangerous.
For the queries we came up with this solution.
Every service has a parameter in the database that tells whether the service is enabled or disabled.
This parameter is read by a query interceptor: if the service is disabled, we put a flag in the metadata with the value false; if it is enabled, the flag is true.
Then, in the service that issues the query (the client), we created a new QueryHandler that wraps the DefaultQueryHandler.
We override the query method and, instead of doing a normal query, we use the scatter-gather query.
Here we receive the responses from both services (the enabled one and the disabled one), we filter them, and then we return only the response from the service that is enabled.
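The client-side filtering step can be illustrated with a minimal plain-Java sketch. It does not use the Axon API; `FlaggedResponse` and `EnabledResponseFilter` are hypothetical names standing in for a scatter-gather result that carries the enabled/disabled flag from the metadata. Taking the first enabled response in arrival order when more than one is enabled is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical response envelope: the payload plus the flag that the
// responding service's interceptor put into the metadata.
final class FlaggedResponse {
    final String source;
    final String payload;
    final boolean enabled;

    FlaggedResponse(String source, String payload, boolean enabled) {
        this.source = source;
        this.payload = payload;
        this.enabled = enabled;
    }
}

final class EnabledResponseFilter {
    private EnabledResponseFilter() {
    }

    // Returns the first response (in list order, i.e. arrival order) whose
    // flag is true, or an empty Optional if every responder is disabled.
    static Optional<FlaggedResponse> firstEnabled(List<FlaggedResponse> responses) {
        return responses.stream()
                .filter(r -> r.enabled)
                .findFirst();
    }
}
```

With one response from a disabled service A and one from an enabled service B, `firstEnabled` yields the response from service B; the caller still has to decide what to do when the result is empty.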
We also thought about scanning all classes before the Spring Boot initialisation and removing the QueryHandler annotation from the methods. That way those queries would not be visible to Axon, but this solution is too invasive because the byte code of the class needs to be changed.
Seems a much more elegant solution to me as well.
I do see some potential issues I’d like to point out, though. I’ll mention them just so you can give them some thought as well.
- What happens if for some reason two versions of the same service are enabled in the database? This solution requires paying really close attention to “flag management” (which you certainly thought about in the first place). I see a risk of returning duplicate data in these circumstances…
- Before any service can operate, it needs to query your “flag database”. I assume you decided to build a separate service with its own database for this purpose. What will happen if this flag-management service goes offline or responds too slowly due to heavy system load? I can imagine this service will receive lots of requests. There seems to be a synchronous call between two services, which introduces a dependency from all of your business-related services on this flag-management service. This can be fine in an application architecture, but you need to be aware of the potential risks/issues it introduces.
- You always query all service versions (scatter-gather approach) and then filter out everything except the enabled service’s data before returning it to the caller. Doesn’t this introduce a major (and in this case useless) system load, since you more or less throw away 50% of the returned data (in a scenario with two service versions)? In other words, you query for data you know you won’t need (a bit like the YAGNI principle).
Thanks for your thoughts.
With the solution we implemented, if the query gateway receives more than one valid response, only the first one (the one that arrived first) is chosen. We assume that we always have this configured correctly (hopefully).
For your second point, the flag service lives inside the service itself. We decided to store the flag in the database because this way we do not need to restart the service (at the beginning we implemented this as an application parameter). We still have the risk of losing communication with the database, but the query would probably also fail if that communication is broken.
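The difference from an application parameter can be shown with a small plain-Java sketch: the flag is looked up again on every check instead of being read once at startup. `RuntimeServiceFlag` is a hypothetical name, and the `BooleanSupplier` stands in for whatever database read is actually used. A failing lookup is deliberately not caught here, so the exception propagates and the query fails as well.

```java
import java.util.function.BooleanSupplier;

// Hypothetical holder for the enabled/disabled flag of one service.
final class RuntimeServiceFlag {
    // Stand-in for the database read; invoked on every check, never cached.
    private final BooleanSupplier lookup;

    RuntimeServiceFlag(BooleanSupplier lookup) {
        this.lookup = lookup;
    }

    // Re-reads the stored value each time, so a change in the store takes
    // effect on the next query without restarting the service.
    boolean isEnabled() {
        return lookup.getAsBoolean();
    }
}
```

The trade-off is one extra lookup per query; whether that cost is acceptable depends on the query volume.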
For your last point, you are right. We are always duplicating the query and we only care about one of the responses. However, this solution is meant to solve a deployment problem where we introduce a new service that will replace an old one. With this we can switch between them easily. If no issues are found, the query handlers for the old service can be removed in future releases.