Axon Server: Query scalability


My application consists of 5 client apps and 1 Axon Server running in Kubernetes. Each client runs as 2 instances; Axon Server is a single-instance Standard Edition. (Yes, I know a clustered Enterprise Edition would be better, but we are a startup :-).)
All components are on their latest versions; we use Java 17 and Kotlin on ARM processors.

We make heavy use of queries. Functionally this has worked well so far, but we are now running into scalability (performance and stability) issues.
I think I understand the Axon setup pretty well now (~2 years in production), but I’m still struggling with the right configuration for Axon Server and the clients to get good query throughput and stability.

Our query load is bursty: at peak times, hundreds or thousands of queries may be started within a short period. Most queries are fast, but some are slow (> 10 seconds). As data volume and system load increase, we see more and more problems with queries (stream errors from Axon Server, clients that disconnect and restart). It simply feels like the Axon query functionality does not scale well. Maybe we have to redesign parts of our system because of this flaw/bug in Axon? Hopefully not!

Currently, I do not know how to approach these issues. I could not find meaningful hints in the logs and metrics provided by Axon Framework and Axon Server (I use the “standard” Grafana dashboards for Axon Server and clients).

  • What are the metrics I should look for?
  • What are the config settings to play with?
    • I know there is “query-threads” on the server side.
    • But what about the client side? Do I have to set query threads on both the server and the client side?
    • The docs do not say much about those settings.
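
For reference, this is roughly what I mean by those settings. The property names reflect my understanding of the Axon Server configuration and the Axon Framework Spring Boot configuration, and the values are placeholders rather than a recommendation, so please correct me if any of this is wrong:

```properties
# Axon Server side (axonserver.properties).
# Assumed property name; value is a placeholder.
axoniq.axonserver.query-threads=8

# Client side (application.properties of each Spring Boot app).
# Assumed property name; value is a placeholder.
axon.axonserver.query-threads=20
```

What I cannot tell is how these two values relate to each other, and which of them actually limits throughput during a query peak.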

Message size:

  • I know that the maximum message size is 4MB by default, and we have already increased it to 8MB because some queries can have large result payloads.
  • But how is the max-message-size configured properly?
    • Is it required that all clients and the server use the same max-message-size?
    • What happens if client and server use different settings?
  • I found nothing in the docs about this.
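
To make the question concrete, this is the kind of setup I am asking about. Again, the property names are my assumption of where the limit is configured on each side, and I have written 8MB as a plain byte count (8 * 1024 * 1024 = 8388608) because I am not sure which units each side accepts:

```properties
# Axon Server side (axonserver.properties); assumed property name.
axoniq.axonserver.max-message-size=8388608

# Client side (application.properties of each Spring Boot app); assumed property name.
axon.axonserver.max-message-size=8388608
```

Is it correct that every client and the server need the same value here, or is it enough to raise it only where the large responses are produced and consumed?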