We are running Axon Server SE via the standard Docker image provided by AxonIQ, on GKE. The issue is that our Axon Server sometimes errors out with various Java heap out-of-memory errors. The most recent one was:
2022-09-29 13:11:33.887 INFO 1 --- [grpc-executor-4] i.a.a.logging.TopologyEventsLogger : Application connected: rivile365-masterdata, clientId = 1@masterdata-769757699f-xwrk8, clientStreamId = firstname.lastname@example.org, context = default
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /data/java_pid1.hprof ...
Heap dump file created [251948801 bytes in 3.505 secs]
Exception in thread "http-nio-8024-Acceptor" java.lang.OutOfMemoryError: Java heap space
I have a heap dump ready if anyone is interested, but from what I gather this happened while one of our microservices was being redeployed (maybe during a serialization phase?). These errors are very common during microservice deployments, when new pods connect to Axon Server and the old ones disconnect one by one.
Anyway, I was wondering how to prevent this. Is there documentation on recommended JVM settings for heap/off-heap memory? I didn't find anything in the documentation. Does the memory usage of Axon Server depend on the number of connected microservices, and if so, how? How does memory usage depend on concurrent message count or message size?
I have also found that these distroless Docker images run with default JVM settings, which means Axon Server can use only 25% of all available memory for heap. In my opinion this could be way too low, especially considering that I didn't observe any memory decrease during server operation. Is any kind of garbage collection running on Axon Server at all?
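For context, one way we could override that 25% default from the pod spec is via `JAVA_TOOL_OPTIONS`, which any HotSpot JVM picks up at startup. This is only a sketch of what we are considering, not something we currently run; the 75% figure is an illustrative assumption, not a recommendation we found anywhere:

```yaml
# Hypothetical env entry for our Axon Server container spec.
# MaxRAMPercentage raises the heap ceiling from the 25% default to 75%
# of the container memory limit; the heap-dump flags match the behavior
# already visible in our logs (dump written to /data).
env:
  - name: JAVA_TOOL_OPTIONS
    value: "-XX:MaxRAMPercentage=75.0 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data"
```

Would a setting along these lines be safe for Axon Server, or does it need a certain amount of non-heap memory (direct buffers, memory-mapped event store segments) that would make a high heap percentage counterproductive?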
The Axon Server container has a 768MB memory resource limit defined. We have ~15 Spring Boot microservices connected and maybe a hundred different commands, queries, and events.
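For completeness, the relevant part of our pod spec looks roughly like this (only the 768Mi limit is our actual setting; the request value is an illustrative assumption). Note what the default heap sizing implies in our case:

```yaml
# Sketch of the container resources as described above.
# With the default -XX:MaxRAMPercentage=25.0, a 768Mi limit gives the
# JVM a maximum heap of roughly 768 / 4 = 192Mi.
resources:
  requests:
    memory: "768Mi"   # assumption: request equals limit
  limits:
    memory: "768Mi"
```

Is ~192Mi of heap simply too little for ~15 connected clients, or should this workload fit and the OOM point to something else?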