Monitoring Axon applications

Curious to hear what people have found works well for monitoring production Axon applications from an operations perspective, as distinct from a business process perspective.

There are a bunch of JVM-level things you can monitor independently of Axon, but have people found it valuable to keep track of Axon-specific statistics like event / command rates, number of active sagas, number of scheduled events, states of aggregates, etc.? If so, what does your monitoring setup look like? Do you maintain an operations-centric read model in a database alongside the read model you use for your business queries, do everything with log analysis, or something else?

In one of my projects, we use Metrics to record this kind of information. We let Metrics dump the values to a dedicated log file, which is then visualized in Splunk. We also expose the metrics via a URL on each service, which is used by a custom script that validates timings during load tests.
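
To make the idea concrete, here is a rough sketch of the pattern using only the JDK. It is not the Metrics library's API and not our actual setup: the class name OpsMetrics, its methods, and the counter names are all made up for illustration. It keeps named counters and renders them as a single key=value line, which is the sort of thing you could write to a log file for Splunk to pick up.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

/** Hypothetical stand-in for a metrics registry (assumption, not the Metrics library API). */
class OpsMetrics {
    // One thread-safe counter per metric name.
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Increment the named counter, creating it on first use. */
    void mark(String name) {
        counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    /** Current value of the named counter, or 0 if it was never marked. */
    long count(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }

    /** Render all counters as one key=value line, sorted by name. */
    String snapshotLine() {
        return new TreeMap<String, LongAdder>(counters).entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue().sum())
                .collect(Collectors.joining(" "));
    }
}
```

You would call mark(...) from wherever your application dispatches commands or handles events, and write snapshotLine() to the log on a schedule. The same snapshot could also be served from a URL for a load-test script to read. A real metrics library gives you rates and timers on top of plain counts, so treat this only as an outline of the approach.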

Cheers,

Allard