How scalable is Axon Server, actually?

All the out-of-the-box functionality that comes with using Axon Framework together with Axon Server is wonderful.
However, I have struggled to find exact information anywhere on how scalable Axon Server actually is. What are the actual numbers for the standard version? How many events or commands per second? How big can the event storage get? How far can you go with the standard version in production?

Does anyone have any insights or experience with this?

Hey Filip,
From what I’ve seen in the documentation and YouTube videos, scalability is provided with the Enterprise edition.

However, as a follow-up question: can we safely run replicas of Axon Server? If so, which volumes exactly would I need to bind for the event storage?

Hi, I have the same question about performance. I have an initial load that launches 500,000 events, but performance is poor.

Hi Filip,

Axon Server Enterprise adds high availability and fault tolerance on top of Axon Server Community edition. This means Axon Server Enterprise can be configured as a cluster (for example, 3 nodes). In a cluster you can afford to lose some nodes, as the others continue handling messages with zero downtime. An Axon Server Enterprise cluster uses the Raft algorithm for consensus, which means multiple servers in the cluster agree on values.
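To make "you can afford to lose some nodes" concrete: Raft-style consensus needs a strict majority of nodes to agree, so the number of failures a cluster can absorb follows from simple arithmetic. The sketch below only illustrates that arithmetic; the class and method names are hypothetical and this is not Axon Server code.

```java
// Minimal sketch of majority-quorum arithmetic for a Raft-style cluster.
// Names are hypothetical; this does not use or reflect any Axon Server API.
final class RaftQuorumSketch {

    // Smallest number of nodes that forms a strict majority of the cluster.
    static int majority(int clusterSize) {
        if (clusterSize < 1) {
            throw new IllegalArgumentException("clusterSize must be at least 1");
        }
        return clusterSize / 2 + 1;
    }

    // Number of nodes that can fail while a majority is still available.
    static int tolerableFailures(int clusterSize) {
        return clusterSize - majority(clusterSize);
    }

    public static void main(String[] args) {
        for (int nodes : new int[] {1, 3, 4, 5}) {
            System.out.printf("nodes=%d majority=%d tolerable failures=%d%n",
                    nodes, majority(nodes), tolerableFailures(nodes));
        }
    }
}
```

With 3 nodes the majority is 2, so one node can be lost. A fourth node does not add tolerance, which is why odd cluster sizes are the usual choice.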

In some way this enables horizontal scaling, but I personally believe that more benefit comes from vertical scaling (adding more memory, CPU, …).

Axon Server Community edition cannot run in cluster mode. You can still benefit from vertical scaling, and it would be great if the community could provide such benchmarks, or at least say which ‘numbers’ you would be interested in seeing. I would be happy to provide guidance on defining performance test cases and scenarios.

The first thing you will notice is that Axon Server itself will not reach its limit. The first limit you will hit is hard disk space :slight_smile:
Frans wrote an interesting blog post explaining how to optimize CQRS replay, for example. This is usually where we put more effort into tuning our system (and it is not Axon Server related).

Best,
Ivan

Hi Ivan,

Oh, I see. Alright then! Thank you for your answer.
Thank you for pointing to that blog post as well, I find it quite useful.

One question that comes to mind: to prevent a scenario where the server fails and all the message data is lost, is stopping Axon Server and copying its data the only way to keep a backup, or is there a more elegant solution?

Hi Filip,

The ‘most elegant’ solution would be to have the enterprise edition set up, using at least 3 Axon Server nodes.
In that scenario, the events will be stored on all three of your Axon Server instances, providing the fault tolerance that ensures you will not lose your events.
Unless, of course, all the Axon Server nodes run on a single piece of hardware, which fails for some reason.

Anyhow, the Axon Server SE solution would require more work on your part to ensure that your data isn’t lost.

If desired, you can always have a more face-to-face chat with AxonIQ developers to discuss your specific use case concerns if they do not fit the user group model.

Hope this helps you out Filip.

Cheers,
Steven

(newbie) Can we not configure Axon Community Server with a database that’s horizontally scalable?

Would not this be a solution to the scalability constraint?

What scale are you looking for? An AxonServer cluster can easily manage several billion messages per day, which is generally well beyond what a single application (or set of microservices in the same bounded context) would need.
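To put "several billion messages per day" into per-second terms, here is a rough conversion sketch. The post does not say exactly how many billions are meant, so the daily volumes below are illustrative assumptions only. The result is an average over 24 hours and says nothing about peak load. The class and method names are hypothetical.

```java
import java.time.Duration;

// Rough sketch: convert a daily message volume into an average per-second rate.
// The volumes used in main() are illustrative assumptions, not measured figures.
final class ThroughputSketch {

    // Average messages per second if the daily volume were spread evenly over 24 hours.
    static double averagePerSecond(long messagesPerDay) {
        long secondsPerDay = Duration.ofDays(1).getSeconds(); // 86,400 seconds
        return (double) messagesPerDay / secondsPerDay;
    }

    public static void main(String[] args) {
        long[] dailyVolumes = {1_000_000_000L, 3_000_000_000L};
        for (long perDay : dailyVolumes) {
            System.out.printf("%d messages/day is about %.0f messages/second on average%n",
                    perDay, averagePerSecond(perDay));
        }
    }
}
```

One billion messages per day works out to roughly 11,600 messages per second on average, so "several billion" implies sustained rates in the tens of thousands per second.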

When it comes to the number of messages stored, AxonServer’s speed is independent of the total number of events stored.

Note that clustering of AxonServer isn’t done for scalability, but for availability. If you’re looking for scalability, I’m very curious about the numbers that you’re potentially looking to get out of AxonServer. I’m pretty sure there is a way.

Cheers,

Allard

Wow! Thanks for your quick reply.

I’ve been following your work closely and really admire how you practice. I especially like the discipline of live coding to illustrate ease of adoption, and your joy and humility in so doing – especially when things don’t work out as expected.

I am sold on the stack and will be building my first app for production starting Monday.

What you are sensing in my question is how to obtain scalability and/or high availability without adopting the enterprise server. I work in an organization that makes it impossible to adopt paid 3rd-party services, the value and return-on-investment notwithstanding.

We suffer from “not invented here” syndrome, and right now, we are retreading the road that you – and LightBend’s lagom framework, Vaughn Vernon’s vLingo, and Chris Richardson’s Eventuate – have traveled for the last 5-10 years. We reinvent the wheel with often disastrous consequences.

I’ll have a better idea of scalability numbers for the app we’re starting after we do some analysis next week. I’ll post back here then.

In the meantime, I know that Axon Server Community edition is vertically scalable. Do you have benchmarks that provide upper bounds on event volume and performance?

Is it possible to introduce some means of failover or redundancy with the community server?

Cheers!

Hi Wil,

Thanks for the kind words. I’m doing my best to contribute towards a world that deals with complex software better.

The community server doesn’t support HA unfortunately. That’s where we drew the line :wink:

The alternative would be to use Axon Framework’s EmbeddedEventStore, which would allow you to use a database (relational and MongoDB are supported out of the box) to store the events.
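As a minimal sketch of what that wiring could look like with the JPA storage engine, assuming Axon Framework 4.x on the classpath: the class and method names below are hypothetical, and the `EntityManagerProvider`, `TransactionManager` and `Serializer` instances are assumed to be supplied by your own application setup.

```java
import org.axonframework.common.jpa.EntityManagerProvider;
import org.axonframework.common.transaction.TransactionManager;
import org.axonframework.eventsourcing.eventstore.EmbeddedEventStore;
import org.axonframework.eventsourcing.eventstore.EventStorageEngine;
import org.axonframework.eventsourcing.eventstore.jpa.JpaEventStorageEngine;
import org.axonframework.serialization.Serializer;

// Hypothetical helper class; assumes Axon Framework 4.x builder APIs.
final class EmbeddedEventStoreSketch {

    // Builds an event store that persists events in a relational database through JPA
    // instead of connecting to Axon Server.
    static EmbeddedEventStore jpaBackedEventStore(EntityManagerProvider entityManagerProvider,
                                                  TransactionManager transactionManager,
                                                  Serializer serializer) {
        EventStorageEngine storageEngine = JpaEventStorageEngine.builder()
                .eventSerializer(serializer)      // serializes event payloads and metadata
                .snapshotSerializer(serializer)   // serializes aggregate snapshots
                .entityManagerProvider(entityManagerProvider)
                .transactionManager(transactionManager)
                .build();

        return EmbeddedEventStore.builder()
                .storageEngine(storageEngine)
                .build();
    }
}
```

With this approach, availability and scaling of the event storage become a property of the database you choose, and you give up the storage that Axon Server provides.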

An organization that makes it impossible to use paid 3rd party services? Do you work for a not-for-profit organization? In that case you should definitely contact us to discuss options.

Cheers,

Allard

You’re welcome. I’m a fan and I sing your praises. Keep up the great work.

Sadly, no. There’s really no excuse. I’ll buy you a cup of coffee one day and provide you the details offline.

What I will do however is assert the capability for high-availability. By the time the customer wishes to enhance their deployment, I will let them demand the enterprise version from my organization. They would be foolish to say no.

On another topic, I did not see a more sophisticated getting-started example than the ones that deal with only a single aggregate. A simple example with multiple bounded contexts, multiple aggregates within a single bounded context, and different aspects of their context maps would be very useful. Otherwise, newbies will have to figure things out for themselves the hard way.

A great example I will suggest is Matt Stine’s online pizza shop example.

It would be great to see either you or @Frans_vanBuul show us how it is done.

Hi Wil,

Thanks for the blog-post/video suggestion.
I’ll see what we can do at AxonIQ to provide such a thing for the community.

Cheers,
Steven

@Steven van Beelen @Allard Buijze

That would be awesome. Here are some supporting links:

(1) the pizza shop domain; and

(2) an event storm of the bounded contexts.

Showing how this is done in Axon would go a long way toward adoption in my organization.

Cheers!