Distributed Command Bus: performance considerations

I have a query on the Distributed Command Bus.
Let’s say I have a Distributed Command Bus connecting my 3 nodes via JGroups. Some of the commands will then be re-routed between the nodes based on the Routing Key. What is the performance penalty of this command re-routing between the nodes?

Are there any known statistics around this?

Regards,
Vikram

Hi Vikram,

The most important thing is that there is an extra “hop” in roughly 66% of the cases: with 3 nodes and commands spread evenly by their routing key, only about 33% of commands arrive at the node that will handle them right away. How expensive that extra hop is depends on your exact setup (infrastructure and configuration) more than anything else. Where possible, Axon reuses the serialized form of messages, so there is very little (de)serialization overhead.
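For reference, the kind of setup we are talking about looks roughly like this. It is only a sketch using the Axon 3-style JGroups connector API; the exact package names and constructor signatures differ between Axon versions, so treat them as illustrative rather than definitive:

```java
import org.axonframework.commandhandling.CommandBus;
import org.axonframework.commandhandling.SimpleCommandBus;
import org.axonframework.commandhandling.distributed.DistributedCommandBus;
import org.axonframework.jgroups.commandhandling.JGroupsConnector;
import org.axonframework.serialization.xml.XStreamSerializer;
import org.jgroups.JChannel;

public class CommandBusSetup {

    public DistributedCommandBus distributedCommandBus() throws Exception {
        // The local segment handles the commands that are routed to *this* node.
        CommandBus localSegment = new SimpleCommandBus();

        // JGroups channel; "tcp.xml" is a placeholder for your protocol stack file.
        JChannel channel = new JChannel("tcp.xml");

        // In Axon 3 the JGroupsConnector acts as both the CommandRouter and the
        // CommandBusConnector, so it is passed in twice below (signature may differ per version).
        JGroupsConnector connector =
                new JGroupsConnector(localSegment, channel, "my-axon-cluster", new XStreamSerializer());

        DistributedCommandBus commandBus = new DistributedCommandBus(connector, connector);
        connector.connect();
        return commandBus;
    }
}
```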

Unfortunately, I am not aware of any comparisons. For us, doing such routing was not a matter of choice. Not doing the routing would mean there is no possibility to actively cache aggregate state, and it could result in a higher number of concurrency conflicts due to simultaneous command handling on different nodes.

Hope this helps.
Cheers,

Allard

Thanks Allard. Makes sense.

One more query… If I have to increase the number of nodes at runtime to scale up, do I have to create a parallel setup, bring it online, and then retire the old setup?
Or is there a way to dynamically increase the number of nodes in the existing setup?

Regards,
Vikram

Rewording my query... Suppose in production I have 3 nodes of my service connected by JGroups using the Distributed Command Bus.
If one more node needs to be added, can it be done without bringing down the services?
Or do I have to start a new cluster in parallel and then retire the old one? Please advise.

Hi Vikram,

It’s okay for your system to add and remove nodes from the group ad hoc.
So, you don’t have to close down your cluster while you’re starting up a new one.
I’ll try to give you a little background on why it’s okay to do so:

When you use Axon’s JGroupsConnector for command routing and sending in your DistributedCommandBus, the underlying mechanism that decides which node picks up a command is a consistent hash ring. You can envision the consistent hash ring as an actual ring, where each node in your cluster gets an equal portion of the ring. When a command needs to be routed, a hash is computed over a field in the command message (typically the Aggregate Identifier, which is what Axon’s default AnnotationRoutingStrategy uses). That hash falls on a specific spot on the ring, which, as described earlier, points to a specific node in your cluster.
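To make the routing key concrete: it is typically taken from the field annotated with @TargetAggregateIdentifier. A minimal example, assuming the Axon 3 package layout and a hypothetical ConfirmOrderCommand (in Axon 4 the annotation lives in org.axonframework.modelling.command):

```java
import org.axonframework.commandhandling.TargetAggregateIdentifier;

public class ConfirmOrderCommand {

    // The AnnotationRoutingStrategy uses this field's value as the routing key,
    // so every command for the same aggregate hashes to the same node.
    @TargetAggregateIdentifier
    private final String orderId;

    public ConfirmOrderCommand(String orderId) {
        this.orderId = orderId;
    }

    public String getOrderId() {
        return orderId;
    }
}
```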
Additionally, the consistent hashing algorithm allows for easy addition and removal of nodes, as the ring would just be repartitioned over the new set of known nodes in your cluster.
Thus, you’re not required to close down a complete cluster of nodes if you want to adjust the number of nodes in the system.
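Purely to illustrate why that repartitioning is cheap, here is a toy sketch (this is not Axon’s actual ConsistentHash implementation; real implementations typically place multiple virtual points per node on the ring for a more even spread):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy consistent hash ring: nodes are placed on a ring of hash values and a routing
// key is assigned to the first node at or after its own hash (wrapping around).
public class ToyHashRing {

    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addNode(String node) {
        ring.put(hash(node), node);   // a new node only claims one segment of the ring
    }

    public void removeNode(String node) {
        ring.remove(hash(node));      // its segment falls to the next node on the ring
    }

    public String nodeFor(String routingKey) {
        SortedMap<Long, String> tail = ring.tailMap(hash(routingKey));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private long hash(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                                         .digest(value.getBytes(StandardCharsets.UTF_8));
            long position = 0;
            for (int i = 0; i < 8; i++) {
                position = (position << 8) | (digest[i] & 0xFF);   // first 8 bytes as ring position
            }
            return position;
        } catch (Exception e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Adding a fourth node then only claims one new segment of the ring; routing keys outside that segment keep resolving to the node they were already on, which is why nodes can come and go without a full restart.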

Do note that this will only work if all the nodes can access the same aggregate repository.
When event sourcing, that means the same event store. If you’re not event sourcing, whatever repository stores your aggregates should be reachable by all nodes.

Additionally, the consistent hash ring is also used by the SpringCloudCommandRouter implementation, which gives you the same guarantee that you can add and remove nodes freely.
If you implement your own CommandRouter, this is obviously not the case :slight_smile:

Hoping this gives you some insight!

Cheers,

Steven

Hi Steven,

But my concern is more around the JGroups piece. For instance, right now, when we start our service on Node1 (IP1), we specify jgroups.tcpping.initial_hosts=[PORT], [PORT] in the start script on each node, and this is how JGroups connects the 3 nodes.
So, if I add a node 4 (IP4) at runtime, would it be able to join the existing cluster, or will I have to modify each service to specify jgroups.tcpping.initial_hosts=[PORT], [PORT], [PORT] and then restart?

This is more of a JGroups question, so my apologies for bugging you with this.

Regards,

Vikram

Hi Vikram,

The initial_hosts setting is just meant to connect to the cluster. As soon as a connection is established, full cluster information is shared over that connection. In other words: you only need to specify some of the hosts in the initial_hosts setting, not all of them. However, the more you configure, the bigger the likelihood that a new node can find a member to join the cluster.
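So a new node 4 only needs a couple of the existing members in its start script to find the cluster. As a rough sketch (the IPs, port, stack file name and cluster name are placeholders, and the TCPPING section of your stack XML is assumed to reference the jgroups.tcpping.initial_hosts property):

```java
import org.jgroups.JChannel;

public class JoinExistingCluster {

    public static void main(String[] args) throws Exception {
        // Equivalent of the -Djgroups.tcpping.initial_hosts=... entry in the start script.
        // Listing just two of the existing members is enough for node 4 to find the cluster.
        System.setProperty("jgroups.tcpping.initial_hosts", "IP1[7800],IP2[7800]");

        JChannel channel = new JChannel("tcp.xml");  // placeholder protocol stack file
        channel.connect("my-axon-cluster");          // joins the group; full membership is shared afterwards
    }
}
```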

Cheers,

Allard

Perfect. Thanks so much.

Regards,
Vikram