How dynamic do you want the scaling to be? If your environment is fairly static (just scaled beyond a single instance), you could use an approach similar to the one Axon uses in the TrackingEventProcessor:
have all nodes receive a copy of each (relevant) event, and have each node process only the Sagas assigned to it. With 2 nodes, one strategy would be for one node to handle events for Sagas with even identifiers and the other node to handle the odd ones.
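To make the even/odd idea concrete, here is a minimal sketch of such an ownership check. It is not Axon API: the class name SagaPartitioner, the owns method, and the idea of giving each node a fixed index are all assumptions made for illustration. It also generalizes "even/odd" to a hash of the Saga identifier modulo the number of nodes.

```java
import java.util.Objects;

/** Hypothetical helper (not part of Axon): decides whether this node owns a given Saga. */
final class SagaPartitioner {
    private final int nodeIndex; // 0-based index of this node (assumed to be configured per node)
    private final int nodeCount; // total number of nodes; assumed fixed, i.e. a static environment

    SagaPartitioner(int nodeIndex, int nodeCount) {
        if (nodeCount <= 0 || nodeIndex < 0 || nodeIndex >= nodeCount) {
            throw new IllegalArgumentException("nodeIndex must be in [0, nodeCount)");
        }
        this.nodeIndex = nodeIndex;
        this.nodeCount = nodeCount;
    }

    /** Returns true if events for this Saga should be handled on this node. */
    boolean owns(String sagaIdentifier) {
        Objects.requireNonNull(sagaIdentifier, "sagaIdentifier");
        // String.hashCode() is specified by the JDK, so every node computes the same value.
        // floorMod keeps the result non-negative even when the hash code is negative.
        return Math.floorMod(sagaIdentifier.hashCode(), nodeCount) == nodeIndex;
    }
}
```

Each node would receive every event, look up the matching Sagas, and invoke only those for which owns(...) returns true. With nodeCount set to 2 this reduces to the even/odd split described above.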
When using the competing consumer approach, you may run into concurrency issues, even when using (distributed) locks. This is because one event may change the state of a Saga, changing its associations, while another event (one that logically comes after the first) is skipped because those changes are not yet visible. Even if you don’t change association values, this may still occur when starting new instances.
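The interleaving described above can be replayed deterministically with a toy model. This is only a sketch: AssociationRaceSketch and its methods are hypothetical, and the "committed" and "pending" maps stand in for whatever shared store and transaction boundaries the real setup uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical, simplified model of a shared association store; not Axon's API. */
final class AssociationRaceSketch {
    // Associations visible to every consumer: association value -> saga identifier
    private final Map<String, String> committed = new HashMap<>();
    // Changes made by a consumer that has not committed yet
    private final Map<String, String> pending = new HashMap<>();

    void associateUncommitted(String value, String sagaId) {
        pending.put(value, sagaId);
    }

    void commit() {
        committed.putAll(pending);
        pending.clear();
    }

    /** Looks up a Saga by association value, seeing only committed changes. */
    Optional<String> findSaga(String value) {
        return Optional.ofNullable(committed.get(value));
    }

    /** Replays the problematic interleaving; returns true if the second event found no Saga. */
    static boolean secondEventSkipped() {
        AssociationRaceSketch store = new AssociationRaceSketch();
        // Consumer A handles event 1, which associates saga-1 with "order-42"...
        store.associateUncommitted("order-42", "saga-1");
        // ...but before A commits, consumer B handles event 2 for "order-42".
        boolean skipped = !store.findSaga("order-42").isPresent();
        // A commits only now; event 2 has already been discarded by B.
        store.commit();
        return skipped;
    }
}
```

In this model a lock on the Saga instance would presumably not help, because the second consumer never finds a Saga to lock in the first place.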
Long story short: to work reliably with distributed Sagas, use a TrackingEventProcessor.