When it comes to designing sagas for performance, the main thing I’ve found is to keep them small. That’s usually pretty easy for a saga that manages the operation of a single aggregate, like Benoit describes, and harder when the saga needs to coordinate the activity of a number of aggregates. Whenever possible, prefer lots of little sagas over a small number of big ones.
Sometimes you really do need big sagas that coordinate activity across a bunch of otherwise independent aggregates. In my experience, getting good performance out of a big saga boils down to storing as little data as possible: just what it needs to do its work, no more. In the worst case, every non-transient field in the saga class has to be serialized and deserialized for each event the saga handles, and you don’t want to waste CPU cycles encoding and decoding data you’ll never need. For example, don’t store an aggregate’s full details in the saga if you can do everything you need with just the aggregate ID.
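To make that concrete, here’s a minimal sketch in plain Java (no particular framework; the class and field names are made up for illustration). The saga keeps only the IDs it needs to route commands, and marks a derived value `transient` so Java serialization skips it:

```java
import java.io.Serializable;

// Hypothetical order-fulfilment saga, for illustration only.
public class OrderFulfilmentSaga implements Serializable {

    // Store only the IDs the saga needs to correlate events and
    // address commands to the right aggregates...
    private String orderId;
    private String shipmentId;

    // ...not the whole aggregate state. A field like the one below
    // would be serialized and deserialized on every event:
    // private Order order;   // avoid: full aggregate snapshot

    // Derived/cacheable values can be transient, so serialization
    // skips them entirely and they're recomputed on demand.
    private transient String cachedShippingLabel;

    public void onOrderPlaced(String orderId) {
        this.orderId = orderId;
    }

    public void onShipmentCreated(String shipmentId) {
        this.shipmentId = shipmentId;
    }

    public String getOrderId() { return orderId; }
    public String getShipmentId() { return shipmentId; }
}
```

The point isn’t the specific names, it’s the shape: the persistent footprint of the saga is two small strings, regardless of how large the underlying aggregates are.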
One not-always-obvious thing to watch out for is that data requirements can change over the life of a saga. You might need a particular chunk of data at the beginning of a workflow but not later. In that case it can be beneficial to release the saga’s copy of the data once it’s been used, e.g., by removing it from a hashmap or nulling out the field that’s holding a reference.
Profile first. You don’t want to spend effort optimizing something that is not a significant bottleneck.