Sagas wait continuously on saga creation when using SimpleEventScheduler

Hi Allard,
We need your help understanding an issue with our application. It works fine for some time, but after that we see threads waiting continuously on saga execution. The event handlers are idle, yet commands are still not reaching the application.

Thanks,
Vijaya

in-mem-quartz-2-vote.PNG

in-mem-quartz-1-vote.PNG

Hi Vijaya,

Can you tell me which version of Axon you are using, and which version of the Disruptor (the dependency is probably transitive via Axon, but may have been overridden)?
Is it correct that you have configured 5 concurrent executors on the async saga manager?

Cheers,

Allard

Hi Allard,
Here are the version details:
Axon - 2.3.1
Disruptor - 3.2.1

Yes, 5 concurrent executors are configured for saga event processing, while normal event processing uses 30 threads.

We noticed that internally scheduled events are also pushed to the Disruptor, so after some time the application has to wait for slots in the saga Disruptor to become available. I don’t think making the Disruptor much larger will help; instead we need to look at how quickly events are processed, so that space is freed up for new events.
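
As a rough illustration of why a bigger buffer only postpones the wait, here is a small Java sketch. It is not Axon or Disruptor code: `RingBufferFillDemo`, `secondsUntilFull` and all of the rates are hypothetical, and the model is simply a bounded buffer that fills whenever events are published faster than they are consumed.

```java
public class RingBufferFillDemo {

    /**
     * Simulates a bounded buffer in one-second ticks and returns the number of
     * seconds until a publisher would first have to wait for a free slot,
     * or -1 if the consumers keep up with the publishers.
     * All parameters are illustrative assumptions, not measured values.
     */
    public static long secondsUntilFull(int capacity, int publishedPerSecond, int consumedPerSecond) {
        if (publishedPerSecond <= consumedPerSecond) {
            return -1; // the backlog never grows, so the buffer never fills
        }
        long used = 0;
        long seconds = 0;
        while (used < capacity) {
            used += publishedPerSecond - consumedPerSecond; // net backlog growth per tick
            seconds++;
        }
        return seconds;
    }

    public static void main(String[] args) {
        // Hypothetical rates: 15 events/s published, 12 events/s consumed.
        System.out.println(secondsUntilFull(1024, 15, 12));   // prints 342
        System.out.println(secondsUntilFull(65536, 15, 12));  // prints 21846
        // Consumers that keep up never fill the buffer, whatever its size.
        System.out.println(secondsUntilFull(1024, 15, 15));   // prints -1
    }
}
```

With these made-up numbers, a buffer 64 times larger moves the first wait from roughly 6 minutes to roughly 6 hours, but the buffer still fills. Only a consume rate at or above the publish rate avoids the wait.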

Please find attached one more screenshot for the hotspot issue we are facing.
Please comment.

Thanks,
Vijaya

in-mem-quartz-3-hotspots.PNG

Allard,
Just an update on this earlier remark:

"We noticed that internally scheduled events are also pushed to the Disruptor, so after some time the application has to wait for slots in the saga Disruptor to become available. I don’t think making the Disruptor much larger will help; instead we need to look at how quickly events are processed, so that space is freed up for new events."

This observation applies when we use the Quartz scheduler to process scheduled events.

Thanks,

Vijaya

To be more precise: for Quartz scheduling we are using Oracle 11g on the backend.

On the processing side the application is quite simple. Event handlers either insert data into the Oracle DB or push a JMS request to internal components to fetch further data needed for request processing. The event handlers registered in the saga are mainly responsible for setting timeouts using scheduled events. These timeouts are on average no more than 5 seconds.

We see degraded application performance with Quartz scheduling backed by the Oracle DB, compared to SimpleEventScheduler. The application processes requests more slowly the longer the load runs. We fire only 15 requests per second, yet the response time keeps increasing.

Do you have any recommendations or suggestions? Is there anything we can do on the configuration or application side to improve performance and sustainability?

Please suggest.

Thanks,
Vijaya

In the attached screenshot, scheduled events use in-memory scheduling (SimpleEventScheduler).

One very strange behavior we noticed: some time after load is pushed to the application, processing on the application side becomes delayed. As a result, after a while messages are picked up by the listener but not forwarded to the proper application node via JGroups.

These messages are not processed even after we stop pushing new requests to the application. The profiler shows no deadlocks, but all threads, including the JGroups and saga threads, are in a waiting state.

The application process is running but not processing any requests.

Hi Vijaya,

The async saga manager uses the Disruptor to handle events in the threads available.
It seems that some threads are blocked waiting for a “vote” on a saga event handler that “may” create a new instance. Since an instance should only be created when no existing sagas for that event are found (default behavior), threads need to wait for each other to ensure the process is correct.
If you know beforehand that an event should always create a new instance, you can indicate so by setting forceNew = true on the @StartSaga annotation. That will work around this problem.
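
To make the difference between the two start behaviors concrete, here is a toy model in plain Java. It is not Axon's implementation: `SagaCreationPolicyDemo`, `CreationPolicy` and `SagaRegistry` are hypothetical names used only for illustration. In Axon 2.x the "always create" behavior is selected with `@StartSaga(forceNew = true)`, while the default corresponds to "create only if none found".

```java
import java.util.HashMap;
import java.util.Map;

public class SagaCreationPolicyDemo {

    /** Hypothetical mirror of the two start behaviors described above. */
    public enum CreationPolicy { IF_NONE_FOUND, ALWAYS }

    /** Toy stand-in for a saga repository: counts saga instances per association key. */
    static final class SagaRegistry {
        private final Map<String, Integer> sagasPerKey = new HashMap<>();

        /** Handles a saga-starting event and returns true if a new saga was created. */
        synchronized boolean handleStartEvent(String associationKey, CreationPolicy policy) {
            int existing = sagasPerKey.getOrDefault(associationKey, 0);
            // IF_NONE_FOUND must complete the lookup before it can decide;
            // that is the step concurrent handler threads have to agree on.
            boolean create = policy == CreationPolicy.ALWAYS || existing == 0;
            if (create) {
                sagasPerKey.put(associationKey, existing + 1);
            }
            return create;
        }

        synchronized int sagaCount(String associationKey) {
            return sagasPerKey.getOrDefault(associationKey, 0);
        }
    }

    /** Feeds the same start event several times and returns how many sagas exist afterwards. */
    public static int sagasAfter(CreationPolicy policy, int events) {
        SagaRegistry registry = new SagaRegistry();
        for (int i = 0; i < events; i++) {
            registry.handleStartEvent("order-42", policy); // hypothetical association value
        }
        return registry.sagaCount("order-42");
    }

    public static void main(String[] args) {
        System.out.println(sagasAfter(CreationPolicy.IF_NONE_FOUND, 3)); // prints 1
        System.out.println(sagasAfter(CreationPolicy.ALWAYS, 3));        // prints 3
    }
}
```

In this model the "always" policy never depends on the outcome of the lookup, which is why forcing a new instance can sidestep the waiting described above. It only fits when each start event really should get its own saga.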

Meanwhile, I am trying to understand a bit more about the circumstances in which this is happening. Do you have an indication of the number of events you’re firing at the sagas?

Cheers,

Allard

Hi Allard,
Here are the details you asked for:

In the application we do have @StartSaga on the event handler responsible for saga creation. We have 11 saga event handlers in total, including those for scheduler events and for events triggered by commands.

We also want to understand how to maintain affinity, so that scheduler events are processed by the same node that created the job. Currently scheduler events are configured as Quartz jobs.

Thanks,

Hi Allard,
We tried upgrading the Disruptor jar to the latest available version, but the application still blocks/waits at the saga level and does not take any further requests. The issue does not show up immediately; it appears after the application has been running for some time.

Do you have any configuration recommendations to avoid this issue?

We also want to check the number of saga configurations per SagaManager instance.

Thanks,
Vijaya