Event Handler called twice with the same event

Hi everybody,
I’m facing some problems using TrackingProcessors.

Seems that sometimes, some handlers are called twice.
It happens expecially if the handler method is very fast, this made me suppose it is a critical race issue.
It happens only where there is a single event to process.

My axonframework version is 3.1, spring boot version 1.5.7.RELEASE.

I’m using an EmbeddedEventStore persited on MySql Database.
The event handler method doesn’t persist anything on db.

Temporarily, I solved the problem using an interceptor:

`


@Slf4j
public class PreventEventDuplicationInterceptor implements MessageHandlerInterceptor<EventMessage<?>> {

    private String lastEventIdentifier = "";

    @Override
    public Object handle(UnitOfWork<? extends EventMessage<?>> unitOfWork, InterceptorChain interceptorChain) throws Exception {

        Message<Object> message = (Message<Object>) unitOfWork.getMessage();
        String eventName = message.getPayloadType().getSimpleName();
        String eventIdentifier = message.getIdentifier();
        unitOfWork.onCommit(u -> lastEventIdentifier = eventIdentifier);

        if (lastEventIdentifier.equals(eventIdentifier)) {
            log.error("Duplication in " + Thread.currentThread().getName() + " event " + eventName + " " + eventIdentifier + "[" + this + "]");
            return null;
        }

        return interceptorChain.proceed();
    }
}

`

Anyone else is experiencing the same issue?

Thanks
s

I am also seeing the same behavior with axon 3.1.1 and spring boot 1.5.9.RELEASE. I’m not sure if my processors or SubscribingEventProcessor or TrackingEventProcessor. I believe they are subscribing because I’ve not specifically configured the processors…I’ve simply configured the aggregate repositories. I’m still totally new to axon so there are many disconnects in my understanding/comprehension.

I’m simply debugging aggregate tests (through AggregateTestFixture) and when I put a break point in one of my @EventSourcingHandler methods I notice that the same event is coming in twice in response to a single command.

Hi Troy, Sara,

in tests, events do come in twice, and that’s by design. After executing the test scenario, Axon will recreate the aggregate another time, this time based only on events, to verify that the state of the aggregate is identical. Axon uses this to ensure state is not modified (accidentally) outside of EventSourcingHandlers.

Sara, with “it only happens when there is a single event to process”, do you mean that the processor is just starting and processing a single event, or it is still polling at the head of the event stream, and then a single event arrives?

Cheers,

Allard

Ah, thanks Allard! That makes sense.

Hi Allard,

I mean that the processor is still polling at the head of the event stream, and then a single event arrives.
I hope to have some time during this weekend to try to reproduce it in a sample project.

Thanks
s

I have READ-COMMITTED isolation level
BTW, It seems not related to isolation level, it happens also with repeteable read and serializable.

Hi Sara,

are you using the JpaEventStorageEngine or the JdbcEventStorageEngine?

What I’m curious about, is where the duplicate appears. It sounds like you’re able to reproduce it. Can you share that code?
It could show whether the duplicate appears in the EventStorageEngine itself (returning the same event as part of 2 batches), or whether it is the EmbeddedEventStore, where an issue arises in the combining and switching of streams.

Thanks.

Allard

Hi Allard,
I’m using the JpaEventStorageEngine.

I’m trying to find a way to reproduce the case. Seems that if the new events are more than one, it doens’t happen.
I suppose it happens when the size of the last batch is = 1 and only if the processor is behind global cache.

As soon as I find a way to make it replicable I send you the code.

Many thanks

Hi Allard,

I’ve isolated what happens in EmbeddedEventStore for a single processor thread affected by the problem.

When I have 2 or more events:

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)
TOKEN not updated - returned event at line 395
Returned Event: 38a9179f-69d3-416a-8e18-0456deb3743e (Tracking Token GapAwareTrackingToken{index=526485, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]})

Called peekPrivateStream (allowSwitchToTailingConsumer = true)
LastToken updated at GapAwareTrackingToken{index=526486, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]}
Return event at line 383
Returned Event: 03945f7e-9c5f-4ea8-85e5-17a43a3dc550 (Tracking Token GapAwareTrackingToken{index=526486, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]})

Called peekPrivateStream (allowSwitchToTailingConsumer = true)
LastToken updated at GapAwareTrackingToken{index=526487, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]}
Return event at line 383
Returned Event: 2db46854-351e-4d00-8610-19b63e2f6fe3 (Tracking Token GapAwareTrackingToken{index=526487, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]})

Called peekPrivateStream (allowSwitchToTailingConsumer = true)
LastToken updated at GapAwareTrackingToken{index=526488, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]}
Return event at line 383
Returned Event: 97ab94e8-5d57-414c-977e-14d25f9c1864 (Tracking Token GapAwareTrackingToken{index=526488, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]})

Called peekPrivateStream (allowSwitchToTailingConsumer = true)
LastToken updated at GapAwareTrackingToken{index=526489, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]}
Return event at line 383
Returned Event: 45cbfaec-49c5-4c51-96f8-b5edd053f2f8 (Tracking Token GapAwareTrackingToken{index=526489, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]})

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)

When I have a single event

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)
TOKEN not updated - returned event at line 395
Returned Event: 9b73fe58-5269-4d8c-acb4-f71e28ea2580 (Tracking Token GapAwareTrackingToken{index=526492, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]})

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)
JpaEventStorageEngine.fetchTrackedEvents => Result size: 1. (lastToken = GapAwareTrackingToken{index=526491, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]}
LastToken updated at GapAwareTrackingToken{index=526492, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]}
Return event at line 383
Returned Event: 9b73fe58-5269-4d8c-acb4-f71e28ea2580 (Tracking Token GapAwareTrackingToken{index=526492, gaps=[526212, 526213, 526219, 526228, 526252, 526336, 526341, 526342, 526403, 526404, 526407, 526429]})

Called peekPrivateStream (allowSwitchToTailingConsumer = true)

Called peekPrivateStream (allowSwitchToTailingConsumer = false)

In this case the same event is returned twice.
I hope this can be helpful.

Thanks
s

Hey there,

Did anyone find the cause for and a solution to this?

I just stumbled on the issue yesterday as well. Turns out we’ve been having the issue for months in one service, but since it is idempotent enough we never realised it. Yesterday it cropped up in another service which was not idempotent. For that service, the issue disappeared after removing the gaps from its TokenEntry and restarting the service (not sure which of the two actually fixed the issue), but we haven’t tried finding the cause yet.

We’re on Axon 3.0.7, using an EmbeddedEventStore with the JPAEventStorageEngine and experiencing the problem both with our Percona and Aurora DBs, with transaction isolation READ_COMMITED and REPEATABLE_READ respectively.

/Sebastian

Yes, we found the solution and fixed it in the 3.1 branch. Can’t really look up the issue number right now, but updating to the latest version will resolve the problem.

Cheers,

Allard

Hi Allard,

On our production system we have recently encountered strange errors with our Saga which is using a tracking processor. The actual cause remains unclear but based on the stack traces we had the impression certain events where being handled twice…

Can you confirm the Axon issue number which fixed this bug?
Secondly, is there any chance this bug can be back ported to the axon-3.0.x branch?

Thanks,
Benoît

Hi Benoît,

The issue Allard’s talking about is in 3.1.3, this one to be exact.

We’ll be releasing a 3.1.3 very shortly, as we’ve encountered a couple of other minor issues which need quick fixing.

So, that’s for confirming it’s fixed and that you can use it soon enough.

For porting it back to 3.0.x: isn’t there an option for you to update to a newer version of Axon?

Especially 3.1 has some nice added features, like multithreadedness for your TrackingEventProcessors and the QueryBus.

Otherwise waiting for 3.2 is also an option (which will obviously contain the fixes from 3.1.3 and below).
We’re planning to release that end of month.

Cheers,

Steven