Handling concurrent commands/transactions

Anna_Bildyug1 · December 5, 2018, 12:57pm

Hi,

We are faced with an issue related to concurrent requests/transactions that leads to a problem with consistency in our view model. Data in the aggregate is fine and we have all events saved, but Event Handlers don’t work like we’d expect.

Our case:

We send two commands of the same type to the same Aggregate
Axon handles them sequentially, so we have remains consistent within the Aggregate
First Event get’s processed by the Event Handler and data was committed
Second event gets processed, but the Event Handler reads data without changes from step 3. Which means changes from step 3 get overwritten.
Normally this flow is fine, but in our case we have a non-idempotent operation. The order of the events doesn’t really matter, but we shouldn’t overwrite data.

We found that flow looks like:

Send Command
Start Hibernate transaction
Start processing of the Command
Apply the Event
Handle the Event in the EventSourcingHandler
End processing for the Command
Start handling the Event in the Event Handler
End handling the Event in the Event Handler
Commit the Hibernate transaction

For two threads flow looks like

Thread1: 1-2-3-4-5-6----7-8-9----------------
Thread2: ---------------1-2------3-4-5-6-7-8-9

Looks like we have this problem, because of the length of the transaction and second transaction started before first was committed, it means second transaction doesn’t use the updated data. Could you please give us some comments about the behaviour? Currently we are not sure how to solve this issue properly.

I also attached some logs. Maybe they will be helpful.

Our setup:
Axon version: 3.4.1
Spring Boot: 2.0.3.RELEASE
DB: MySQL with isolation level READ_COMMITTED

Thanks in advance,
Anna

concurrency_logs.txt (158 KB)

Steven_van_Beelen · December 13, 2018, 10:44am

Hello Anna,

I’d like to ask a couple of questions to get a better grip of your set up:

Are you using Subscribing or Tracking Event Processors?
Are you using the Axon Spring Boot Auto Configuration?
Are you instantiating a ‘org.axonframework.common.transaction.TransactionManager’?
If yes on bullet 3, are you registering your TransactionManager to the CommandBus by any chance?
If yes on bullet 3, are you registering your TransactionManager to the EventStorageEngine by any chance?

So, my hunch is it lies somewhere with the TransactionManager, but I am obviously not sure yet.

The Axon auto configuration will create a SpringTransactionManager bean if a org.springframework.transaction.PlatformTransactionManager.

This will then in turn be used on the CommandBus and EventStorageEngine automatically.

I hope we get to a solution to your problem quick Anna!

Cheers,
Steven

Anna_Bildyug1 · December 13, 2018, 1:19pm

Hi Steven,

Thanks for your reply.

Are you using Subscribing or Tracking Event Processors?
We are using Subscribing Event Processors
Are you using the Axon Spring Boot Auto Configuration?

Yes

Are you instantiating a ‘org.axonframework.common.transaction.TransactionManager’?
No, but we call method transactionManager.executeInTransaction in one place to execute some logic when application starts.

Could you confirm whether or not our scenario is expected behaviour or does Axon have something in place to prevent this from happening?

Thanks,
Anna

allardbz · December 13, 2018, 3:08pm

Hi Anna,

it’s actually not weird for the second thread to start on step 1, while step 3 is forced to wait for the first process to finish. This has to do with the Repository requiring a lock on the aggregate.
However, having the isolation level set to read_committed, should have any updates made by the first thread be visible to the second. It that’s not the case, it looks like you are using repeatable read (which is the default for MySQL) instead.

How/where did you configure read_committed isolation level?

Allard

Anna_Bildyug1 · December 13, 2018, 3:18pm

Hi Allard,

We set property to change isolation level for our db to read_committed

datasource:
hikari:
transaction-isolation: 2

Without the changes we wasn’t even able to handle commands, we had exception

Stack trace: org.axonframework.commandhandling.model.ConcurrencyException: An event for aggregate […] at sequence [57] was already inserted
at org.axonframework.eventsourcing.eventstore.AbstractEventStorageEngine.handlePersistenceException(AbstractEventStorageEngine.java:165)

Thanks,
Anna

allardbz · December 13, 2018, 3:51pm

In that case, I don’t understand why an event handler would not be able to see the changes, while the event store is.
The order in which things happen, clearly show that thread 1 committed everything well before thread 2 starts handling the event. The fact that the transaction started earlier shouldn’t matter, as you’re using read_committed.

What do you event handlers look like?

Allard

Anna_Bildyug1 · December 14, 2018, 8:16am

Hi Allard,

This is our Event Handler.

@EventHandler
public void on(FileAttached event, @Timestamp Instant timestamp) {
    ViewModel viewModel = viewModelRepository.findById(event.getId().getValue()).get();
    ItemViewModel itemToUpdate = viewModel.getChecklist().findChapter(event.getChapterType()).findItem(event.getItemType());
    List<FileViewModel> files = itemToUpdate.getFiles();

    files.add(new FileViewModel(event.getFileId().getValue(),
                                event.getFileName().getValue(),
                                VerificationStatus.AWAITING_VERIFICATION.name(),
                                OffsetDateTime.ofInstant(timestamp, ZoneOffset.UTC)));
    viewModelRepository.save(viewModel);
}

Important to mention:

viewModelRepository.findById - implementation of standard CRUD repository
viewModel.getChecklist() - checklist is stored as json

@NotNull
@Type(type = "json")
@Column(columnDefinition = "json")
private ChecklistViewModel checklist;

I agree that it’s weird that Transaction doesn’t see committed changes, but looks like it’s the case.

Anna

Anna_Bildyug1 · December 14, 2018, 8:40am

I added some logs to check what size of list we have before and after save. It shows that second transaction actually doesn’t see committed changes

2018-12-14 08:33:34.911 WARN 6674 — [nio-8080-exec-7] e.c.platform.query.kyc.KYCEventListener : http-nio-8080-exec-7 - EventHandler - START
2018-12-14 08:33:35.009 WARN 6674 — [nio-8080-exec-7] e.c.platform.query.kyc.KYCEventListener : http-nio-8080-exec-7 - EventHandler - FILES 0
2018-12-14 08:33:35.193 WARN 6674 — [nio-8080-exec-7] e.c.platform.query.kyc.KYCEventListener : http-nio-8080-exec-7 - EventHandler - FILES AFTER SAVE 1
2018-12-14 08:33:35.193 WARN 6674 — [nio-8080-exec-7] e.c.platform.query.kyc.KYCEventListener : http-nio-8080-exec-7 - EventHandler - END

2018-12-14 08:33:35.952 WARN 6674 — [nio-8080-exec-8] e.c.platform.query.kyc.KYCEventListener : http-nio-8080-exec-8 - EventHandler - START
2018-12-14 08:33:36.033 WARN 6674 — [nio-8080-exec-8] e.c.platform.query.kyc.KYCEventListener : http-nio-8080-exec-8 - EventHandler - FILES 0
2018-12-14 08:33:36.220 WARN 6674 — [nio-8080-exec-8] e.c.platform.query.kyc.KYCEventListener : http-nio-8080-exec-8 - EventHandler - FILES AFTER SAVE 1
2018-12-14 08:33:36.220 WARN 6674 — [nio-8080-exec-8] e.c.platform.query.kyc.KYCEventListener : http-nio-8080-exec-8 - EventHandler - END

To be clear AFTER SAVE I do get from repository again.

int sizeAfterUpdate = getViewModel(event.getKycFileId())
        .getChecklist().findChapter(event.getChapterType()).findItem(event.getItemType())
        .getFiles().size();
log.warn(Thread.currentThread().getName() + " - EventHandler - FILES AFTER SAVE" + sizeAfterUpdate);

Anna

allardbz · December 14, 2018, 9:06am

Hi Anna,

are the changes visible for handlers that started their transaction after the commit of the first handler?

A minor detail (don’t think that’s the issue): the transaction-isolation property is expected to be the name of the isolation level, not a numeric value.
From the docs: “Set the default transaction isolation level. The specified value is the constant name from the Connection class, eg. TRANSACTION_REPEATABLE_READ.”

You can also have Hikari print all its actual settings. I’d recommend doing that.

Allard

Anna_Bildyug1 · December 17, 2018, 10:37am

Hi Allard,

The same code works perfectly fine if we send request not concurrently.

Thread1: 1-2-3-4-5-6-7-8-9----------------------
Thread2: -----------------------1-2-3-4-5-6-7-8-9

I attached logs to demonstrate it.

Actually setting transaction-isolation by int or by String does not make any difference. At the end it converts the value to int anyway.
I printed Hikari settings:

2018-12-17 10:14:15.688 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : HikariPool-1 - configuration:
2018-12-17 10:14:15.690 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : allowPoolSuspension…false
2018-12-17 10:14:15.690 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : autoCommit…true
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : catalog…none
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : connectionInitSql…none
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : connectionTestQuery…none
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : connectionTimeout…30000
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : dataSource…none
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : dataSourceClassName…none
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : dataSourceJNDI…none
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : dataSourceProperties…{password=}
2018-12-17 10:14:15.691 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : driverClassName…“com.mysql.cj.jdbc.Driver”
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : healthCheckProperties…{}
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : healthCheckRegistry…none
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : idleTimeout…600000
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : initializationFailFast…true
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : initializationFailTimeout…1
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : isolateInternalQueries…false
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : jdbc4ConnectionTest…false
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : jdbcUrl…jdbc:mysql://localhost:3306/local-platform?useSSL=false&nullNamePatternMatchesAll=true
2018-12-17 10:14:15.692 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : leakDetectionThreshold…0
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : maxLifetime…1800000
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : maximumPoolSize…10
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : metricRegistry…none
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : metricsTrackerFactory…com.zaxxer.hikari.metrics.micrometer.MicrometerMetricsTrackerFactory@6d3f1fc5
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : minimumIdle…10
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : password…
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : poolName…“HikariPool-1”
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : readOnly…false
2018-12-17 10:14:15.693 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : registerMbeans…false
2018-12-17 10:14:15.694 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : scheduledExecutor…none
2018-12-17 10:14:15.694 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : scheduledExecutorService…internal
2018-12-17 10:14:15.694 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : schema…none
2018-12-17 10:14:15.694 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : threadFactory…internal
2018-12-17 10:14:15.694 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : transactionIsolation…“TRANSACTION_READ_COMMITTED”
2018-12-17 10:14:15.694 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : username…“root”
2018-12-17 10:14:15.694 DEBUG 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariConfig : validationTimeout…5000
2018-12-17 10:14:15.694 INFO 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting…
2018-12-17 10:14:15.842 INFO 10867 — [ost-startStop-1] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.

Anna

non-parallel-calls.txt (172 KB)

allardbz · December 18, 2018, 11:26am

Hi Anna,

there is some information about MySQL’s behavior in case of READ_COMMITTED, which explains that the implementation slightly different of what you might expect:
https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html

Apprarently, it is not guaranteed that READ_COMMITTED will return the last committed state, unless you use explicit locks.

In the past, we used to wrap the connection pool in a LazyConnectionDataSourceProxy, to prevent connections from being opened until a first load was executed (as opposed to when a transaction starts).

Hope this helps.

Allard