TokenEntry creates orphaned large objects in Postgres

klauss42 · May 2, 2022, 9:57am

Hi
I am using Axon framework latest version (4.5.9) using Axon Server and Postgres. Spring Data/JPA is used to create the Axon tables tokenentry, sagaentry and associationvalueentry in the database. All projections are store in the database, the system runs fine so far.
Now I discovered that there are a lot of orphaned large objects in the Postgres database and it looks like these orphaned large objects are primarily caused by Axon’s tokenentry and sagaentry tables. Both tokenentry and sagaentry use @Lob annotation and this creates large objects in Postgres. I also learned that Postgres does not automatically remove referenced large object entries from Postgres’ internal pg_largeobject table (PostgreSQL: Documentation: 14: F.20. lo).

I can cleanup those orphaned large objects using the Postgres tool vacuumlo (PostgreSQL: Documentation: 14: vacuumlo) but this feels just like a workaround.

Question:
Are there any recommendations or experiences how to deal with this issue?

Thanks
Klaus

Yvonne_Ceelie · May 2, 2022, 10:02am

Hi Klaus,

I wrote a blog about the topic and you can find it here

Kind regards,
Yvonne

klauss42 · May 2, 2022, 3:20pm

Hi Yvonne
thanks for the hint to the blog article.
It’s a quite important topic for Postgres users, that I did not see in any Axon docs before. The problem is that you do not see the issue in the first place because the system just runs fine. You just start your application using Spring Boot/Data and the Axon entities are created using @Lob be default. This leads to a database with tons of garbage because each token update inserts a new row in the pg_large_object table. In my case the database had already ~50 million orphan rows in the pg_large_object table and AWS RDS runs into trouble when trying to upgrade Postgres to newer versions.

So I think you should put this info in a more prominent place in your docs:-)

Klaus

Yvonne_Ceelie · May 2, 2022, 3:57pm

Thanks for the recommendation Klaus, we are going to follow up on your suggestion.

KR,

Yvonne

vaelyr · June 16, 2022, 10:13am

I was aware that these large objects are left behind, but then I tried to use vacuumlo to cleanup. In LIVE database there are more than tens of million of orphaned objects, but in DEV database I had around 700k. So I ran the tool on my DEV database and unfortunately it corrupted the entire database by even removing references to real existing objects which caused unrecoverable data loss. So running this utility seems very dangerous. Vacuum ran more than 2 hours for such small data set and the application was running at the same time. Fortunately I was able to use point in time backups from Azure to recover to valid state.

Default setup for Axon on Postgres is not very scalable longterm when using lobs. Database size is exponentially growing each day because of the orphaned objects.

Currently testing solution to migrate to either bytea or jsonb based types.

vab2048 · June 25, 2022, 3:19pm

@vaelyr you can check out my work in progress repo which uses JSON for the JdbcEventStorageEngine and JdbcTokenStore (see the Axon config). Unfortunately it is not yet possible to configure the JdbcSagaStore to use JSON so easily but there is an open issue for it.