Eventstore is growing...

After some calculations we are counting on an average of 2 million domain events per day. With an XStreamEventSerializer we average about 1 KB per event (we have been looking into Hessian serializers, but those will only save us a few bytes).
At that rate the event store will grow by roughly 2 GB per day.
My question is: how do people tackle these kinds of problems? Compression on the database is one alternative.
Any input on this would be appreciated.

Hi Alan,

This is an interesting case, one that all successful implementations will eventually encounter. Personally, I haven’t had this problem yet. Databases generally do not have a problem storing several TB of data.
In one project, though, we had to deal with large snapshot events. What we did there was create a wrapper around the XStreamEventSerializer that was capable of reading GZipped streams. It’s quite easy to do. Just read the first 2 bytes and check whether they are the GZip header. If so, unzip and deserialize. If not, deserialize normally.
Given that XStream builds text-based streams, GZip makes your stream about 10 times smaller.
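
A minimal sketch of that header check, assuming the serialized event is available as a byte array. GzipAwareCodec and its methods are hypothetical names for illustration, not part of Axon or XStream; the bytes returned by unwrap would be handed to the wrapped serializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper (not an Axon or XStream class).
final class GzipAwareCodec {

    // Compress already-serialized event bytes before storing them.
    static byte[] compress(byte[] plain) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // True when the data starts with the two GZip magic bytes 0x1f 0x8b.
    static boolean isGzipped(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    // Returns plain serialized bytes: unzips if the header matches,
    // otherwise returns the input unchanged.
    static byte[] unwrap(byte[] stored) {
        if (!isGzipped(stored)) {
            return stored;
        }
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(stored))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                plain.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return plain.toByteArray();
    }
}
```

Because XML output starts with a printable character such as '<', it cannot be mistaken for the GZip header, so old uncompressed events and new compressed ones can live side by side.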

Another option is to apply sharding in your database. If you shard based on the aggregateIdentifier field, you can run each query on a single server.
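
As a rough illustration of routing on the aggregateIdentifier, here is a sketch that assumes string identifiers and a fixed number of shards. ShardRouter is a hypothetical class, not something Axon provides, and a real setup would also need a plan for changing the shard count later.

```java
// Hypothetical router (not an Axon class): maps an aggregate identifier
// to one of a fixed number of database shards.
final class ShardRouter {

    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // All events of one aggregate land on the same shard, so loading an
    // aggregate only ever touches a single server.
    int shardFor(String aggregateIdentifier) {
        // floorMod keeps the result in [0, shardCount) for negative hash codes.
        return Math.floorMod(aggregateIdentifier.hashCode(), shardCount);
    }
}
```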

A third option is to archive old events as a single gzipped stream. You can remove them from the database and store them on a different server (or on tape :wink: ). Accessing that data does become harder, so only do this if you don’t expect to have to read it regularly.
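
One way such an archive could look, as a sketch only: each already-serialized event is written with a length prefix into one GZip stream. EventArchive and the length-prefixed layout are assumptions made for this example, not an existing Axon format.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical archive format: event count, then (length, bytes) per event,
// all inside a single GZip stream.
final class EventArchive {

    // Writes the serialized events to the target and closes it.
    static void write(List<byte[]> serializedEvents, OutputStream target) {
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(target))) {
            out.writeInt(serializedEvents.size());
            for (byte[] event : serializedEvents) {
                out.writeInt(event.length);
                out.write(event);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads back every event in the order it was archived.
    static List<byte[]> read(InputStream source) {
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(source))) {
            int count = in.readInt();
            List<byte[]> events = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                byte[] event = new byte[in.readInt()];
                in.readFully(event);
                events.add(event);
            }
            return events;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only after the archive has been written and verified would the corresponding rows be deleted from the database.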

And finally, there is the option to change the serialization mechanism altogether. XStream allows you to specify a different writer; the BinaryWriter is a bit more compact. Custom (DataInput/OutputStream based) serializers are much more compact and perform better, but they are an extra burden on the developers.
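
To give an idea of what a DataInput/OutputStream based serializer involves, here is a sketch for a made-up event. OrderPlaced, its fields, and OrderPlacedCodec are purely illustrative; each real event type would need its own hand-written pair of methods, which is where the extra burden comes from.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Made-up event type, used only for this example.
final class OrderPlaced {
    final String orderId;
    final long amountInCents;

    OrderPlaced(String orderId, long amountInCents) {
        this.orderId = orderId;
        this.amountInCents = amountInCents;
    }
}

// Hand-written binary codec: field order is the only schema.
final class OrderPlacedCodec {

    static byte[] serialize(OrderPlaced event) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(event.orderId);        // 2-byte length + encoded chars
            out.writeLong(event.amountInCents); // always 8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static OrderPlaced deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            // Fields must be read in exactly the order they were written.
            String orderId = in.readUTF();
            long amountInCents = in.readLong();
            return new OrderPlaced(orderId, amountInCents);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout an event whose orderId is "order-42" takes 18 bytes, since no field names or tags are stored. The price is that any change to the event's fields requires a matching, version-aware change to the codec.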

Your question did set me thinking about a solution for this in Axon. An EventArchiver mechanism of some sort that would make this a bit easier to configure.

Hope this helps.

Cheers,

Allard