Event handling in batches

Hello,

How do you handle batches if your event handler writes to a non-relational database such as Elasticsearch?

Elasticsearch supports batching your calls, which gives much better performance. However, the event handler still processes the events one by one in code.

Is there a way to hook into the transaction manager to bundle the calls into a single call to Elasticsearch per batch?

Kind regards,
Koen

Hi Koen,

First of all, you’ll need to make sure that the batch size of the Event Processor handling these events is not the default value of 1. A pragmatic value is somewhere in the tens to hundreds, but probably not much more.

But as you said, Event Handlers are still invoked one-by-one, for each event individually. To make these methods “batch-aware”, you can use the UnitOfWork. Simply define it as a parameter on your @EventHandler annotated method to get it injected.

What you’ll need to do is register a resource on the UnitOfWork that captures all the batched operations, and make sure an action gets executed in the UnitOfWork's onPrepareCommit phase.

Here’s a snippet that shows how this can be done:

    @EventHandler
    public void on(MyEvent myEvent, UnitOfWork<?> unitOfWork) {
        List<Object> batch = getBatch(unitOfWork);
        batch.add(new SomethingToStoreInElastic());
    }

    private List<Object> getBatch(UnitOfWork<?> unitOfWork) {
        // There is no concurrency on a UnitOfWork. Therefore getOrComputeResource
        // can be used to add logic that must be executed only once.
        return unitOfWork.getOrComputeResource("someUniqueKey/" + this.toString(), k -> {
            List<Object> listOfBatchedOperations = new ArrayList<>();
            // If we're in this lambda, the list is being created for the first time,
            // so we register a hook to commit our batch.
            unitOfWork.onPrepareCommit(u -> {
                // TODO persist the batched operations
            });
            return listOfBatchedOperations;
        });
    }

What the exact contents of the List look like, or whether a List is the best option, well, that depends :wink:. You might want to use a data structure that makes it easier to recognize multiple updates to the same document, so that they can be merged into a single update.
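To make that last point a bit more concrete, here is a minimal sketch of such a structure in plain Java. It is only an illustration, not part of Axon or the Elasticsearch client: the class name `DocumentUpdateBatch` and its methods are made up for this example, and it assumes that each handler produces a partial update (field name to value) keyed by a document id, with later values overriding earlier ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical per-batch buffer (not an Axon or Elasticsearch type).
 * Collapses several partial updates for the same document into one.
 */
class DocumentUpdateBatch {

    // LinkedHashMap keeps documents in the order in which they were first touched.
    private final Map<String, Map<String, Object>> updatesById = new LinkedHashMap<>();

    /** Merges the given fields into the pending update for documentId; later values win. */
    void addUpdate(String documentId, Map<String, Object> fields) {
        updatesById.computeIfAbsent(documentId, id -> new LinkedHashMap<>()).putAll(fields);
    }

    /** Number of distinct documents, i.e. how many operations the bulk call would contain. */
    int size() {
        return updatesById.size();
    }

    /** Returns a copy of the merged updates, one entry per document id. */
    Map<String, Map<String, Object>> mergedUpdates() {
        Map<String, Map<String, Object>> copy = new LinkedHashMap<>();
        updatesById.forEach((id, fields) -> copy.put(id, new LinkedHashMap<>(fields)));
        return copy;
    }
}
```

In the snippet above, getBatch could then return a `DocumentUpdateBatch` instead of a `List<Object>`, and the onPrepareCommit hook would turn `mergedUpdates()` into a single bulk request. Whether "last value wins" is the right merge rule depends on your documents, so treat that as an assumption to verify.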

Hope this helps.

Great, this is exactly what I need!