Axon OTel Tracing - spans of event processors are created too late?

I’m using Axon 4.7.4 with Spring Boot 3.0.6.

I’ve noticed that JDBC inserts and updates are not reported as part of traces generated for event processors. I think this is because Axon creates its spans via handler interceptors, which is too late to pick up other things happening in the unit of work.

For example, the constructor of PooledStreamingEventProcessor registers a handler interceptor responsible for creating a linked handler span. However, as hibernate postpones inserts and updates until the transaction commits, those are not included in the processor span. If we, instead, create a span near the creation of UoW, inserts and updates should be included.

I’ve experimented with the idea and modified the PooledStreamingEventProcessor in two ways. First, I’ve changed the registration of the handler interceptor in the constructor in the following way:

    registerHandlerInterceptor((unitOfWork, interceptorChain) -> spanFactory
        .createInternalSpan(
            () -> "PooledStreamingEventProcessor[" + builder.name() + "] ",
            unitOfWork.getMessage())
        .runCallable(interceptorChain::proceed));

Second, to limit the change to a PooledStreamingEventProcessor only, I added the processInUnitOfWork() override in a pooled processor like this:

  @Override
  protected void processInUnitOfWork(
      List<? extends EventMessage<?>> eventMessages,
      UnitOfWork<? extends EventMessage<?>> unitOfWork,
      Collection<Segment> processingSegments) throws Exception
  {
    Span unitOfWorkSpan = spanFactory.createLinkedHandlerSpan(() -> "processInUnitOfWork ", unitOfWork.getMessage());
    unitOfWorkSpan.start();
    try (SpanScope unused = unitOfWorkSpan.makeCurrent()) {
      super.processInUnitOfWork(eventMessages, unitOfWork, processingSegments);
    }
    finally {
      unitOfWorkSpan.end();
    }
  }

After those changes, my traces now include SQL updates and inserts, as seen in the following pictures. The first corresponds to the original Axon behavior, while the second is after my changes.

What do you think?

1 Like

I would highly appreciate any comment on this. From my perspective, seeing projectors’ performance and actual SQL is quite important. But, unfortunately, at the moment, it is impossible to see those in the appropriate context. I can see them only as standalone root traces, which is not helpful as it would be if appropriately contextualized.

Bump. I’d be interested in this topic, too. Any news here?
<3
Rupi

1 Like

Hello @dmurat , @Rupi,

Thanks for your interest and feedback on the OpenTelemetry integration! And of course, for your patience, it seems I missed these messages during my holidays in May.

I like your feedback and approach. There are a few things to improve before integrating it in the framework, but I’m sure we will. When implementing this, I was not aware that queries would fall outside of the Span. It makes sense, though. Live and learn, right?

In your implementation, you create handler span for the UnitOfWork based on its first message. I’d like to amend that to be a root trace for the batch. Called PooledStreamingEventProcessor[name].batch. Then, for each message a span will be created that is linked to the its original span of publishing (if any). This way the links stay the same.

I will start working on this soon. Let me know what you think!

What do you think?

1 Like

All sounds great to me. Tnx for looking into this.

I have decided to create a Github issue and start development into improvements. You can watch Improve Spanfactory configurability · Issue #2780 · AxonFramework/AxonFramework (github.com) fort more updates.

1 Like