Raft behind warning

I am encountering this warning:
Node is behind, 118026 events have been committed but not applied, in group: default

I find this warning a bit surprising, as I have a single-node setup. Is there some CLI command that could remove that warning?

Hi,
AxonServer uses Raft consensus to guarantee consistency in a cluster.
To easily allow adding nodes to an existing (even single-node) cluster, it initializes with replication in all cases.
This means that even single-node installations run Raft.

Given that Discuss is for community support, we are severely limited in the kind of support we can provide in this case.

That said, it might help to briefly go over how the event write process works, as that explains what the warning message means exactly: AS writes an event to its replication log, and once a majority of nodes has done so, it advances the commit index, which marks that entry as ready to be applied. In a single-node cluster, the majority is always that single node. Once an entry is marked as committed, the apply process applies it to the state machine (in our case, the actual event store), normally within a few milliseconds.
So if your AS has ~120K events committed but not applied, that means they are in the replication logs but not yet in the event store.
This has two effects:

  • AS warns that apply is behind; it does so after a very short delay, to make the operator aware that something is off.
  • These events are not yet readable, as reads go through the event store and not the replication logs (an eventual-consistency guarantee).
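The commit-versus-apply gap described above can be illustrated with a minimal sketch. This is not AxonServer code; the class and method names are made up for illustration. It only models the two indices the warning compares: entries become committed once a majority of the cluster has written them, and applied once the state machine (the event store) has processed them.

```python
class RaftNode:
    """Toy model of the commit/apply bookkeeping in a Raft node."""

    def __init__(self, cluster_size=1):
        self.cluster_size = cluster_size
        self.log = []          # replication log
        self.commit_index = 0  # entries durable on a majority of nodes
        self.apply_index = 0   # entries applied to the event store

    def append(self, entry, acks=1):
        """Append an entry; commit it once a majority has acknowledged it."""
        self.log.append(entry)
        majority = self.cluster_size // 2 + 1
        if acks >= majority:   # single-node cluster: majority == 1
            self.commit_index = len(self.log)

    def apply_pending(self):
        """Apply committed-but-unapplied entries to the state machine."""
        applied = self.log[self.apply_index:self.commit_index]
        self.apply_index = self.commit_index
        return applied

    def behind(self):
        """The number the warning reports: committed but not applied."""
        return self.commit_index - self.apply_index


node = RaftNode(cluster_size=1)
for i in range(5):
    node.append(f"event-{i}")
print(node.behind())   # 5: committed but not yet applied -> warning would fire
node.apply_pending()
print(node.behind())   # 0: healthy again
```

In a healthy node the apply step runs continuously, so `behind()` stays near zero; the warning in this thread means that gap has grown to ~120K and stayed there.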

In a production system, we would check what is wrong with that node (e.g. is it running out of disk space?) and either resolve the issue or take the node out of the cluster and let it rejoin, in most cases without downtime for the connected applications. As this is not a multi-node environment, your best bet is to restore AS from a backup to a healthy state.
Mind you, the ~120k events are in the replication logs, not in the event store. So if they have not been backed up, they will probably be lost when the backup is restored.

Kind regards,
Marco

Hi,
In a meeting with the AxonServer team, we briefly discussed this topic.
There is a scenario that might be important to check here:
AS 2025.2.1 and 2024.2.16 include a fix for a bug in a third-party dependency that might trigger comparable behavior.

Therefore, a few questions:

  • Are you running a version prior to 2025.2.1 / 2024.2.16, or have you upgraded from one of those very recently?
  • Have these 120k events been readable at any point?

The reason I’m asking is that the affected versions could, in rare cases, reset the Raft state to a previous point in time due to a bug in a third-party library. In that case, the node would think it is 120K messages behind, while in reality the events have already been correctly applied to the event store. The only thing left to do in a single-node cluster in a non-production environment would then be to stop the node, delete everything except the event store files, and recreate the single-node cluster, rediscovering the existing data.
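The recovery procedure above (stop the node, delete everything except the event store files, restart) can be sketched as follows. All paths and file names here are illustrative placeholders based on the directory layout in the docker-compose snippet later in this thread, not AxonServer's actual file names; check your own installation before deleting anything.

```shell
# Mock layout so this sketch is runnable end to end; in a real setup these
# directories would be your mounted data/logs/events volumes.
mkdir -p demo/axon_data demo/axon_logs demo/axon_events
touch demo/axon_data/cluster-state.db      # placeholder control data
touch demo/axon_logs/replication.log       # placeholder replication log
touch demo/axon_events/segment-0.events    # placeholder event store segment

# 1. Stop the node first (e.g. `docker compose stop axonserver`).
# 2. Remove the replication/control state but KEEP the event store files.
rm -rf demo/axon_data/* demo/axon_logs/*

# 3. Restart the node; it recreates the single-node cluster and
#    rediscovers the data in the untouched event store directory.
ls demo/axon_events
```

The key invariant is that the events directory is never touched; only the Raft/control state is rebuilt from scratch.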

Hi Marco,

I am running on 2025.2.0 and I upgraded from a 2024.x.x a few weeks ago.
The 120k events are all readable.

I took the latest backup I had and configured an AS locally with the following config:

  axonserver:
    restart: unless-stopped
    image: "axoniq/axonserver:2025.2.5"
    hostname: axonserver
    environment:
      - axoniq_console_authentication=redacted
    ports:
      - "8024:8024"
      - "8124:8124"
    volumes:
      - ./docker-data/axon/axon_data:/axonserver/data
      - ./docker-data/axon/axon_logs:/axonserver/logs
      - ./docker-data/axon/axon_events:/axonserver/events

I copied the files into axon_{data,logs,events} and upgraded the server to 2025.2.5.

On start, I got the same "Raft is behind" issue.

I stopped the server, deleted everything in ./docker-data/axon/axon_data then started AS again.

Initially I got a "no message from leader" error, then the node went back to green, and the issue seems to be gone. No events were lost, as the event count matches the expected count from the backup.

Thanks for your help!
