QueryBus problem: Received Rst Stream

Hi all,

We have a setup with a REST API that runs queries through the Axon QueryBus. The query allows setting the maximum number of results to return. We have noticed that when the query returns a large number of results (in our case the problem starts somewhere between 1000 and 1500 results), we get an exception from Axon and the REST controller returns no result at all (we get a 504 Gateway Timeout after 30 seconds due to our configuration); a simplified sketch of the dispatch is at the end of this post. The log doesn’t say much, except:

(shortly after the request is started):

2019-06-25 13:05:46.290 WARN 1 — [ult-executor-98] o.a.a.c.query.AxonServerQueryBus : Received error from server: CANCELLED: HTTP/2 error code: CANCEL
Received Rst Stream
2019-06-25 13:05:46.290 INFO 1 — [ult-executor-98] o.a.a.c.query.AxonServerQueryBus : Creating new subscriber

(a long while later):

2019-06-25 13:20:20.181 ERROR 1 — [io-8080-exec-15] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.CompletionException: org.axonframework.axonserver.connector.query.AxonServerQueryDispatchException: No result from query executor] with root cause

org.axonframework.axonserver.connector.query.AxonServerQueryDispatchException: No result from query executor
at org.axonframework.axonserver.connector.query.AxonServerQueryBus$1.onCompleted(AxonServerQueryBus.java:197) ~[axon-server-connector-4.0.jar!/:4.0]
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:418) ~[grpc-stub-1.13.1.jar!/:1.13.1]
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:684) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:403) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[grpc-core-1.13.1.jar!/:1.13.1]
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) ~[grpc-core-1.13.1.jar!/:1.13.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_212]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]

Any idea what could cause such an error or how we could get more detailed information about the problem?

On the Axon Server side, we only get error messages like the one below; after googling that message, it seems to be something that happens in Netty after the RST is sent:

2019-06-25 13:25:37.260 WARN 9 — [-worker-ELG-3-2] io.grpc.netty.NettyServerHandler : Stream Error

io.netty.handler.codec.http2.Http2Exception$StreamException: Received DATA frame for an unknown stream 35223
at io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:129) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.shouldIgnoreHeadersOrDataFrame(DefaultHttp2ConnectionDecoder.java:535) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onDataRead(DefaultHttp2ConnectionDecoder.java:187) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onDataRead(Http2InboundFrameLogger.java:48) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readDataFrame(DefaultHttp2FrameReader.java:421) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:251) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:160) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:41) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:118) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:390) ~[netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:450) [netty-codec-http2-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) [netty-codec-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) [netty-codec-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) [netty-codec-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) [netty-common-4.1.24.Final.jar!/:4.1.24.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.24.Final.jar!/:4.1.24.Final]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
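
For reference, the dispatch looks roughly like the sketch below (heavily simplified; the class and field names are made up, and the QueryGateway is used here for brevity):

import java.util.List;

import org.axonframework.messaging.responsetypes.ResponseTypes;
import org.axonframework.queryhandling.QueryGateway;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SearchController {

    private final QueryGateway queryGateway;

    public SearchController(QueryGateway queryGateway) {
        this.queryGateway = queryGateway;
    }

    // Dispatches the query and waits for the complete result list. The whole list is
    // serialized and comes back from Axon Server as a single gRPC response message.
    @GetMapping("/search")
    public List<SearchResult> search(@RequestParam(defaultValue = "1000") int maxResults) {
        return queryGateway.query(new SearchQuery(maxResults),
                                  ResponseTypes.multipleInstancesOf(SearchResult.class))
                           .join();
    }
}

// Placeholder query and result types (made up for this sketch).
class SearchQuery {
    private final int maxResults;

    SearchQuery(int maxResults) {
        this.maxResults = maxResults;
    }

    public int getMaxResults() {
        return maxResults;
    }
}

class SearchResult {
    // ... result fields omitted
}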

Thanks
Lars

Hi Lars,
Based on the information in the logs, AxonServer is closing the query before it has sent a response to your application. This can happen when a query takes too long (the default timeout is 5 minutes). In that case you should see a message in the AxonServer logs like “Found XXX waiting queries to delete” (at around the same time as the client log message).

Is the query handler in the same application? If it is in another application, does that application show any log messages?

Regards,
Marc

Hi Marc,

Thanks for the fast reply! The query handler is in the same application, and I see no further log messages. The funny thing is that the first error message (CANCELLED: HTTP/2 error code: CANCEL) shows up only a few seconds after the query is started, so the original cause shouldn’t be a timeout. My guess is that the CANCEL error is somehow swallowed and nothing further happens, so after 5 minutes the timeout kicks in. I can confirm that there are such messages in the AxonServer log:

2019-06-25 13:30:20.177 WARN 9 — [MessageBroker-1] i.a.axonserver.message.query.QueryCache : Found 2 waiting queries to delete
2019-06-25 13:35:20.178 WARN 9 — [MessageBroker-1] i.a.axonserver.message.query.QueryCache : Found 1 waiting queries to delete
2019-06-25 13:40:20.178 WARN 9 — [MessageBroker-1] i.a.axonserver.message.query.QueryCache : Found 1 waiting queries to delete

Is there perhaps a specific log category that I could set to DEBUG to find out more?

Regards,
Lars

Hi Lars,

The issue you are seeing is most likely that the query’s response message is larger than the default maximum message size for gRPC, which is 4194304 bytes (4 MiB). With 1000 to 1500 results in a single response, each serialized result only needs to average around 3 KB for the response to exceed that limit.

To increase the maximum message size, set the property “axoniq.axonserver.max-message-size” in the AxonServer configuration and “axon.axonserver.max-message-size” in the client configuration.
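
For example, assuming both applications are configured through Spring Boot style properties files (16777216 bytes = 16 MiB, just an example value):

# Axon Server (axonserver.properties)
axoniq.axonserver.max-message-size=16777216

# Client application (application.properties)
axon.axonserver.max-message-size=16777216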

I hope this resolves your issue.

Regards,
Marc

Hi Marc,

Thanks, I will definitely try this! May I suggest adding some logging to Axon Server and/or Axon Framework that points users to this problem when they run into it?

Regards
Lars

Hi Marc,

I have increased the two settings to 16MiB and the problem is fixed. Thanks again!

Incidentally, when the size is increased on Axon Server but not on the client, there is a helpful error message; it is only missing when the setting on Axon Server itself is too small.

Regards
Lars