AxonServer read_aggregate_events sometimes takes seconds to complete

Hi :wave:

I’m just looking into AxonServer and Synapse for use with Rust, and I’ve run a single-node cluster with Synapse using the docker-compose file shown here.

After getting AxonServer running and going through the initial configuration in the web dashboard, I tried querying events for a non-existent aggregate using:

read_aggregate_events(&config, "default", "123").await?;

But strangely, it often takes 1-5 seconds to even return the empty list of events.

To debug, I put the call in a loop and timed each operation. I see an average of around 50 ms, but every tenth request or so takes multiple seconds to complete.

Am I missing something here? Is this because I’m running it in Docker, or something else? By comparison, something like Postgres gives me much faster queries even in Docker. The container has plenty of memory and CPU.

Thanks!

Hi Ari,

Thanks for reporting; that’s certainly not expected!

I would like to reproduce the behavior. To make sure I understand your scenario correctly: you have a nearly empty event store in AxonServer and query events for a non-existent aggregate via Synapse, and in some cases this takes several seconds to complete?

Regards,
Marco

Hi Marco, yep, that’s exactly correct. The AxonServer instance actually contains no messages whatsoever and is a fresh install in Docker.

For more info on reproducing: I’m using this docker-compose file with the AxonServer image axoniq/axonserver:2024.0.1 and Synapse version axonsynapse-0.10.0.jar.

The Rust code I have written is:

use std::time::{Duration, Instant};

use synapse_client::apis::{aggregate_api::read_aggregate_events, configuration::Configuration};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Default configuration points the client at Synapse on localhost.
    let configuration = Configuration::new();
    loop {
        // Time a single lookup of a non-existent aggregate and print the latency in ms.
        let start = Instant::now();
        let _events = read_aggregate_events(&configuration, "default", "123").await?;
        println!("{}", start.elapsed().as_millis());
        tokio::time::sleep(Duration::from_millis(500)).await;
    }
}

With Cargo.toml:

[dependencies]
anyhow = "1.0.82"
synapse-client = { path = "../AxonIQ/axon-rust/synapse-client" } # cloned repo
tokio = { version = "1.37.0", features = ["full"] }

I run the project in release mode and get an output like:

1940
114
46
37
31
30
...

With the occasional 1000 ms+ spike.

Thanks for sharing the details!
When running the provided code with Synapse and AxonServer in Docker, I get the following output:

    Finished `dev` profile [unoptimized + debuginfo] target(s) in 8.60s
     Running `target/debug/foo`
76
21
7
13
7
7
6
7
7
6
7
6
7
...

After a few minutes I checked and saw no request times larger than the 7 (or sometimes 8) ms observed after JVM warm-up.

To run the containers, I used the following commands:

podman run --rm -p 8024:8024 -p 8124:8124 --name as01 --net axon --network-alias as01 axoniq/axonserver:2024.0.1

and

podman run --rm -p 8080:8080 --net axon -e synapse.serverList=as01:8124 axoniq/synapse
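
Both containers attach to a user-defined axon network; if it doesn’t exist yet, you may need to create it first with:

podman network create axon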

Could you please check using curl if you can reproduce the issue?

curl -sv 127.0.0.1:8080/v1/contexts/default/aggregate/3e011144-79f7-4a56-ae7a-17efb78eb829/events
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
...
>
* Request completely sent off
< HTTP/1.1 200 OK
...
<
* Connection #0 to host 127.0.0.1 left intact
{"items":[]}%                    

That we return an empty list with a 200 status is somewhat special; that might change in a future version.

Do you observe the same multi-second delays using curl?

Interesting that you’re able to get such a low time of 7 ms. The lowest I’ve seen is around 30.

Could you please check using curl if you can reproduce the issue?

I’ve tested with curl prefixed with the time command, mostly getting around 34 ms, but I did get one that took a full second:

❯ time curl -sv 127.0.0.1:8080/v1/contexts/default/aggregate/3e011144-79f7-4a56-ae7a-17efb78eb829/events
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> GET /v1/contexts/default/aggregate/3e011144-79f7-4a56-ae7a-17efb78eb829/events HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Vary: Origin
< Vary: Access-Control-Request-Method
< Vary: Access-Control-Request-Headers
< Content-Type: application/json
< Content-Length: 12
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Pragma: no-cache
< Expires: 0
< X-Content-Type-Options: nosniff
< X-Frame-Options: DENY
< X-XSS-Protection: 1 ; mode=block
< Referrer-Policy: no-referrer
<
* Connection #0 to host 127.0.0.1 left intact
{"items":[]}
________________________________________________________
Executed in    1.00 secs      fish           external
   usr time    5.96 millis    0.21 millis    5.76 millis
   sys time    8.85 millis    1.11 millis    7.74 millis

I’ve just tried doing the same thing with Postgres and get the following output:

2
2
1
1
2
2
1
1
...

It’s a little unfortunate that Postgres gives such a small response time compared to AxonServer. That’s my biggest reason for not choosing AxonServer at this point, despite all the awesome features it seems to have.

I just tried again with the same curl command you used and got results similar to what I observed using the Rust client.

Do you use Docker on Windows or Mac by any chance?

Regarding the results with Postgres: I agree, these requests are way faster than what you observed with Synapse.
Keep in mind, though, that Synapse is an HTTP service in front of AxonServer itself. I am unsure whether the generated Rust Synapse client reuses reqwest’s HTTP connection pool, so you may be re-establishing the TCP connection for each and every request. If you require more performance, you can cut out Synapse completely and use the gRPC API of AxonServer directly.
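
By the way, one way to rule out connection setup would be to time the raw REST call with a single long-lived reqwest::Client, which keeps an internal connection pool. A minimal sketch (the URL and timing loop are mine, assuming Synapse on localhost:8080 and reqwest plus tokio as dependencies):

use std::time::{Duration, Instant};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // A single long-lived Client: connections are pooled, so only the
    // first request pays the TCP connection-setup cost.
    let client = reqwest::Client::new();
    let url = "http://127.0.0.1:8080/v1/contexts/default/aggregate/123/events";
    loop {
        let start = Instant::now();
        let body = client.get(url).send().await?.text().await?;
        println!("{} ms ({})", start.elapsed().as_millis(), body.trim());
        tokio::time::sleep(Duration::from_millis(500)).await;
    }
}

If the spikes disappear with this, connection handling in the client path is the likely culprit; if they remain, the delay is on the server side.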

While going to gRPC directly will surely reduce the per-request latency, I doubt AxonServer will be quite as fast as Postgres for a simple lookup.

Depending on your use case (I am curious, to be honest), it might be worth noting that the per-request latency should stay somewhat constant for parallel requests with both Postgres and AxonServer. So if your use case allows for parallel requests, the individual latencies will be less noticeable.
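
For example, something along these lines (an illustrative sketch reusing the client setup from your snippet; the ten aggregate IDs are made up) should complete in roughly the time of one request rather than ten:

use std::sync::Arc;
use std::time::Instant;

use synapse_client::apis::{aggregate_api::read_aggregate_events, configuration::Configuration};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let configuration = Arc::new(Configuration::new());
    let start = Instant::now();
    // Fire ten lookups concurrently instead of sequentially.
    let handles: Vec<_> = (0..10)
        .map(|i| {
            let configuration = Arc::clone(&configuration);
            tokio::spawn(async move {
                read_aggregate_events(&configuration, "default", &i.to_string()).await
            })
        })
        .collect();
    // Wait for all of them; total wall time stays close to a single request's latency.
    for handle in handles {
        handle.await??;
    }
    println!("10 parallel lookups took {} ms in total", start.elapsed().as_millis());
    Ok(())
}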

Kind regards,
Marco

In addition to directly using the gRPC API, you could opt to leverage dendrite.