Avro serialization - working example

Hello fellow axon users,

for the last month I have been working on my idea of a configurable avro serializer for the axon framework. I hit some setbacks and am far from “feature complete”, but I reached a point where I can share a working example, and I’d love to hear your feedback to decide whether this is something to explore further and turn into an axon framework extension.

Apache Avro is a schema-based de/serialization framework available for several programming languages. Schemas are defined in json and avro data can also be encoded as json, so an avro message can be treated as a json message. The benefit of avro, and this is what in my opinion makes it a great candidate for axon event store serialization, is that it supports single object encoding: a pure binary representation of a message that contains a reference to the schema (a fingerprint) and the bytes of the message payload, with a very low memory footprint.
With regard to event processing/upcasting, we can evaluate the schema compatibility of the writer (publisher) and the reader (processor) of the event. The overall theory might take too long to discuss up front, but I’d love to go into detail if you are interested.
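
To make the single object encoding a bit more concrete: per the Avro specification, such a message starts with the two marker bytes C3 01, followed by the 8-byte little-endian CRC-64-AVRO fingerprint of the writer schema, followed by the binary-encoded payload. Below is a minimal sketch using only the JDK. The class and method names are made up for illustration and are not taken from the extension.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Illustrative helper (hypothetical name): splits a single-object-encoded
// message into schema fingerprint and payload bytes.
final class SingleObjectHeader {
    // Two-byte marker defined by the Avro specification.
    static final byte MAGIC_0 = (byte) 0xC3;
    static final byte MAGIC_1 = (byte) 0x01;
    // 2 marker bytes + 8 fingerprint bytes.
    static final int HEADER_LENGTH = 10;

    static boolean isSingleObjectEncoded(byte[] bytes) {
        return bytes.length >= HEADER_LENGTH && bytes[0] == MAGIC_0 && bytes[1] == MAGIC_1;
    }

    // The fingerprint is stored little-endian directly after the marker.
    static long fingerprint(byte[] bytes) {
        requireHeader(bytes);
        return ByteBuffer.wrap(bytes, 2, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // Everything after the header is the Avro binary payload.
    static byte[] payload(byte[] bytes) {
        requireHeader(bytes);
        return Arrays.copyOfRange(bytes, HEADER_LENGTH, bytes.length);
    }

    // Demo-only helper: prepends a header to an already encoded payload.
    static byte[] encode(long fingerprint, byte[] payload) {
        return ByteBuffer.allocate(HEADER_LENGTH + payload.length)
                .order(ByteOrder.LITTLE_ENDIAN)
                .put(MAGIC_0).put(MAGIC_1)
                .putLong(fingerprint)
                .put(payload)
                .array();
    }

    private static void requireHeader(byte[] bytes) {
        if (!isSingleObjectEncoded(bytes)) {
            throw new IllegalArgumentException("not a single-object-encoded message");
        }
    }

    public static void main(String[] args) {
        byte[] message = encode(42L, new byte[] {1, 2, 3});
        System.out.println("fingerprint = " + fingerprint(message));
        System.out.println("payload length = " + payload(message).length);
    }
}
```

The header costs only ten bytes per message, which is where the small footprint comes from: the schema itself is never shipped with the event.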

So let’s move to the working example. I created a repository, holixon/axon-avro-serializer on GitHub (“Provide support for axon event serialization via apache avro”), that implements the Serializer interface. It provides an example using a “classic” bank account application running on spring boot and connecting to the axon server and the apicurio open source schema registry.

The idea is that I define my event schemas up front, register them on apicurio and generate classes using the avro maven plugin (Gradle also available). Then I create an axon application with a configured serializer that leverages the schema registry for deserialization of events: it reads the schema ID from the single object encoded bytes and uses the schema information to find the generated class for that schema.
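
Roughly, the lookup step described above could look like the following sketch. It is not the code from the repository: `EventTypeResolver`, the map-based lookup and the class name `example.bank.MoneyDeposited` are assumptions for illustration, and a real setup would ask the schema registry instead of a local map.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical resolver: schema ID from the message header -> generated class name.
final class EventTypeResolver {
    private final Map<Long, String> classNameBySchemaId;

    EventTypeResolver(Map<Long, String> classNameBySchemaId) {
        this.classNameBySchemaId = new HashMap<>(classNameBySchemaId);
    }

    // Reads the 8-byte little-endian schema fingerprint that follows the C3 01 marker.
    static long readSchemaId(byte[] bytes) {
        if (bytes.length < 10 || bytes[0] != (byte) 0xC3 || bytes[1] != (byte) 0x01) {
            throw new IllegalArgumentException("not a single-object-encoded message");
        }
        return ByteBuffer.wrap(bytes, 2, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // Empty result means: no generated class is known for the writer schema.
    Optional<String> resolveClassName(byte[] bytes) {
        return Optional.ofNullable(classNameBySchemaId.get(readSchemaId(bytes)));
    }

    public static void main(String[] args) {
        Map<Long, String> known = new HashMap<>();
        known.put(7L, "example.bank.MoneyDeposited"); // hypothetical generated class
        EventTypeResolver resolver = new EventTypeResolver(known);

        // Marker, schema ID 7 (little-endian), then one payload byte.
        byte[] message = {(byte) 0xC3, 1, 7, 0, 0, 0, 0, 0, 0, 0, 42};
        System.out.println(resolver.resolveClassName(message).orElse("<unknown schema>"));
    }
}
```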

While working on this, I discovered that to support various registries (apicurio and confluent being the main targets) I need an abstraction layer for the java/avro stuff that is not really related to axon. So in addition to the avro serializer, I created the avro-adapter repository, which itself is inspired by the darwin library. The adapter lib provides a generic api for avro deserialization and a schema registry interface, so the axon serializer remains agnostic of the concrete implementations and tooling used.
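
To illustrate the kind of separation meant here, a small sketch: the serializer side only talks to an interface, and each registry (apicurio, confluent, or an in-memory one for tests) would sit behind it. The interface and class names are invented for this post and are not the actual avro-adapter api.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class RegistryAbstractionSketch {

    // Hypothetical registry abstraction: the only thing the serializer depends on.
    interface SchemaRegistry {
        Optional<String> findSchemaJson(long schemaId);
    }

    // Simple implementation for tests; a remote registry client would implement
    // the same interface.
    static final class InMemorySchemaRegistry implements SchemaRegistry {
        private final Map<Long, String> schemas = new HashMap<>();

        InMemorySchemaRegistry register(long schemaId, String schemaJson) {
            schemas.put(schemaId, schemaJson);
            return this;
        }

        @Override
        public Optional<String> findSchemaJson(long schemaId) {
            return Optional.ofNullable(schemas.get(schemaId));
        }
    }

    // Stand-in for the serializer side: knows nothing about the concrete registry.
    static final class WriterSchemaLookup {
        private final SchemaRegistry registry;

        WriterSchemaLookup(SchemaRegistry registry) {
            this.registry = registry;
        }

        String writerSchemaFor(long schemaId) {
            return registry.findSchemaJson(schemaId)
                    .orElseThrow(() -> new IllegalStateException("unknown schema id " + schemaId));
        }
    }

    public static void main(String[] args) {
        SchemaRegistry registry = new InMemorySchemaRegistry()
                .register(7L, "{\"type\":\"record\",\"name\":\"MoneyDeposited\",\"fields\":[]}");
        System.out.println(new WriterSchemaLookup(registry).writerSchemaFor(7L));
    }
}
```

Swapping the registry then only means passing a different `SchemaRegistry` implementation to the constructor.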

This code is not yet published to maven central, so please be aware that if you want to run the example application, you will have to clone the avro-adapter repository and install it to your .m2/repository before the axon-avro-serializer extension is able to use it.

If you (I am looking at @Steven_van_Beelen and @allardbz :slight_smile: ) are interested I would love to schedule a web session where I can give you a demo …

Why am I telling you all this? Well, ever since I started working with axon, there was the “xstream is default, jackson is better, but avro would probably be best” discussion when it came to event serialization, which kinda froze when Allard told me “we tried it and it didn’t work”. Well, I believe I am onto something here, but before I spend more effort, I really need community and Axoniq feedback. Maybe I am just at a point you already reached and I will fall if I take one more step … or I found a way around the obstacles you already encountered and this could really become something the community or Axoniq itself might want to investigate further. Only one way to find out: Try it, and come back to me.

Really looking forward to hearing from you, so excited
Jan

This looks really good, Jan.

Not sure if that was me; or maybe it was, but then I was talking about another serializer. I did a proof of concept with Protobuf at the time, which I like because of the formal (external) schema definition. Technically, it actually works just perfectly. The only issue was that the generated classes were not very “constructor”-friendly. In the case of events, creating them would involve calling a builder method, then setting all the properties (with more builders when there are nested objects) and then a “build()” call. More or less what creating a Kotlin object instance would be :wink: .
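
For readers who have not used generated classes: the builder style in question looks roughly like this. The class below is a hand-written stand-in with invented names, not actual Protobuf or Avro generator output; it only mimics the “builder, setters, build()” shape.

```java
// Hand-written stand-in for a generated, builder-only event class (hypothetical).
final class MoneyDepositedEvent {
    private final String accountId;
    private final int amount;

    // No public constructor: instances can only be created through the builder.
    private MoneyDepositedEvent(String accountId, int amount) {
        this.accountId = accountId;
        this.amount = amount;
    }

    static Builder newBuilder() {
        return new Builder();
    }

    String getAccountId() {
        return accountId;
    }

    int getAmount() {
        return amount;
    }

    static final class Builder {
        private String accountId;
        private int amount;

        Builder setAccountId(String accountId) {
            this.accountId = accountId;
            return this;
        }

        Builder setAmount(int amount) {
            this.amount = amount;
            return this;
        }

        // Required fields are only checked here, at the very end.
        MoneyDepositedEvent build() {
            if (accountId == null) {
                throw new IllegalStateException("accountId is required");
            }
            return new MoneyDepositedEvent(accountId, amount);
        }
    }

    public static void main(String[] args) {
        MoneyDepositedEvent event = MoneyDepositedEvent.newBuilder()
                .setAccountId("A1")
                .setAmount(100)
                .build();
        System.out.println(event.getAccountId() + " " + event.getAmount());
    }
}
```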

I noticed in the sources that you still rely on Jackson for the serialization of metadata. I suppose that’s only meant for the “complex objects” that are potentially put inside the metadata? (Which is something we generally recommend against but decided to “allow” for historical reasons).

Hi Allard,

I asked the same question regarding the Json in MetaData. In fact the structure of the metadata is a map<String, Object>, and it is possible to implement the GenericRecord format for it (it is a map with arbitrary keys). This way we would drop the dependency on Jackson entirely.

  • More a general question, do you think we should continue work on this and make it shiny (for now it is a working PoC)?
  • We would build more examples and write some articles and docs on that.
  • Do you think we can develop it as a community extension?
  • Regarding the intermediate representation, it is possible to convert the bytes into a generic record representation (GenericRecord, see the Apache Avro Java 1.7.6 API) and do all the upcasting on that level. Do you think it is a good abstraction for that?
  • We are reasoning about schema registries. Apart from Apicurio and some other implementations, an interesting idea would be to deploy a registry that uses Axon Server itself to store schemas…
  • In general, which features would you like to see in order to be convinced? More examples, more docs, or something else?

Cheers,

Simon

Awesome work; thanks for the write-up Jan!

I’d love to have a session like this.
That would allow an efficient discussion on the matter to deduce how this might be carried further.
Let’s start a mail conversation to actually schedule such a meeting if that suits you, Jan. I assume @allardbz wants to be there too, so it would be good to add both of us.

Thanks for the reply, invitation accepted.

To raise some open issues (summary from above):

  • Metadata serialization via Jackson: I was lazy … it is easily overcome by shipping a “map-style” schema with the extension, as Simon said
  • Poor handling of avro generated classes via Builder: Yes, it is not that great … the default maven/gradle generator is quite old. But it does the job and does not feel very different compared to the lombok/value/builder style. Of course kotlin data classes are nicer, but there are already some projects addressing this issue … I would be fine living with the “limitations”
  • How can we use the axon server web app search to work with avro serialization?
  • Implement upcasting based on GenericData, explore dynamic upcasting via schema defaults
  • Support more than one schema registry provider
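
On the “dynamic upcasting via schema defaults” point: Avro’s schema resolution rules say that a field missing from the writer schema is filled from the reader schema’s default, a writer field unknown to the reader is ignored, and a missing field without a default is an error. A rough sketch of that idea, with plain maps standing in for GenericRecord and for the reader schema (all names hypothetical, no Avro dependency):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: applies reader-schema defaults to an event that was
// written with an older schema. Maps stand in for Avro's GenericRecord.
final class DefaultsUpcaster {

    // readerFields: field name -> default value (empty Optional = no default).
    static Map<String, Object> upcast(Map<String, Object> written,
                                      Map<String, Optional<Object>> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Optional<Object>> field : readerFields.entrySet()) {
            String name = field.getKey();
            if (written.containsKey(name)) {
                // Field exists in both schemas: keep the written value.
                result.put(name, written.get(name));
            } else if (field.getValue().isPresent()) {
                // Field is new in the reader schema: use its default.
                result.put(name, field.getValue().get());
            } else {
                throw new IllegalArgumentException("missing field without default: " + name);
            }
        }
        // Fields only known to the writer are dropped implicitly.
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> written = new LinkedHashMap<>();
        written.put("accountId", "A1");
        written.put("amount", 100);

        Map<String, Optional<Object>> readerFields = new LinkedHashMap<>();
        readerFields.put("accountId", Optional.empty());
        readerFields.put("amount", Optional.empty());
        readerFields.put("currency", Optional.<Object>of("EUR")); // added later, with default

        System.out.println(upcast(written, readerFields));
    }
}
```

With something like this, adding a field with a default would not need a hand-written upcaster at all; only incompatible changes would.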