Using protobuf for Commands and Events

Per_Wiklander · February 4, 2011, 1:51am

I've been wondering whether it would be possible to use Google
protocol buffers (http://code.google.com/p/protobuf/) or, more
specifically, protostuff (http://code.google.com/p/protostuff/) to
define, generate code for, and serialize / deserialize commands and
events in Axon Framework.

Scroll down for my Axon-related questions...

Here is a quick description of protobuf (read more at
http://code.google.com/apis/protocolbuffers/docs/overview.html).

Messages are defined in .proto files like this (simple) example:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}

The protobuf compiler then takes the message definition and generates
code in your language of choice. The official protobuf compiler from
Google supports C++, Java and Python. But there is a plugin structure
to add support for anything you want.

The generated code will give you message classes (Java POJOs in our
case) that contain the needed code to serialize and deserialize
themselves to and from the binary protobuf format. The messages are
forwards and backwards compatible through the field numbers. Let's
say we wanted to add the new field "nick" to the Person message:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
  optional string nick = 4;
}

And then we don't want to handle email anymore:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string nick = 4;
}

By never reassigning field numbers the generated code can handle both
old and new messages (think no need for versioning of Events).
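
To make the field-number point concrete, here is a minimal hand-rolled
sketch of the binary wire format in Java. It is not what the generated
code looks like; the class and method names (FieldNumberSketch,
readPerson, ...) are made up for illustration, and only the two wire
types needed for the Person example are handled. It shows a reader for
the newest schema (fields 1, 2 and 4) consuming a message that still
carries the old email field (3).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative only: real code would use the classes generated from the .proto file. */
final class FieldNumberSketch {

    // Unsigned varint: 7 payload bits per byte, high bit set while more bytes follow.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Length-delimited field (wire type 2): tag, byte length, UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (fieldNumber << 3) | 2);
        writeVarint(out, bytes.length);
        out.write(bytes, 0, bytes.length);
    }

    // Varint field (wire type 0), as used for int32.
    static void writeInt(ByteArrayOutputStream out, int fieldNumber, long value) {
        writeVarint(out, (fieldNumber << 3) | 0);
        writeVarint(out, value);
    }

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    /** Reader for the newest Person schema: knows fields 1, 2 and 4, not 3 (email). */
    static String readPerson(byte[] data) {
        String name = null;
        String nick = null;
        long id = 0;
        int[] pos = {0};
        while (pos[0] < data.length) {
            long tag = readVarint(data, pos);
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 7);
            if (wireType == 0) {
                long v = readVarint(data, pos);
                if (fieldNumber == 2) {
                    id = v;
                }
            } else if (wireType == 2) {
                int len = (int) readVarint(data, pos);
                String s = new String(data, pos[0], len, StandardCharsets.UTF_8);
                pos[0] += len;
                if (fieldNumber == 1) {
                    name = s;
                } else if (fieldNumber == 4) {
                    nick = s;
                }
                // Any other field number (such as 3) is read past and ignored.
            } else {
                throw new IllegalArgumentException("wire type not handled in this sketch: " + wireType);
            }
        }
        return "name=" + name + ", id=" + id + ", nick=" + nick;
    }

    public static void main(String[] args) {
        // A message from an "old" producer that still sets email (field 3).
        ByteArrayOutputStream old = new ByteArrayOutputStream();
        writeString(old, 1, "Per");
        writeInt(old, 2, 42);
        writeString(old, 3, "per@example.com");
        System.out.println(readPerson(old.toByteArray())); // prints "name=Per, id=42, nick=null"
    }
}
```

Because every field is prefixed with its number and wire type, the
reader can step over fields it does not know. Reusing number 3 for
something else later would break exactly this property.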

As I said, the default protobuf serialization format is binary (this
is what Google uses internally), but with the protostuff project comes
code generation for: json (with GWT overlay types), xml and yaml (only
serialization, but perfect for making messages human readable).

So with a single message definition you could have a C++ client that
sends binary commands over a socket to your Java CQRS server that then
gives JSON events to a JavaScript client in a browser.

END OF INTRO TO PROTOBUF

So, what would it take to use it with Axon?

Commands
For the commands there's no problem, since Axon doesn't require
anything special from a command.

Events
Here I can see even more value since they need to be both serialized
to the storage and sent across the network. But this is also the part
that needs a bit of love to make it work since Axon has requirements
on the Events. A plain vanilla protobuf POJO doesn't extend anything
but by creating our own template (they use StringTemplate for the code
generation) we could make the generated classes extend DomainEvent as
required.

So now I would like to know what extra fields need to be serialized
with the event to make it work in Axon? Things like identifier,
timestamp etc.
This might be a bit tricky to get working and I don't want to force
protobuf onto anyone who doesn't need it, so my thinking is to try to
get this working with zero change to the Axon Framework code.

So... anyone else interested?

Allard · February 4, 2011, 7:47am

Hi Per,

interesting stuff. I had heard about protobuf before, but hadn’t looked into it yet. Looks very promising as a serialization mechanism for Axon.

With regard to your question about which fields in the event should be serialized: there are a few. For all events, those are the identifier and the timestamp. For domain events, you’ll also need to serialize the aggregate identifier (which can be stored as a string - see AggregateIdentifier.asString()) and the sequence number.

Since 0.7, you can add arbitrary meta-data to any event. Does protobuf allow nesting of messages? This meta-data is built using key-value pairs. If it is possible to define that as a generic re-usable message, that would be perfect.
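
As a rough illustration of the fields listed above, the sketch below groups them into one envelope and round-trips it through bytes. Everything in it is an assumption made for the example: the class name EventEnvelopeSketch, the timestamp being epoch milliseconds, meta-data values being plain strings, and the use of java.io.DataOutputStream as a stand-in for a protobuf encoding. It is not Axon API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical envelope; field names follow the description above, not any Axon class. */
final class EventEnvelopeSketch {
    final String eventIdentifier;       // needed for all events
    final long timestampMillis;         // needed for all events (epoch millis is an assumption)
    final String aggregateIdentifier;   // domain events only, stored as a string
    final long sequenceNumber;          // domain events only
    final Map<String, String> metaData; // arbitrary key-value pairs

    EventEnvelopeSketch(String eventIdentifier, long timestampMillis,
                        String aggregateIdentifier, long sequenceNumber,
                        Map<String, String> metaData) {
        this.eventIdentifier = eventIdentifier;
        this.timestampMillis = timestampMillis;
        this.aggregateIdentifier = aggregateIdentifier;
        this.sequenceNumber = sequenceNumber;
        this.metaData = metaData;
    }

    byte[] toBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(eventIdentifier);
            out.writeLong(timestampMillis);
            out.writeUTF(aggregateIdentifier);
            out.writeLong(sequenceNumber);
            // Meta-data is written as a count followed by repeated key/value pairs.
            out.writeInt(metaData.size());
            for (Map.Entry<String, String> entry : metaData.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static EventEnvelopeSketch fromBytes(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            String eventIdentifier = in.readUTF();
            long timestampMillis = in.readLong();
            String aggregateIdentifier = in.readUTF();
            long sequenceNumber = in.readLong();
            int pairs = in.readInt();
            Map<String, String> metaData = new LinkedHashMap<String, String>();
            for (int i = 0; i < pairs; i++) {
                String key = in.readUTF();
                metaData.put(key, in.readUTF());
            }
            return new EventEnvelopeSketch(eventIdentifier, timestampMillis,
                    aggregateIdentifier, sequenceNumber, metaData);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> metaData = new LinkedHashMap<String, String>();
        metaData.put("user", "per");
        EventEnvelopeSketch original =
                new EventEnvelopeSketch("evt-1", 1296805620000L, "agg-7", 3L, metaData);
        EventEnvelopeSketch copy = fromBytes(original.toBytes());
        System.out.println(copy.aggregateIdentifier + " #" + copy.sequenceNumber + " " + copy.metaData);
    }
}
```

A protobuf version would express the same shape as message fields, with the meta-data as a repeated nested key/value message.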

I am looking forward to a patch containing a protobuf EventSerializer implementation.

Cheers,

Allard

Erik_van_Oosten · February 4, 2011, 12:02pm

It seems that ‘the’ hot serialization protocol of the moment is Avro.

Regards, Erik.

Per_Wiklander · February 4, 2011, 2:19pm

Nesting of messages, oh yes it does. I'm thinking of something like
this:

//
// User defined messages
//
message UserRegistered {
  required string user_id = 1;
  required string user_name = 2;
  required string email = 3;
  required string password = 4;
}

message UserVerified {
  required string user_id = 1;
  required string token = 2;
}

message UserRemoved {
  required string user_id = 1;
}

//
// Event container. Needed since the type of the
// serialized message is not stored. So we wrap
// our Event message in EventContainer. This allows us
// to treat the whole event stream as a stream of
// EventContainer. It also gives us a good spot to include
// the needed metadata.
//
message EventContainer {
  required int64 sequence_number = 1;
  required string aggregate_identifier = 2;
  optional int64 event_revision = 3 [default = 1];

  message EventMetaData {
    required string key = 1;
    required string value = 2;
  }

  repeated EventMetaData metadata = 4;

  required EventType type = 5;

  // We say here that we want to use field
  // id 100 to max for extended fields
  extensions 100 to max;
}

//
// This extends (as in adds to, not as in Java class inheritance)
// the EventContainer message with possible slots for the actual
// message.
// Doing eventContainer.getUserRegistered() would return
// null unless that is the actual type of the message, so
// we use the EventType enum value to know which message to
// expect.
//
// This part can be code generated from the user defined messages.
//
extend EventContainer {
  optional UserRegistered user_registered = 100;
  optional UserVerified user_verified = 101;
  optional UserRemoved user_removed = 102;
}

//
// An enum of available event types.
// This part can be code generated from the user defined messages.
//
enum EventType {
  USER_REGISTERED = 1;
  USER_VERIFIED = 2;
  USER_REMOVED = 3;
}
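
On the Java side, a reader of the stream would use the type field to
decide which slot to look at. The sketch below shows that dispatch
with plain hand-written classes standing in for the generated ones;
all class, field and method names are illustrative assumptions, not
output of protoc or protostuff, and UserVerified is left out to keep
it short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java stand-ins for generated classes; every name here is hypothetical. */
final class EventContainerSketch {

    enum EventType { USER_REGISTERED, USER_VERIFIED, USER_REMOVED }

    static final class UserRegistered {
        final String userId;
        final String userName;

        UserRegistered(String userId, String userName) {
            this.userId = userId;
            this.userName = userName;
        }
    }

    static final class UserRemoved {
        final String userId;

        UserRemoved(String userId) {
            this.userId = userId;
        }
    }

    static final class EventContainer {
        long sequenceNumber;
        String aggregateIdentifier;
        EventType type;
        final Map<String, String> metadata = new LinkedHashMap<String, String>();
        // One slot per event type, mirroring the extension fields; only one is set.
        UserRegistered userRegistered;
        UserRemoved userRemoved;
    }

    /** Returns the payload held in the slot that the type field points at. */
    static Object payloadOf(EventContainer container) {
        switch (container.type) {
            case USER_REGISTERED:
                return container.userRegistered;
            case USER_REMOVED:
                return container.userRemoved;
            default:
                throw new IllegalArgumentException("no slot in this sketch for " + container.type);
        }
    }

    public static void main(String[] args) {
        EventContainer container = new EventContainer();
        container.sequenceNumber = 0;
        container.aggregateIdentifier = "user-1";
        container.type = EventType.USER_REMOVED;
        container.userRemoved = new UserRemoved("user-1");
        container.metadata.put("origin", "admin-console");

        UserRemoved event = (UserRemoved) payloadOf(container);
        System.out.println(event.userId);
    }
}
```

Since both the enum and the switch follow mechanically from the list
of user defined messages, they are natural candidates for the same
code generation step.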

Per_Wiklander · February 4, 2011, 4:22pm

A clarification:

The posted code is a valid protobuf message definition. This will not
magically give us what we need for Axon's purposes; for that we need to
hook in some extra code generation on the Java side. What it does do
is let us stream our events in binary form (or JSON if you prefer
that) to any of the 23 supported programming languages. Which is
pretty nice already IMHO.

Allard · February 6, 2011, 11:29am

Hi Per,

maybe it’s even possible to create an XStream adapter that writes to protobuf (perhaps via JavaBeans). It shouldn’t be too hard to find a way to generically map Axon event types to the protobuf-generated Java files.

I’ll try some stuff soon. Looks interesting enough…

Cheers,

Allard