Publish multiple versions of events

Hello guys!

I would like to ask the community for help.

I’m new to this field and thinking about using Event Sourcing + CQRS + a log-based architecture in our platform, and some aspects are not clear to me.

One of our goals is to reduce dependencies between different components. Among other things, this means we need to find a way NOT to update all event consumers (projections) together with the event emitters, as we can have hundreds of them. Some people say that one pattern to achieve this is to publish several versions of each event, so that every client can process the event. We could publish both the current and the previous version of an event, or even the last three versions, so old clients have a time window for upgrading. It seems reasonable, but I could not find anything about support for this pattern in the docs.

So, the question is: is it possible to publish two (three, and so on) versions of the same message? Actually, we don’t need them in the event storage, but in The Log, that is, in Kafka. We are considering using the Kafka publisher for this.
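To illustrate what I have in mind with the Kafka publisher, something like the sketch below: the same logical event sent twice, tagged with a version header so each consumer can pick the version it understands. The topic, header name, and payloads are all made up by me.

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DualVersionPublisher {

    private final KafkaProducer<String, String> producer;

    public DualVersionPublisher(Properties producerProps) {
        this.producer = new KafkaProducer<>(producerProps);
    }

    // Publish the same logical event in two schema versions, each tagged
    // with a hypothetical "schema-version" header for consumer routing.
    public void publish(String topic, String key, String payloadV1, String payloadV2) {
        ProducerRecord<String, String> v1 = new ProducerRecord<>(topic, key, payloadV1);
        v1.headers().add("schema-version", "1".getBytes(StandardCharsets.UTF_8));

        ProducerRecord<String, String> v2 = new ProducerRecord<>(topic, key, payloadV2);
        v2.headers().add("schema-version", "2".getBytes(StandardCharsets.UTF_8));

        producer.send(v1);
        producer.send(v2);
    }
}
```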

Also, it would be interesting to hear whether there are other patterns that allow for consumer upgrade lag.

Thank you,
Alexey

Hey Alexey,

Your question raises a lot of red flags… Why are there so many breaking changes in published events? Why are there so many dependencies between your components? Have you considered anti-corruption layers between your bounded contexts?

I would generally recommend against publishing multiple versions of an event, as it increases complexity: you have to deal with events that may or may not be duplicates (e.g. old events published without new events, old events published alongside new events, and new events published without old events, possibly with concurrent nodes processing different versions of the same event).

I upgrade my events using this workflow:

  1. Design the new event version
  2. Create upcasters that upcast the old event version to the new event version for all event consumers
  3. When all event consumers can process the new event version, convert the event producer to publish the new event version
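For step 2, here is a minimal upcaster sketch, assuming Axon Framework with the Jackson serializer; the event name and the added "currency" field are invented for illustration:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.axonframework.serialization.SimpleSerializedType;
import org.axonframework.serialization.upcasting.event.IntermediateEventRepresentation;
import org.axonframework.serialization.upcasting.event.SingleEventUpcaster;

// Upcasts the hypothetical OrderPlacedEvent from its unrevised form
// (revision null) to revision "1.0" by adding a defaulted field.
public class OrderPlacedEventUpcaster extends SingleEventUpcaster {

    private static final SimpleSerializedType OLD_TYPE =
            new SimpleSerializedType("com.example.OrderPlacedEvent", null);

    @Override
    protected boolean canUpcast(IntermediateEventRepresentation ir) {
        return ir.getType().equals(OLD_TYPE);
    }

    @Override
    protected IntermediateEventRepresentation doUpcast(IntermediateEventRepresentation ir) {
        return ir.upcastPayload(
                new SimpleSerializedType(OLD_TYPE.getName(), "1.0"),
                JsonNode.class,
                payload -> {
                    // give the new field a sensible default for old events
                    ((ObjectNode) payload).put("currency", "EUR");
                    return payload;
                });
    }
}
```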

This ensures there’s no duplicate processing of events, and it supports blue-green deployments. However, it doesn’t sound like this is a viable solution in your context.

Hi Wayne!

Thank you for your response! I’m just getting into the subject, and many questions come to mind.

This sounds reasonable if you think about one application or a single bounded context. But I’m thinking about event streaming as an enterprise-wide architecture concept, and in that case some aspects start to look different. At least, this is my current understanding, which is, of course, very incomplete. On the subjects you touched, here is what comes to my mind from the event streaming context:

  1. Dependencies… In event streaming, if we publish an event, we expect it to be consumed by someone. The more consumers we have for it, the more value we get from the event, as each consumer is a business function. And each consumer is an additional dependency. This seems inevitable… and it is a red flag for me too. We need practical ways to control and mitigate it.
  2. Many versions of events. Two points here:
     - Each consumer can handle events of the versions it currently supports and ignore the others. If it needs the new data, it has a reason to upgrade in sync with the event source. If it doesn’t, it can be more relaxed. We cannot push all consumers to change at the same moment; they are from different LOBs and developed by different teams with their own release plans.
     - To simplify life for consumers, we could actually pack the different versions of an event into the same message, so a consumer could look inside, check the metadata and pick the right version (see the sketch after this list).
  3. Anti-corruption:
     - It is OK as a temporary solution. But in this case it is needed all the time, as both systems (consumer and producer) need to evolve. In the case of event streams they should be highly decoupled. Consider, for example, anti-fraud or real-time marketing systems which listen to corporate event streams. The event sources probably could not even imagine that such apps would ever consume their events. And, of course, source and consumer should evolve as independently of each other as possible. So there will always be some API (event) mismatch which needs to be adapted.
     - So the anti-corruption layer will be one more ‘live’ component, which will change whenever either of the two systems changes. In this case it will be more a problem than a solution.
     - As the enterprise has lots of independently evolving parts, we will need many such components.
     - It also seems to me that multiple versions of the same event are just another way to implement an anti-corruption layer. We can do this either:
       - on the event source side using framework support (the easiest way);
       - in an event stream application which down-casts events and, probably, re-packs them (less reliable, as the down-cast logic, if any, should be co-located with the up-casting logic);
       - or some other way.
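To make the multi-version message from point 2 concrete, it could look roughly like this; all names here are hypothetical, just my current mental picture:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical multi-version envelope: the producer serializes every
// supported schema version of one logical event into a single message;
// each consumer then picks the newest version it understands.
public class EventEnvelope {

    private final String eventName;                     // e.g. "OrderPlaced"
    private final Map<String, byte[]> payloadByVersion; // e.g. "1.0" -> v1 bytes, "2.0" -> v2 bytes

    public EventEnvelope(String eventName, Map<String, byte[]> payloadByVersion) {
        this.eventName = eventName;
        this.payloadByVersion = payloadByVersion;
    }

    public String eventName() {
        return eventName;
    }

    // Consumer side: ask for the versions you support, most preferred first.
    public Optional<byte[]> pick(String... supportedVersions) {
        for (String version : supportedVersions) {
            byte[] payload = payloadByVersion.get(version);
            if (payload != null) {
                return Optional.of(payload);
            }
        }
        return Optional.empty();
    }
}
```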

It might be that using event sourcing in the context of event streaming is not the right way. Otherwise, the framework would already recognise the problem. Or there are other, better solutions to this problem. What do you think?

Alexey

Hello Alexey,

I’d like to share some of my own thoughts on your issue. Keep in mind that, like you, I’m new to these concepts myself.

To start off, I’d like to point out that events are really important, just as important as aggregates, the business logic, etc. They drive everything, so once an event is described it ideally should not be modified; modifying it changes the description of some part of the system.

As Wayne proposed, the migration between two versions of an event can be managed by introducing an upcaster on the consumer side.

It basically ensures that this consumer will be able to handle the new events.
Until you have implemented the new event handler on the consumer side, your consumer will not be able to process the new type of event.
Therefore, when the upstream service introduces new data in the event, this data will not be available to the downstream service; only after you implement the new handler for the event can you replay the events, upcast the old ones, and process the new ones to project data and run new business logic.
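In Axon, the consumer-side half of this is typically expressed with the @Revision annotation on the event class, which tells the upcaster chain which schema version the current code expects. A minimal sketch; the event class and its fields are made up:

```java
import org.axonframework.serialization.Revision;

// Revision "2.0" marks the schema this class represents; stored events
// with an older revision are routed through the upcaster chain first.
@Revision("2.0")
public class CustomerRegisteredEvent {

    private final String customerId;
    private final String email; // field introduced in revision 2.0

    public CustomerRegisteredEvent(String customerId, String email) {
        this.customerId = customerId;
        this.email = email;
    }
}
```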

The dependencies.
I’m not so sure why you are scared of them.

The bounded context that publishes the event does not care who the outside consumer is.
Instead of assuming, from the publisher’s perspective, that an event will be consumed by someone, I tend to think more from the consumer’s side of things: “What do I need to fulfill my responsibilities?”

The anti corruption layer.
It is a generic solution for things like that. When you introduce a consumer of events, a microservice or anything really, its logic works with the events of your system until one of the following happens:

  1. Upstream drastically changes its logic and its event publication (behavior)

This scenario ideally should not happen, because it would also drastically change the implementation of the downstream service.

The workaround for that is basically the same work you did when you first introduced this downstream service: you assumed the logic of the upstream and created your logic accordingly, and you need to redo that.

  2. Upstream introduces a new event (behavior)

When a brand new event is introduced by the upstream, you can basically ignore it if it does not affect the responsibilities of the downstream service; if it does, you need to implement the appropriate behavior and possibly replay some parts of the event stream.

  3. Upstream changes an existing event (data)

This is basically your case, isn’t it? When a new event version is introduced, it means that some of the fields have been deleted, added, or modified (renamed).

It should not really change anything about the intent of the event.
The problem for the downstream is that it does not recognize the data in this event, so once the upstream starts publishing such events, the downstream logic will halt at exactly the point where it handles the old event version.
This is where the upcaster comes in. It is not a simple v1 -> v2 mapper though; here’s what I think about it.

When an upcaster is introduced to a downstream service, the downstream team needs to stop and decide: what data should be put where there was no data before?

What are the implications of this data? Can we do something useful with it? Most of the time, since the downstream was not designed to work with the new data, it can just be blank, but you must consider the default if you want to do something with it properly.
Keep in mind that when an upstream event changes its schema, both the upstream and the downstream do not know what to do with previous events, and they should each decide what to do inside their own boundary.

TL;DR

What Wayne proposed is the solution I’d go with. You need to treat the events with special care; they are the source of everything.
The dependencies should be thought of as ‘what I need’ instead of ‘who needs me’; the transparency in Axon really helps here. The consumers are not dependencies of the publisher; rather, the publisher is a dependency of the consumer.
Say you have 5 services that ought to work alone: separate teams, repos, whatever. I think that ideally the downstreams should care about what’s going on with the upstream, not the other way around. But you know, different teams, different mindsets, different needs. That’s a human aspect of the organization; when you code, I do not think you should care about what your downstream is doing.

//EDIT

Axon identifies events by their class names, so I think that when you say you would like to publish two versions of an event and decide what should handle which version, you could simply introduce Event_v1, Event_v2 semantics instead of putting things inside the payload itself. Axon also supports custom metadata values, and you can fetch them whenever you handle an event, but I guess that would not work if you change the schema of the event (serialization issues).

I’ve read somewhere that people use the Event_vX naming, but it can get quite messy over time (code folding is more useful than ever here). I tend to keep my events in the API jars alongside the upcasters, so the downstream can see what has changed in an event.
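A rough sketch of that Event_vX idea with Axon event handlers; the event classes and fields below are made up:

```java
import org.axonframework.eventhandling.EventHandler;

// Both versions live side by side (e.g. in a shared API jar), and a
// consumer declares handlers only for the versions it understands.
public class OrderProjectionHandlers {

    @EventHandler
    public void on(OrderPlaced_v1 event) {
        // legacy shape: no currency field, so assume a default here
    }

    @EventHandler
    public void on(OrderPlaced_v2 event) {
        // new shape: use the explicit currency field
    }

    public record OrderPlaced_v1(String orderId, long amountCents) {}
    public record OrderPlaced_v2(String orderId, long amountCents, String currency) {}
}
```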

Hello Robert!

Thank you for your thoughts! Interesting. Although I wouldn’t say that I completely agree with everything :slight_smile:

  1. I’m still not sure that it is good to make the downstream (lots of them) change each time we change the upstream (OK, for each incompatible change). A good practical solution should follow the ‘sync when it makes sense’ principle. You consider whether you immediately need the new data from the new event. If not, you are happy with the old data set; for you, the event change is formal, not semantic. You need to upgrade at some moment, of course, but you put this task into the backlog and you may have a couple of sprints to do it. It doesn’t mean that you don’t need to know about event changes ahead of time. It means that the architecture does not require you to respond immediately, regardless of your own plans.
  2. It may work if you have a couple of services, but if you scale this to the corporate level, it becomes extremely difficult to sync tens of components and teams. Imagine a system which consumes events from 50 different publishers (anti-fraud, for example). Managing such a project would be a nightmare.

At the same time, I agree that each change should be considered in isolation on both sides. Sometimes synchronised changes are needed; for example, if the source system has changed so much that it just cannot provide the old data set anymore.

Probably my problem is that, due to my lack of experience with this architecture, I overestimate the practical frequency and impact of such changes. Maybe relaxed rules are good in practice.

Alexey

I would be very concerned about allowing so many external consumers to interface directly with the event stream in the way you describe. We’re not all-knowing; we make mistakes and we cannot predict the future. Having that many event consumers would prevent the system from evolving, out of fear of breaking some unknown downstream system.

Have you considered projections? A projection takes a stream of events and stores it in a purpose-built read model. With query update emitters, you can even subscribe to query updates (in effect, events for the projection) as the projection is updated. This effectively creates an anti-corruption layer that allows you to evolve your system while still maintaining backwards compatibility with consumers.
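A minimal sketch of such a projection, assuming Axon Framework 4; the event, query, and read-model types are invented for illustration, while QueryUpdateEmitter and the annotations are Axon’s:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.axonframework.eventhandling.EventHandler;
import org.axonframework.queryhandling.QueryHandler;
import org.axonframework.queryhandling.QueryUpdateEmitter;

// The projection is the stable, consumer-facing model: internal events
// can change shape while this read model stays backwards compatible.
public class OrderSummaryProjection {

    private final Map<String, OrderSummary> readModel = new ConcurrentHashMap<>();
    private final QueryUpdateEmitter updateEmitter;

    public OrderSummaryProjection(QueryUpdateEmitter updateEmitter) {
        this.updateEmitter = updateEmitter;
    }

    @EventHandler
    public void on(OrderPlaced event) {
        OrderSummary summary = new OrderSummary(event.orderId(), event.amountCents());
        readModel.put(event.orderId(), summary);
        // notify active subscription queries that match this order
        updateEmitter.emit(FindOrderSummary.class,
                query -> query.orderId().equals(event.orderId()),
                summary);
    }

    @QueryHandler
    public OrderSummary handle(FindOrderSummary query) {
        return readModel.get(query.orderId());
    }

    public record OrderPlaced(String orderId, long amountCents) {}
    public record OrderSummary(String orderId, long amountCents) {}
    public record FindOrderSummary(String orderId) {}
}
```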

Given how critical this is to the core of your business, and the consequences of getting it wrong, I would definitely consider hiring a consultant who has built these types of systems before.

Hello guys.

I really appreciate, Alexey, that you do not agree with some of the thoughts, because, well, I do not agree with them either when I think more broadly. It is a really complex problem; I’m trying to find patterns that could work for my domain, but as you say, sometimes it is not efficient to do it in certain ways. I really hope that both of us will find clean solutions to our problems.

To Wayne, about the projection as an anti-corruption layer: I’m not sure how it would affect the behavior overall. Truth be told, projections can be much more elastic than events, because they can be replayed and updated as we want, which brings up an interesting thought.

Considering that the upstream changes a lot, instead of using an upcaster, could we convert the events from the upstream into the read model, and have the read model emit updates which are picked up by the downstream to then emit domain events?

For me, projections are really things that we should use only for displaying data to end users; they should not drive the behavior.

Cheers

Besides event driven architecture:

If classes (e.g. an event) contain many properties that are needed for different reasons (no single responsibility),
and these may change frequently (unstable),
and they are shared and used all over the place,
then something might be wrong. :slight_smile:

Is there any possibility to „evolve“ without changing existing things (e.g. events)?
I mean, is it possible to extend the system by defining new events, new classes, and further interface implementations
without changing existing events (open/closed principle)?
Even if it does not make sense at first sight: an experiment would be to implement a new thing while touching almost nothing that exists.

If different teams work on different things: Could it be, that these are different „bounded contexts“?
Do these teams have their own language and use words differently than other teams?
„AccountCreated“ does mean something else, if you work for a bank or if you work for an online shop.

As soon as these things apply, you should not use events one-to-one. They need to be translated. This is done by domain services, an anti-corruption layer, ports and adapters, or whatever architecture pattern you might use for this challenge.
And yes, this is a lot of work. It means mapping. A lot of mapping. But it might help.
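As a tiny illustration of that mapping, reusing the „AccountCreated“ idea from above; all types here are hypothetical:

```java
// Boundary translation: an upstream banking event is mapped into the
// language of the shop context; only the fields the shop actually
// needs are carried over.
public class BankingToShopTranslator {

    public ShopCustomerRegistered translate(BankAccountCreated upstream) {
        return new ShopCustomerRegistered(upstream.customerId(), upstream.displayName());
    }

    // Upstream (banking) language
    public record BankAccountCreated(String customerId, String displayName, String iban) {}

    // Downstream (shop) language
    public record ShopCustomerRegistered(String customerId, String displayName) {}
}
```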

Hello guys!

Slowly but surely, the picture becomes a bit clearer through this conversation. Thank you for sharing your thoughts!

How I see this now.

I think my problem is that I’m trying to apply an approach that is application-scoped by nature to a wider context. The wider context has its own specifics, which require different tools.

First, if we talk about one single application following the microservices approach, then I tend to agree with your points. A group of services designed to work together to solve some business task (which is a definition of a microservices-based application) assumes some dependency among those components. If the application is well designed, changes inside the app are minimised, and some of the changes are encapsulated inside the responsible services without touching the others. Also, compatible changes reduce the need to sync releases. Even if we do need a sync, it is not a big problem, as those services are part of the same application, should evolve together and, in some cases, are developed by the same team. This is the context the Axon Framework is specifically designed for.

Second, if we consider a wider scope still following the microservice approach, then it would be dangerous to see it as one homogeneous set of services. That is unreliable and unmanageable. The wider scope can instead be modeled as a set of loosely coupled microservice-based applications. Coupling between different apps must be much weaker than coupling among services within one such application. This allows applications to evolve at their own pace (to some degree, of course).

This makes me think about the following principles if we consider a Log-based application:

  1. Each application may need its own local micro-log which can be used for application-specific tasks (projections). This can be a shared event store or Kafka.
  2. Events that are internal to a specific application are probably not the best candidates for publishing in the corporate Log:
     - First, they may be too low-level for wider use.
     - They may depend on implementation details of this app. If we publish them as they are, then we are no longer free to change our app design, as this would cause frequent problems on the consumer side. We could not evolve independently.
     - They may change often.
  3. Some kind of message translation is needed to produce events for the corporate Log based on application-local events, both to fix the granularity and to isolate changes. This is not a task for Axon, as it lies outside any one specific service.
     - As I understand it, this message translator is not formally an anti-corruption layer, as it is not built for a specific consumer.
     - The message translator:
       - allows applications to evolve;
       - protects the wider context from frequent application-scope changes;
       - makes wider-scope consumers sync only when source application changes need to be propagated to the corporate level and are not compatible;
       - is where we can implement multi-version messages, if needed, to further reduce the need for urgent syncs.
  4. Most likely, it must be considered part of the application interface and supported by the application team.
  5. It seems that the most appropriate implementation of such a translator component would be a streaming app which translates (filters, enriches, transforms) events from the application-local log to the corporate Log (a rough sketch below).
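For item 5, here is a rough sketch of what I imagine, assuming Kafka Streams; the topic names and the filter/translation logic are made up placeholders:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class EventTranslatorApp {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        builder.<String, String>stream("app-local-events")        // hypothetical local log topic
               .filter((key, value) -> isPublic(value))           // drop purely internal events
               .mapValues(EventTranslatorApp::toCorporateSchema)  // fix granularity, isolate changes
               .to("corporate-log");                              // hypothetical corporate Log topic

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-translator");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        new KafkaStreams(builder.build(), props).start();
    }

    // Both methods below are placeholders for real, application-specific logic.
    private static boolean isPublic(String eventJson) {
        return !eventJson.contains("\"internal\":true");
    }

    private static String toCorporateSchema(String eventJson) {
        return eventJson; // e.g. rename fields, enrich, or re-key here
    }
}
```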

Of course, this is not polished yet.

What do you think?

Alexey

Hi all,

Nice summary, Alexey. I want to be specific on a single point though.
I wouldn’t point towards a log-based application directly, as that’s focusing too much on a specific type of implementation, if you ask me.

I would rather take the route Johnny suggested, namely the notion of distinct Bounded Contexts as ‘the wider scope’.

Within a given Bounded Context all the contained components should speak the same language.
As soon as you start crossing Bounded Contexts, you should share consciously, likely with a different language.

The language should be viewed as the messages you have defined, as the messages are in essence your API.
That thus entails your Commands (and their potential responses), the Events (with versioning in mind) and your Queries with their Query Responses.

Being smart on a multi-context level is, by the way, one of the strong suits of Axon Server.
How you deal with sharing between the contexts can, however, be covered by a multitude of patterns that Johnny also touches upon.

In essence, I would suggest reading up on Bounded Contexts to strengthen your ideas in this area.
Using the standard language in this respect helps everybody get to the same understanding, I think.

That’s my two cents to the situation.

Cheers,
Steven

Hello!

Sounds reasonable.

We’ve decided not to force building this architecture right from the first steps; I see lots of dark corners here. Instead, we are taking a more gradual path, with bounded contexts in mind, applying more and more principles as we understand them and their mapping to our solution better. Fortunately, our initial phase is simple. A good playground. In particular, I agree that I need a better understanding of the practical aspects of bounded contexts. I will work on this.

Guys, I want to thank all of you for your help. It was very useful. I now see the right direction and have some principles and ideas in my head.

Alexey

Hi Alexey,

Great to hear that!
Hope to chat with you again on this forum somewhere in the future.

Cheers,

Steven van Beelen

Axon Framework Lead Developer

AxonIQ