UUID as identifier

As of now java.util.UUID is used for all kinds of identifiers in Axon.
Nothing wrong with this, but it makes some things harder, at least in
my code.

Before I started using Axon a typical constructor would look like:

public Order(
    final OrderId id,
    final ShareId shareId,
    final ParticipantId participantId,
    final LocalDateTime timestamp,
    final Side side,
    final Money price,
    final Volume volume
)

Now (Order, Share and Participant being aggregate roots) it looks like
this:

public Order(
    final UUID id,
    final UUID shareId,
    final UUID participantId,
    final LocalDateTime timestamp,
    final Side side,
    final Money price,
    final Volume volume
)

This is because java.util.UUID is final and OrderId cannot inherit it,
which would otherwise solve the problem.

Wouldn't it be possible to have this instead?

// In Axon
////////////////

/**
* Use this instead of UUID everywhere
*/
public interface UniqueIdentifier {
    // Nothing needed here. equals, hashCode and toString are inherited
    // from java.lang.Object.
}

public class UUIDIdentifier implements UniqueIdentifier {
    private final UUID identifier;

    public UUIDIdentifier(UUID identifier) {
        this.identifier = identifier;
    }

    @Override
    public boolean equals(Object o) {
        // Compare the wrapped UUIDs, and only for the same concrete id type.
        return o != null
            && getClass() == o.getClass()
            && this.identifier.equals(((UUIDIdentifier) o).identifier);
    }

    @Override
    public int hashCode() {
        return this.identifier.hashCode();
    }

    @Override
    public String toString() {
        return this.identifier.toString();
    }
}

// In my project
///////////////////
public class OrderBookId extends UUIDIdentifier {
    public OrderBookId(UUID identifier) {
        super(identifier);
    }
}

A better example of how code gets easier to read:

With UUID:

private final Map<UUID, Volume> orderVolume = Maps.newHashMap();

With custom ids:

private final Map<OrderBookId, Volume> orderVolume = Maps.newHashMap();

Hi Per,

Using IDs expressed in the Ubiquitous Language is something I would really recommend. In Axon, I have chosen to use the UUID for several reasons. Before I explain what these are, I would like to say that the fact that UUID is final should not stop you from doing that.

In general, it is considered good practice to favor composition over inheritance. That means your OrderId would look as follows:

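public class OrderId {

    private final UUID identifier; // note that this could also be a String value, which is more Hibernate compatible

    public static OrderId fromUUID(UUID identifier) {
        return new OrderId(identifier);
    }

    private OrderId(UUID identifier) {
        this.identifier = identifier;
    }

    public UUID toUUID() {
        return identifier;
    }

    // equals and hashCode make OrderId usable as a map key
    @Override
    public boolean equals(Object o) {
        return o instanceof OrderId && identifier.equals(((OrderId) o).identifier);
    }

    @Override
    public int hashCode() {
        return identifier.hashCode();
    }

    @Override
    public String toString() {
        return identifier.toString();
    }
}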

In your commands, you would use OrderId to explicitly distinguish between IDs of different types of aggregates. When loading from a repository, you could do:

repository.load(orderId.toUUID(), expectedVersion) // or without the second parameter, if you’re on Axon 0.5

Another nice approach would be to also add the version attribute, making it a versioned identifier. This identifier would normally be handed to a UI through a query. The UI doesn’t have to bother with versions: all it has to do is pass the versioned ID it got from the query back into the command, and concurrent modifications are detected automatically. (Note: concurrent modification detection is available since 0.6.)
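
As a rough illustration of the versioned identifier idea, here is a minimal sketch. VersionedOrderId and its method names are hypothetical and not part of Axon; the only thing taken from this thread is the repository.load(id, expectedVersion) call shape, which appears here as a comment.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical value object: the aggregate identifier plus the version the client last saw.
final class VersionedOrderId {
    private final UUID identifier;
    private final long version;

    VersionedOrderId(UUID identifier, long version) {
        this.identifier = Objects.requireNonNull(identifier, "identifier");
        this.version = version;
    }

    UUID toUUID() { return identifier; }

    long expectedVersion() { return version; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof VersionedOrderId)) return false;
        VersionedOrderId other = (VersionedOrderId) o;
        return version == other.version && identifier.equals(other.identifier);
    }

    @Override
    public int hashCode() { return Objects.hash(identifier, version); }

    @Override
    public String toString() { return identifier + "@" + version; }
}

final class VersionedOrderIdDemo {
    public static void main(String[] args) {
        // The query side hands this object to the UI; the UI sends it back unchanged in a command.
        VersionedOrderId id = new VersionedOrderId(UUID.randomUUID(), 3L);
        // A command handler would then pass both parts on, for example:
        // repository.load(id.toUUID(), id.expectedVersion());
        System.out.println(id);
    }
}
```

The UI never interprets the version; it only carries it along, which is what keeps the client code simple.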

Note that an aggregate may always be created with an existing identifier. There is nothing against using “super(identifier)” in the constructor of an aggregate. Just make sure that there is a clear difference between the constructor in your implementation used to reconstruct an existing aggregate, and the one for creating a new one altogether.

I hope this helps you on your way.

As promised, here are some reasons for choosing UUID as primary identifier of aggregates and events:

  • UUID is a global standard for unique identifiers
  • UUIDs support scalability. Sequence numbers hinder scalability, since machines need to communicate with each other to avoid generating the same IDs.
  • UUIDs come in different versions, so you can convert almost any type of identifier into a UUID. For example, if you use a sequence number for invoice IDs (which is a legal obligation in many countries),
    you could generate that as a long, and derive a version 5 UUID from it, which uses SHA-1 to compute a UUID for any input.
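
The invoice number example could be sketched as follows. Note that java.util.UUID.nameUUIDFromBytes produces a version 3 (MD5) UUID, so a version 5 UUID has to be assembled by hand. The class name NameBasedUuid, the namespace value and the invoice number below are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

final class NameBasedUuid {
    private NameBasedUuid() {}

    // Builds a version 5 (SHA-1, name-based) UUID following the RFC 4122 layout.
    static UUID version5(UUID namespace, String name) {
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
        // Hash the 16 namespace bytes followed by the name bytes.
        sha1.update(ByteBuffer.allocate(16)
                .putLong(namespace.getMostSignificantBits())
                .putLong(namespace.getLeastSignificantBits())
                .array());
        byte[] hash = sha1.digest(name.getBytes(StandardCharsets.UTF_8));

        hash[6] = (byte) ((hash[6] & 0x0f) | 0x50); // set the version field to 5
        hash[8] = (byte) ((hash[8] & 0x3f) | 0x80); // set the RFC 4122 variant bits

        // Only the first 16 of the 20 SHA-1 bytes end up in the UUID.
        ByteBuffer first16 = ByteBuffer.wrap(hash, 0, 16);
        return new UUID(first16.getLong(), first16.getLong());
    }
}

final class InvoiceIdDemo {
    // Hypothetical namespace for invoice numbers; any fixed UUID works as long as it never changes.
    static final UUID INVOICE_NAMESPACE = UUID.fromString("0f2b6c1e-5d1a-4c3e-9b7a-2f4d8e6a1c90");

    public static void main(String[] args) {
        long invoiceNumber = 20100042L; // made-up sequence number
        UUID id = NameBasedUuid.version5(INVOICE_NAMESPACE, Long.toString(invoiceNumber));
        System.out.println(id + " (version " + id.version() + ")");
    }
}
```

The same invoice number always yields the same UUID, so the sequence number stays the business identifier while the aggregate still gets a UUID.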

And one could think of more reasons, of course.

Cheers,

Allard

Hi!

Thanks for your explanation. I think I was still stuck in the old way
of thinking there. Of course I can use OrderId everywhere else and
pass it in through commands and out again through events. I only have
to convert (extract) to UUID in one place in the command handler when
loading the aggregate. That works fine for me. I've been coding some
more the last day and all of this has started to sink in pretty well
now. I really like this way of structuring different "black boxes"
that communicate through events.

Another thing about the ids that I'm thinking about:

I don't want to pass in an id in a CreateFooCommand since I don't want
the client to pick ids, I generate them in the command handler and
pass them out in a FooCreatedEvent. This works fine in the application
since you always have the id as a reference when performing a new
action on the Foo. It gets harder when writing tests where I want to
create one Foo with command 1 and then run command 2 which needs the
id of the newly created Foo. What's the suggested way here?

As far as I can tell, there are three options for ID generation and feedback:

  1. Client generates a UUID. Since UUID.randomUUID() will virtually always generate a unique identifier, this could be an acceptable solution. Though I can imagine that for some applications, even a 1x10E-16 chance of a duplicate is unacceptable. The server could throw an exception and notify the client of an ID conflict to solve that problem.

  2. Client generates a CorrelationID, which it sends with the command. All events generated as a direct result of that command would have that CorrelationID in them. The client can detect events generated with that CorrelationID and extract the generated aggregate ID from that event. In some cases, you might even be able to use a business ID to correlate commands and events.

  3. Client sends a plain command using the commandBus.dispatch(command, callback) method. The handler creates an aggregate and returns the ID of the newly created aggregate. The client is notified when command handling is complete, and receives the ID in the callback. If the command handler is synchronous, the callback is called before the dispatch method returns, but it’s bad practice to rely on that. If you need the command to finish before continuing, consider using the FutureCallback. It acts as a java.util.concurrent.Future, and allows you to explicitly wait for a command to be processed.

I believe that none of these three options is better than the others. It is mostly a matter of taste.
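
To make option 3 a bit more concrete, here is a minimal sketch of waiting for a callback result. It deliberately does not use Axon classes: SketchCommandBus is a hypothetical stand-in for a bus with a callback-style dispatch method, and java.util.concurrent.CompletableFuture plays the role that FutureCallback plays in Axon.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for a command bus with a callback-style dispatch method.
final class SketchCommandBus {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Handles the command on another thread and reports the new aggregate ID to the callback.
    void dispatch(String createFooCommand, Consumer<UUID> callback) {
        worker.submit(() -> callback.accept(UUID.randomUUID()));
    }

    void shutdown() {
        worker.shutdown();
    }
}

final class DispatchAndWaitDemo {
    // Dispatches the command and blocks until the handler has reported the generated ID.
    static UUID createFoo(SketchCommandBus bus) {
        CompletableFuture<UUID> result = new CompletableFuture<>();
        bus.dispatch("CreateFoo", result::complete);
        try {
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("The command was not handled in time", e);
        }
    }

    public static void main(String[] args) {
        SketchCommandBus bus = new SketchCommandBus();
        try {
            UUID fooId = createFoo(bus);
            // A test can now send a second command that refers to fooId.
            System.out.println("Created Foo " + fooId);
        } finally {
            bus.shutdown();
        }
    }
}
```

Because the caller waits on the future explicitly, the code behaves the same whether the handler runs synchronously or on another thread.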

Cheers,

Allard

Thanks for the summary. I think I'll go with (2). For testing purposes
I can always monitor the event stream for, let's say, a "name" property
and have a map from name to aggregate id.
There are cases where real clients will do the same thing with a
"referenceId" field that only has to be unique per client.

(3) will not be possible for me since commands are sent through a
distributed queue and I don't think I will bother with a callback
mechanism there.
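
A minimal sketch of the bookkeeping behind option (2), assuming the created event carries both the generated aggregate id and the correlation id sent by the client. FooCreatedEvent and CorrelationTracker are hypothetical names for illustration, not Axon types.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical event carrying the generated aggregate id and the client's correlation id.
final class FooCreatedEvent {
    final UUID aggregateId;
    final String correlationId;

    FooCreatedEvent(UUID aggregateId, String correlationId) {
        this.aggregateId = aggregateId;
        this.correlationId = correlationId;
    }
}

// Listens to the event stream and remembers which aggregate belongs to which correlation id.
final class CorrelationTracker {
    private final Map<String, UUID> aggregateIds = new ConcurrentHashMap<>();

    void on(FooCreatedEvent event) {
        aggregateIds.put(event.correlationId, event.aggregateId);
    }

    Optional<UUID> aggregateIdFor(String correlationId) {
        return Optional.ofNullable(aggregateIds.get(correlationId));
    }
}

final class CorrelationDemo {
    public static void main(String[] args) {
        CorrelationTracker tracker = new CorrelationTracker();
        // In a test, command 1 would cause this event; here it is published by hand.
        tracker.on(new FooCreatedEvent(UUID.randomUUID(), "client-42/request-1"));
        // Command 2 can now be built with the id looked up by correlation id.
        System.out.println(tracker.aggregateIdFor("client-42/request-1").isPresent());
    }
}
```

A "name" or "referenceId" field works the same way, as long as it is unique within the scope in which it is looked up.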

How about Generics? Maybe it's a good idea to create a generic version
or a wrapper around the Java UUID class?

Then it should be possible to create a type-safe version of the ID:

public Order(
    final UUID<Order> id,
    :
)

Maybe it's in general a good idea to create an interface for the UUID
to open up the possibility to choose other (possibly faster) UUID
implementations?
(See http://johannburkard.de/blog/programming/java/Java-UUID-generators-compared.html)

It could be something like this:

public interface AggregateId<TYPE extends AggregateRoot> {
    public String toString();
}

Of course you'd need an accompanying factory:

public interface AggregateIdFactory {
    public <TYPE extends AggregateRoot> AggregateId<TYPE> create(Class<TYPE> aggregateType);
    public <TYPE extends AggregateRoot> AggregateId<TYPE> fromString(Class<TYPE> aggregateType, String aggregateIdString);
}

Should be relatively easy to replace the existing java.util.UUID with
something like that.
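
For illustration, a UUID-backed implementation of these two interfaces might look like the sketch below. The AggregateRoot marker interface, UuidAggregateId, UuidAggregateIdFactory and the two example aggregates are assumptions made so the example compiles on its own; they are not existing framework classes.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical marker standing in for the framework's aggregate root type.
interface AggregateRoot {}

interface AggregateId<TYPE extends AggregateRoot> {
    String toString();
}

interface AggregateIdFactory {
    <TYPE extends AggregateRoot> AggregateId<TYPE> create(Class<TYPE> aggregateType);
    <TYPE extends AggregateRoot> AggregateId<TYPE> fromString(Class<TYPE> aggregateType, String aggregateIdString);
}

// Assumed default implementation: a UUID tagged with the aggregate type it identifies.
final class UuidAggregateId<TYPE extends AggregateRoot> implements AggregateId<TYPE> {
    private final Class<TYPE> aggregateType;
    private final UUID value;

    UuidAggregateId(Class<TYPE> aggregateType, UUID value) {
        this.aggregateType = Objects.requireNonNull(aggregateType);
        this.value = Objects.requireNonNull(value);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof UuidAggregateId)) return false;
        UuidAggregateId<?> other = (UuidAggregateId<?>) o;
        return aggregateType.equals(other.aggregateType) && value.equals(other.value);
    }

    @Override
    public int hashCode() { return Objects.hash(aggregateType, value); }

    @Override
    public String toString() { return value.toString(); }
}

final class UuidAggregateIdFactory implements AggregateIdFactory {
    @Override
    public <TYPE extends AggregateRoot> AggregateId<TYPE> create(Class<TYPE> aggregateType) {
        return new UuidAggregateId<>(aggregateType, UUID.randomUUID());
    }

    @Override
    public <TYPE extends AggregateRoot> AggregateId<TYPE> fromString(Class<TYPE> aggregateType, String aggregateIdString) {
        return new UuidAggregateId<>(aggregateType, UUID.fromString(aggregateIdString));
    }
}

final class Order implements AggregateRoot {}
final class Participant implements AggregateRoot {}

final class AggregateIdDemo {
    public static void main(String[] args) {
        AggregateIdFactory factory = new UuidAggregateIdFactory();
        AggregateId<Order> orderId = factory.create(Order.class);
        AggregateId<Order> same = factory.fromString(Order.class, orderId.toString());
        System.out.println(orderId.equals(same)); // true: same aggregate type and same UUID
        // AggregateId<Participant> wrong = orderId; // would not compile
    }
}
```

The type parameter gives compile-time safety, but it also ties every place that holds an id to the aggregate class.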

The only thing I found is that several types like AggregateRoot are not
generified!? This would be a problem, as the typed id needs a concrete type.
Was there any concrete reason not to use generics?

Cheers,
Michael

Hi Michael,

at the very beginning of Axon (when it wasn’t even named yet), I spent a lot of time discussing with colleagues about the identifier to use. I noticed that the UUID is a very scalability-safe identifier to use and uniqueness is (almost) guaranteed, depending on the type/version you use.

The idea to abstract the actual identifier used has crossed my mind several times. The thing holding me back from implementing this is the extra layer of abstraction. I don’t think it will make it much easier to those not in need of another type of identifier. And besides that, the UUID is used as a unique identifier in frameworks like Spring as well. Spring Integration is littered with UUIDs to uniquely identify messages.

The UUID generation performance is mainly related to the UUID.randomUUID() method. Other frameworks might be able to generate UUIDs faster, but do they have the same randomness to them? If so, converting them to Java 5 UUIDs should be a walk in the park.

I’m very reluctant when it comes to adding an extra layer of abstraction to the identifier of aggregates, as I believe it will introduce unneeded complexity for the majority of users.

What do you mean by “Was there any concrete reason not to use generics?” What did I need to generify in the aggregate root?

Cheers,

Allard

> I'm very reluctant when it comes to adding an extra layer of abstraction to
> the identifier of aggregates, as I believe it will introduce unneeded
> complexity for the majority of users.

I can understand the reasons why you chose the UUID. But moving
towards a framework may raise different needs for different users.
Your choice may not be the best choice for other users (as you can see
with Per's request ;-)). Maybe someone prefers to have a Long value as
ID for the aggregates because it's absolutely essential not to get a
duplicate value and it's generated by some external kind of process.
This would be easy with an interface. Also type-safety is an important
argument for not using solely the java.util.UUID. Finally I don't
think it really adds complexity to the majority of users - The only
thing that changes for them is that they have to use the new ID type
instead of a UUID. The implementation details are hidden and the first
implementation would be of course a java.util.UUID based one to
maintain backward compatibility.

> What do you mean by "Was there any concrete reason not to use generics?"
> What did I need to generify in the aggregate root?

Only if you have this:

public interface AggregateRoot<TYPE> {
    AggregateId<TYPE> getIdentifier();
    :
}

It's possible to implement this:

public abstract class AbstractAggregateRoot<TYPE> implements AggregateRoot<TYPE> {
    private final AggregateId<TYPE> identifier;
    :
}

And Per can do this:

public class Contact extends AbstractAggregateRoot<Contact> {
   :
}

So the interfaces and the abstract classes need to be generic.

Cheers,
Michael

Hi Michael,

there’s nothing like a good discussion. I have to admit that I am receptive to your comments. I initially had a different structure of identifiers and generics in mind, however.

I was thinking of: class AbstractAggregateRoot<ID>, where ID is the type of identifier to use. For example: AbstractAggregateRoot<AggregateIdentifier>, or even more explicit: AbstractAggregateRoot.

But probably the one you suggested is simpler and still flexible enough.

The only methods an AggregateIdentifier should have is a toString method (and a way to reconstruct it from a String, I guess). And, of course, it needs to be serializable.

The default constructor of the AbstractAggregateRoot could generate a GenericAggregateIdentifier (but the implementation is hidden behind the generic interface), which uses a randomly generated UUID as its inner value.

What do you think?

Cheers,

Allard

I created a possible interface and default implementation here:
http://www.fuin.org/files/AggregateId.zip

Hi all,

as you might have noticed, I created an issue for this one in the issue tracker. I have also started implementing it.

Now, I’ve encountered some problems, that I would like to share with you. If you have any suggestions, please let me know.

I have identified four options:
1- Keep the UUID as primary identifier and expect framework users to wrap it into a specific identifier type if they need to (current implementation).
2- Create a type AggregateIdentifier, which is an interface that users may extend. getAggregateIdentifier methods will return AggregateIdentifier. This solution does not use generic types.
3- Create a generically typed AggregateIdentifier<T>, where T is the Aggregate. Identifiers are always typed.
4- Type the aggregate root with the type of identifier used. This allows users to create a strongly typed identifier if they want to, which is returned by the getAggregateIdentifier methods.

Now about the solutions:
1 and 2 are relatively simple to implement (1 being the current implementation). 2 has the advantage that you are no longer tightly bound to UUID. Although I firmly believe that a UUID is the safest to use, it allows other implementations to be used (such as the faster UUID variants Michael was talking about). Both these options require the framework user (i.e. a developer) to either override getAggregateIdentifier methods on the aggregates and events to return a more explicit AggregateIdentifier instance, or create a separate method that casts the AggregateIdentifier to the actual type expected.

3 is the option proposed by Michael, if I understood correctly. This method seems to provide the cleanest API. However, since Aggregate Identifiers are present on both Aggregates and Events (and probably also Commands), this means that the T parameter must be visible to them. Since T is the aggregate itself, it means that the DomainEvent instances must be aware of the aggregate (implementation) they originate from. In other words, an object that is usually only needed by a CommandHandler now needs to be visible to many more components. This means that your application’s API (the commands and events) now has a dependency on the implementation (the aggregate used to process the command). Furthermore, the massive usage of generics in nearly all components makes this solution way too intrusive, especially for the developers that just want to stick to defaults.

4 would be an alternative solution to 3. Since an AggregateIdentifier is typically part of the API, this won’t introduce unwanted dependencies. But it does require typing the DomainEvent with the type of identifier returned by getAggregateIdentifier. It doesn’t require iffy dependencies, but it is still a very intrusive solution. Especially for the users that just want to stick to the defaults.

My feeling is that option 2 adheres best to my motto: make the default way easy, but an alternative possible. If you want to force events to return another type of identifier, just cast or wrap the identifier returned by the parent class’ method. Since it stays away from extensive generics usage, it is also a lot easier to use.
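
A minimal sketch of what option 2 could look like from a user's point of view, under the assumption that the base class exposes a getAggregateIdentifier method returning the plain interface. SketchAggregateRoot, UuidAggregateIdentifier, Contact and ContactId are hypothetical names; the actual implementation may differ.

```java
import java.io.Serializable;
import java.util.Objects;
import java.util.UUID;

// Option 2 as a sketch: a plain, non-generic identifier interface.
interface AggregateIdentifier extends Serializable {}

// Assumed default implementation backed by a random UUID.
class UuidAggregateIdentifier implements AggregateIdentifier {
    private final UUID value;

    UuidAggregateIdentifier() { this(UUID.randomUUID()); }

    UuidAggregateIdentifier(UUID value) { this.value = Objects.requireNonNull(value); }

    @Override
    public boolean equals(Object o) {
        // Identifiers of different concrete types are never equal, even with the same UUID.
        return o != null && getClass() == o.getClass()
                && value.equals(((UuidAggregateIdentifier) o).value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }

    @Override
    public String toString() { return value.toString(); }
}

// A user-defined identifier expressed in the Ubiquitous Language.
final class ContactId extends UuidAggregateIdentifier {
    ContactId(UUID value) { super(value); }
}

// Hypothetical base class; users who stick to the defaults never see a custom id type.
abstract class SketchAggregateRoot {
    private final AggregateIdentifier identifier;

    protected SketchAggregateRoot() { this(new UuidAggregateIdentifier()); }

    protected SketchAggregateRoot(AggregateIdentifier identifier) {
        this.identifier = Objects.requireNonNull(identifier);
    }

    public AggregateIdentifier getAggregateIdentifier() { return identifier; }
}

final class Contact extends SketchAggregateRoot {
    Contact(ContactId id) { super(id); }

    // Covariant return type: callers get a ContactId without casting themselves.
    @Override
    public ContactId getAggregateIdentifier() {
        return (ContactId) super.getAggregateIdentifier();
    }
}

final class OptionTwoDemo {
    public static void main(String[] args) {
        Contact contact = new Contact(new ContactId(UUID.randomUUID()));
        ContactId id = contact.getAggregateIdentifier();
        System.out.println(id);
    }
}
```

The cast lives in one place, inside the aggregate, so no generics leak into events or commands.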

I am curious about what you think. Feel free to comment/discuss.

Cheers,

Allard

Hi Allard,

> However, since Aggregate Identifiers are present on both Aggregates and
> Events (and probably also Commands), this means that the T parameter
> must be visible to them. Since T is the aggregate itself, it means that the
> DomainEvent instances must be aware of the aggregate (implementation)
> they originate from.

I think this is a strong argument against solution 3.

Solution 2 ("Create a type AggregateIdentifier, which is an
interface") seems to be the best way to open up alternative
implementations.
Here is a suggestion for an interface with a factory and a UUID
default implementation: http://www.fuin.org/files/AggregateIdV2.zip

Cheers,
Michael