How it Works
The fundamental idea of Event Sourcing is that of ensuring every change to the state of an application is captured in an event object, and that these event objects are themselves stored in the sequence they were applied for the same lifetime as the application state itself.
Let’s consider a simple example to do with shipping notifications. In this example we have many ships on the high seas, and we need to know where they are. A simple way to do this is to have a tracking application with methods to allow us to tell when a ship arrives at or leaves a port.
Figure 1: A simple interface for tracking shipping movements.
In this case when the service is called, it finds the relevant ship and updates its location. The ship objects record the current known state of the ships.
Introducing Event Sourcing adds a step to this process. Now the service creates an event object to record the change and processes it to update the ship.
Figure 2: Using an event to capture the change.
Looking at just the processing, this is just an unnecessary level of indirection. The interesting difference is when we look at what persists in the application after a few changes. Let’s imagine some simple changes:
- The Ship ‘King Roy’ departs San Francisco
- The Ship ‘Prince Trevor’ arrives at Los Angeles
- The Ship ‘King Roy’ arrives in Hong Kong
With the basic service, we see just the final state captured by the ship objects. I’ll refer to this as the application state.
Figure 3: State after a few movements tracked by simple tracker.
With Event Sourcing we also capture each event. If we are using a persistent store the events will be persisted just the same as the ship objects are. I find it useful to say that we are persisting two different things: an application state and an event log.
Figure 4: State after a few movements tracked by event sourced tracker.
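The difference between the two trackers can be sketched in a few lines of Python. This is only an illustration: the names Ship, ArrivalEvent, DepartureEvent, and TrackingService, and the "at sea" placeholder, are assumptions of the sketch and not taken from the figures.

```python
from dataclasses import dataclass

AT_SEA = "at sea"  # assumed placeholder for a ship that is between ports


@dataclass
class Ship:
    name: str
    location: str = AT_SEA


@dataclass(frozen=True)
class ArrivalEvent:
    ship: str
    port: str

    def process(self, ships):
        ships[self.ship].location = self.port


@dataclass(frozen=True)
class DepartureEvent:
    ship: str
    port: str

    def process(self, ships):
        ships[self.ship].location = AT_SEA


class TrackingService:
    def __init__(self, ships):
        self.ships = {s.name: s for s in ships}  # application state
        self.event_log = []                      # event log

    def record(self, event):
        # Every change goes through an event, which is logged and then applied.
        self.event_log.append(event)
        event.process(self.ships)


service = TrackingService([Ship("King Roy", "San Francisco"), Ship("Prince Trevor")])
service.record(DepartureEvent("King Roy", "San Francisco"))
service.record(ArrivalEvent("Prince Trevor", "Los Angeles"))
service.record(ArrivalEvent("King Roy", "Hong Kong"))
```

After the three movements, service.ships holds only the final locations, while service.event_log holds all three events in the order they were applied.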
The most obvious thing we’ve gained by using Event Sourcing is that we now have a log of all the changes. Not just can we see where each ship is, we can see where it’s been. However this is a small gain. We could also do this by keeping a history of past ports in the ship object, or by writing to a log file whenever a ship moves. Both of these can give us an adequate history.
The key to Event Sourcing is that we guarantee that all changes to the domain objects are initiated by the event objects. This leads to a number of facilities that can be built on top of the event log:
- Complete Rebuild: We can discard the application state completely and rebuild it by re-running the events from the event log on an empty application.
- Temporal Query: We can determine the application state at any point in time. Notionally we do this by starting with a blank state and rerunning the events up to a particular time or event. We can take this further by considering multiple time-lines (analogous to branching in a version control system).
- Event Replay: If we find a past event was incorrect, we can compute the consequences by reversing it and later events and then replaying the new event and later events. (Or indeed by throwing away the application state and replaying all events with the correct event in sequence.) The same technique can handle events received in the wrong sequence – a common problem with systems that communicate with asynchronous messaging.
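The three facilities above can be sketched with a single replay function. This is a minimal sketch under simplifying assumptions: the ShipMoved event, its sequence field, and the replay function are hypothetical names, and event replay is shown using the "throw away the state and replay everything" route, not reversal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShipMoved:
    sequence: int   # position in the event log
    ship: str
    location: str   # port name, or "at sea"

    def process(self, state):
        state[self.ship] = self.location


def replay(event_log, up_to=None):
    """Rebuild state from a blank start, optionally stopping after event `up_to`."""
    state = {}
    for event in event_log:
        if up_to is not None and event.sequence > up_to:
            break
        event.process(state)
    return state


log = [
    ShipMoved(1, "King Roy", "at sea"),
    ShipMoved(2, "Prince Trevor", "Los Angeles"),
    ShipMoved(3, "King Roy", "Hong Kong"),
]

current = replay(log)           # complete rebuild
earlier = replay(log, up_to=2)  # temporal query: state as of event 2

# Event replay: event 3 was wrong, so substitute the correct event and rebuild.
corrected_log = log[:2] + [ShipMoved(3, "King Roy", "Singapore")]
corrected = replay(corrected_log)
```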
A common example of an application that uses Event Sourcing is a version control system. Such a system uses temporal queries quite often. Subversion uses complete rebuilds whenever you use dump and restore to move stuff between repository files. I’m not aware of any that do event replay since they are not particularly interested in that information. Enterprise applications that use Event Sourcing are rarer, but I have seen a few applications (or parts of applications) that use it.
Application State Storage
The simplest way to think of using Event Sourcing is to calculate a requested application state by starting from a blank application state and then applying the events to reach the desired state. It’s equally simple to see why this is a slow process, particularly if there are many events.
In many applications it’s more common to request recent application states. If so, a faster alternative is to store the current application state, and if someone wants the special features that Event Sourcing offers then that additional capability is built on top.
Application states can be stored either in memory or on disk. Since an application state is purely derivable from the event log, you can cache it anywhere you like. A system in use during a working day could be started at the beginning of the day from an overnight snapshot and hold the current application state in memory. Should it crash it replays the events from the overnight store. At the end of the working day a new snapshot can be made of the state. New snapshots can be made at any time in parallel without bringing down the running application.
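As a rough sketch of the snapshot idea, the following Python keeps a copy of the state together with the number of events it covers, and recovers by loading that copy and replaying only the later events. SnapshottingTracker and its methods are hypothetical names, and everything lives in memory here purely for illustration.

```python
import copy


def apply(state, event):
    ship, location = event
    state[ship] = location


class SnapshottingTracker:
    def __init__(self):
        self.state = {}
        self.event_log = []
        self.snapshot = ({}, 0)  # (copied state, number of events it covers)

    def record(self, event):
        self.event_log.append(event)
        apply(self.state, event)

    def take_snapshot(self):
        self.snapshot = (copy.deepcopy(self.state), len(self.event_log))

    def recover(self):
        """Restart after a crash: load the snapshot, then replay the later events."""
        saved_state, covered = self.snapshot
        self.state = copy.deepcopy(saved_state)
        for event in self.event_log[covered:]:
            apply(self.state, event)


tracker = SnapshottingTracker()
tracker.record(("King Roy", "San Francisco"))
tracker.take_snapshot()                    # e.g. the overnight snapshot
tracker.record(("King Roy", "Hong Kong"))
tracker.state = {}                         # in-memory state lost in a crash
tracker.recover()
```

Because the snapshot is just a cache of something derivable from the log, losing it costs time and nothing else: recovery falls back to replaying from the start.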
The official system of record can either be the event logs or the current application state. If the current application state is held in a database, then the event logs may only be there for audit and special processing. Alternatively the event logs can be the official record and databases can be built from them whenever needed.
Structuring the Event Handler Logic
There are a number of choices about where to put the logic for handling events. The primary choice is whether to put the logic in Transaction Scripts or a Domain Model. As usual, Transaction Scripts are better for simple logic and a Domain Model is better when things get more complicated.
In general I have noticed a tendency to use Transaction Scripts with applications that drive changes through events or commands. Indeed some people believe that this is a necessary way of structuring systems that are driven this way. This is, however, an illusion.
A good way to think of this is that there are two responsibilities involved. Processing domain logic is the business logic that manipulates the application. Processing selection logic is the logic that chooses which chunk of processing domain logic should run depending on the incoming event. You can combine these together, essentially this is the Transaction Script approach, but you can also separate them by putting the processing selection logic in the event processing system, which then calls a method in the domain model that contains the processing domain logic.
Once you’ve made that decision, the next is whether to put the processing selection logic in the event object itself, or have a separate event processor object. The problem with the processor is that it necessarily runs different logic depending on the type of event, which is the kind of type switch that is abhorrent to any good OOer. All things being equal you want the processing selection logic in the event itself, since that’s the thing that varies with the type of event.
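One way to picture this separation is the sketch below, where the selection logic lives in each event's process method and the domain logic lives on the ship. The class names and the arrive/depart methods are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Ship:
    """Domain model: holds the processing domain logic."""

    def __init__(self, name):
        self.name = name
        self.port = None

    def arrive(self, port):
        self.port = port

    def depart(self):
        self.port = None


class DomainEvent(ABC):
    def __init__(self, ship):
        self.ship = ship

    @abstractmethod
    def process(self):
        """Processing selection logic: pick the domain method to call."""


class ArrivalEvent(DomainEvent):
    def __init__(self, ship, port):
        super().__init__(ship)
        self.port = port

    def process(self):
        self.ship.arrive(self.port)


class DepartureEvent(DomainEvent):
    def process(self):
        self.ship.depart()


class EventProcessor:
    def __init__(self):
        self.log = []

    def process(self, event):
        # No type switch: the event's own type decides what runs.
        self.log.append(event)
        event.process()


king_roy = Ship("King Roy")
processor = EventProcessor()
processor.process(ArrivalEvent(king_roy, "Hong Kong"))
```

The processor stays the same however many event types are added, since each new type brings its own process method.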
Of course all things aren’t always equal. One case where having a separate processor can make sense is when the event object is a DTO which is serialized and de-serialized by some automatic means that prohibits putting code into the event. In this case you need to find another home for the event’s selection logic. My inclination would be to avoid this if at all possible. If you can’t, then treat the DTO as a hidden data holder for the event and still treat the event as a regular polymorphic object. In this case it’s worth doing something moderately clever to match the serialized event DTOs to the actual events using configuration files or (better) naming conventions.
If there’s no need to reverse events, then it’s easy to make a Domain Model ignorant of the event log. Reversing logic makes this more tricky since the Domain Model needs to store and retrieve the prior state, which makes it much more handy for the Domain Model to be aware of the event log.
As well as events playing themselves forwards, it’s also often useful for them to be able to reverse themselves.
Reversal is the most straightforward when the event is cast in the form of a difference. An example of this would be "add $10 to Martin’s account" as opposed to "set Martin’s account to $110". In the former case I can reverse by just subtracting $10, but in the latter case I don’t have enough information to recreate the past value of the account.
If the input events don’t follow the difference approach, then the event should ensure it stores everything needed for reversal during processing. You can do this by storing the previous values on any value that is changed, or by calculating and storing differences on the event.
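The two styles of event can be contrasted in a short sketch. DepositEvent and SetBalanceEvent are hypothetical names, and a plain dictionary of balances stands in for the accounts.

```python
class DepositEvent:
    """Difference-style event: reversal needs no extra stored data."""

    def __init__(self, account, amount):
        self.account, self.amount = account, amount

    def process(self, balances):
        balances[self.account] += self.amount

    def reverse(self, balances):
        balances[self.account] -= self.amount


class SetBalanceEvent:
    """Absolute-style event: must remember the prior value while processing."""

    def __init__(self, account, new_balance):
        self.account, self.new_balance = account, new_balance
        self.prior_balance = None

    def process(self, balances):
        self.prior_balance = balances[self.account]
        balances[self.account] = self.new_balance

    def reverse(self, balances):
        balances[self.account] = self.prior_balance


balances = {"Martin": 100}
events = [DepositEvent("Martin", 10), SetBalanceEvent("Martin", 250)]
for event in events:
    event.process(balances)
after_processing = dict(balances)

for event in reversed(events):  # undo in the opposite order
    event.reverse(balances)
```

Reversing in the opposite order to processing matters here: the set-balance event restores the 110 it saw, and only then does the deposit subtract its 10.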
This requirement to store has a significant consequence when the processing logic is inside a domain model, since the domain model may alter its internal state in ways which shouldn’t be visible to the event object’s processing. In this case it’s best to design the domain model to be aware of events and to be able to use them in order to store prior values.
It’s worth remembering that all the capabilities of reversing events can be done instead by reverting to a past snapshot and replaying the event stream. As a result reversal is never absolutely needed for functionality. However it may make a big difference to efficiency since you may often be in a position where reversing a few events is much more efficient than using forward play on a lot of events.
One of the tricky elements to Event Sourcing is how to deal with external systems that don’t follow this approach (and most don’t). You get problems when you are sending modifier messages to external systems and when you are receiving queries from other systems.
Many of the advantages of Event Sourcing stem from the ability to replay events at will, but if these events cause update messages to be sent to external systems, then things will go wrong because those external systems don’t know the difference between real processing and replays.
To handle this you’ll need to wrap any external systems with a Gateway. This in itself isn’t too onerous since it’s a thoroughly good idea in any case. The gateway has to be a bit more sophisticated so it can deal with any replay processing that the Event Sourcing system is doing.
For rebuilds and temporal queries it’s usually sufficient for the gateways to be able to be disabled during the replay processing. You want to do this in a way that’s invisible to the domain logic. If the domain logic calls PaymentGateway.send it should do so whether or not you are in replay mode. The gateway should handle that distinction by having a reference to the event processor and checking whether it’s in replay mode before passing the external call off to the outside world.
External updates get more complicated if you are using Retroactive Event; see the discussion there for the gory details.
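A minimal sketch of such a gateway follows. Only PaymentGateway.send comes from the text above; the is_replaying flag, the transport callable, and the use of plain callables as events are assumptions made to keep the example short.

```python
class EventProcessor:
    def __init__(self):
        self.is_replaying = False
        self.log = []

    def process(self, event):
        self.log.append(event)
        event()

    def replay(self):
        self.is_replaying = True
        try:
            for event in self.log:
                event()
        finally:
            self.is_replaying = False


class PaymentGateway:
    def __init__(self, processor, transport):
        self.processor = processor
        self.transport = transport  # callable that really talks to the outside

    def send(self, payment):
        # Domain logic always calls send; the gateway decides whether to forward.
        if not self.processor.is_replaying:
            self.transport(payment)


sent = []
processor = EventProcessor()
gateway = PaymentGateway(processor, sent.append)

processor.process(lambda: gateway.send({"to": "port authority", "amount": 10}))
processor.replay()  # rebuild: the payment must not go out a second time
```

The event calls gateway.send in exactly the same way during both runs; only the gateway knows that the second call is a replay.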
Another tactic that you might see with external systems is buffering the external notifications by time. It may be that we don’t need to make the external notification right away, instead we only need to do it at the end of the month. In this case we can reprocess more freely until that time appears. We can deal with this either by having gateways that store external messages till the release date, or triggering the external messages through a notification domain event rather than doing the notification immediately.
The primary problem with external queries is that the data that they return has an effect on the results of handling an event. If I ask for an exchange rate on December 5th and replay that event on December 20th, I will need the exchange rate on Dec 5, not the later one.
It may be that the external system can give me past data by asking for a value on a date. If it can, and we trust it to be reliable, then we can use that to ensure consistent replay. It also may be that we are using Event Collaboration, in which case all we have to do is ensure we retain the history of changes.
If we can’t use those simple plans then we have to do something a bit more involved. One approach is to design the gateway to the external system so that it remembers the responses to its queries and uses them during replay. To be complete this means that the response to every external query needs to be remembered. If the external data changes slowly it may be reasonable to only remember changes when values change.
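A gateway that remembers its responses might look something like the sketch below. The ExchangeRateGateway name, the idea of keying remembered responses by event id, and the rate values are all assumptions of the example.

```python
class ExchangeRateGateway:
    def __init__(self, fetch_rate):
        self.fetch_rate = fetch_rate  # callable hitting the external system
        self.remembered = {}          # (event id, currency pair) -> response
        self.replaying = False

    def rate(self, event_id, pair):
        key = (event_id, pair)
        if self.replaying:
            # A KeyError here means the recording was incomplete.
            return self.remembered[key]
        response = self.fetch_rate(pair)
        self.remembered[key] = response
        return response


live_rates = {"USD/EUR": 0.84}
gateway = ExchangeRateGateway(lambda pair: live_rates[pair])

first = gateway.rate(event_id=1, pair="USD/EUR")     # original processing on Dec 5

live_rates["USD/EUR"] = 0.91                         # the rate has since moved
gateway.replaying = True
replayed = gateway.rate(event_id=1, pair="USD/EUR")  # replay on Dec 20
```

During replay the gateway never touches the external system, so the replayed event sees the same rate it saw the first time.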
Both queries and updates to external systems cause a lot of complication with Event Sourcing. You get the worst of both with interactions that involve both. Such an interaction might be an external call that both returns a result (a query) and causes a state change to the external system, such as submitting an order for delivery that returns delivery information on that order.
So far this discussion has made the assumption that the application processing the events stays the same. Clearly that’s not going to be the case. Events handle changes to data, what about changes to code?
We can think of three broad kinds of code changes here: new features, defect fixes, and temporal logic.
New features essentially add new capabilities to the system but don’t invalidate things that happened before. These can be added pretty freely at any time. If you want to take advantage of the new features with old events you can just reprocess the events and the new results pop up.
When reprocessing with new features you’ll usually want the external gateways turned off, which is the normal case. The exception is when the new features involve these gateways. Even then you may not want to notify for past events, if you do you’ll need to put some special handling in for the first reprocess of the old events. It’ll be kludgey, but you’ll only have to do it once.
Bug fixes occur when you look at past processing and realize it was incorrect. For internal stuff this is really easy to fix, all you need to do is make the fix and reprocess the events. Your application state is now fixed to what it should have been. For many situations this is really rather nice.
Again external gateways bring the complexity. Essentially the gateways need to track the difference between what happened with the bug, and what happens without it. The idea is similar to what needs to happen with Retroactive Events. Indeed if there’s a lot of reprocessing to consider it would be worth actually using the Retroactive Event mechanism to replace an event with itself, although to do that you’ll need to ensure the event can correctly reverse the buggy event as well as the correct one.
The third case is where the logic itself changes over time, a rule along the lines of "charge $10 before November 18 and $15 afterwards". This kind of stuff needs to actually go into the domain model itself. The domain model should be able to run events at any time with the correct rules for the event processing. You can do this with conditional logic, but this will get messy if you have much temporal logic. The better route is to hook strategy objects into a Temporal Property: something like chargingRules.get(aDate).process(anEvent). Take a look at Agreement Dispatcher for this kind of style.
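To make the charging example concrete, here is a small Python sketch of strategy objects hooked into a temporal property. The TemporalProperty and FlatCharge classes are illustrative assumptions, the year is invented for the example, and the rule is read as $15 from November 18 onwards.

```python
from bisect import bisect_right
from datetime import date


class TemporalProperty:
    """Maps effective-from dates to values; get() returns the value in force on a date."""

    def __init__(self):
        self._dates = []
        self._values = []

    def put(self, effective_from, value):
        index = bisect_right(self._dates, effective_from)
        self._dates.insert(index, effective_from)
        self._values.insert(index, value)

    def get(self, a_date):
        index = bisect_right(self._dates, a_date) - 1
        if index < 0:
            raise LookupError(f"no value in force on {a_date}")
        return self._values[index]


class FlatCharge:
    """Strategy object: charges a fixed amount for any event."""

    def __init__(self, amount):
        self.amount = amount

    def process(self, an_event):
        return self.amount


charging_rules = TemporalProperty()
charging_rules.put(date.min, FlatCharge(10))
charging_rules.put(date(2005, 11, 18), FlatCharge(15))  # the year is an assumed example

an_event = {"occurred": date(2005, 11, 17)}
charge = charging_rules.get(an_event["occurred"]).process(an_event)
```

An event dated November 17 is charged under the old rule no matter when it is processed or replayed, which is the property the domain model needs.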
There is potentially an overlap between dealing with bugs and temporal logic when old events need to be processed using the buggy code. This may lead into bi-temporal behavior: "reverse this event according to the rules for Aug 1 that we had on Oct 1 and replace it according to the rules for Aug 1 that we have now". Clearly this stuff can get very messy, so don’t go down this path unless you really need to.
Some of these issues can be handled by putting the code in the data. Using Adaptive Object Models that figure out the processing using configurations of objects is one way to do this. Another might be to embed scripts into your data using some directly executable language that doesn’t require compilation – embedding JRuby into a Java app for example. Of course the danger here is keeping this under proper configuration control. I would be inclined to do it by ensuring any change to the processing scripts was handled in the same way as any other update – through an event. (Although by now I’m certainly drifting away from observation to speculation.)
Events and Accounts
I’ve seen some particularly strong examples of Event Sourcing (and consequent patterns) in the context of accounting systems. The two have a very good synergy between them, both in their requirements (audit is very important for accounting systems) and in their implementation. A key factor here is that you can arrange things so that all the accounting consequences of a Domain Event are the creation of Accounting Entrys, and link these entries to the original event. This gives you a very good basis for tracking changes, reversal, and the like. In particular it simplifies the various adjustment techniques.
Indeed one way of thinking about accounts is that the Accounting Entrys are a log of all of the events that change the value of an Account, so that an Account is itself an example of Event Sourcing.
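Seen this way, an account balance is just the result of replaying its entries. The sketch below is one possible rendering: the field names, the source_event link, and the up_to parameter are assumptions of the example.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class AccountingEntry:
    amount: Decimal
    source_event: str  # link back to the domain event that caused the entry


@dataclass
class Account:
    entries: list = field(default_factory=list)

    def post(self, entry):
        self.entries.append(entry)

    def balance(self, up_to=None):
        """Derive the balance by replaying entries; `up_to` limits how many are counted."""
        counted = self.entries if up_to is None else self.entries[:up_to]
        return sum((e.amount for e in counted), Decimal("0"))


account = Account()
account.post(AccountingEntry(Decimal("100"), "usage-event-1"))
account.post(AccountingEntry(Decimal("-30"), "payment-event-2"))
```

Calling account.balance() replays every entry, while account.balance(up_to=1) is a small temporal query over the same log.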