On ES/CQRS: the good, the bad, an alternative

Recently I saw yet another blog post explaining and advocating for ES/CQRS, and of course also rationalising and trying to work around its shortcomings (no transactions, eventual consistency).

Fwiw, the post is well written, and I don’t want to pick on it. It just happens to be the latest one I encountered.

I don’t intend to explain ES/CQRS here. I’m not even going to write out the acronyms. The post assumes you know a bit about the subject already.

The good

ES/CQRS comes with a set of nice properties:

allows building scalable systems
all changes are captured in the event log, so history can be audited, reinstated, …
allows separate query models/databases

The bad

Unfortunately there are also a few downsides:

eventual consistency for the query components
lack of transactions involving multiple entities

Part of that last point is also known as the ES set validation problem. There’s a lot of talk about ways to work around the lack of transactions. The suggestions boil down to the following:

choose your aggregates differently (bigger?)
rethink your business domain
use sagas
“but but … why do you want to do this, there are off-the-shelf solutions for this”

In my opinion these are all quite weak. (I shouldn’t have to do the first two. The third is a way of building your own custom transactions. And the last one I barely want to go into: we should be more ambitious and use approaches that enable more use cases!)

It seems to me that ES/CQRS proponents believe that transactions are not scalable, and from that position they want to argue that you don’t need them.

For me, however, it’s quite clear: transactions are a huge convenience, and if you’re using a system without them you’re making life needlessly hard on yourself. If you’re not convinced about that, go read FoundationDB’s Transaction Manifesto!
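To make that convenience concrete, here is a minimal sketch of a change that touches two entities at once. It uses SQLite only because it ships with Python, not because it is one of the scalable databases this post has in mind. The `accounts` schema and the `open_db`, `transfer` and `balances` helpers are made up for the illustration.

```python
import sqlite3


def open_db():
    """Create an in-memory database with two accounts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:  # commit the initial rows
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
        )
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if rejected."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Debit second on purpose: if it violates the CHECK constraint,
            # the credit above is rolled back as well.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = open_db()
print(transfer(conn, "alice", "bob", 30))   # accepted
print(transfer(conn, "alice", "bob", 500))  # rejected, nothing changes
print(balances(conn))
```

The invariant (no negative balance) lives in one place, and a rejected transfer leaves both rows untouched. Without transactions, the same guarantee would need a saga or a differently shaped aggregate.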

Also, by now it has become clear that transactions do scale.

An alternative

So what should one do to get the upsides of ES/CQRS and still have transactions available?

The route I would choose is to start with a scalable transactional database (e.g. one of the above) that has built-in change data capture (CDC). (Note: I haven’t verified whether all of the above support CDC.) Built-in CDC basically means that the database exposes a stream of (timestamped) events for all changes made to the database state.

Using change data capture, one can (relatively easily) build up alternative query models, export the data into other systems, keep old versions around, …
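As a rough sketch of what building an alternative query model from a change stream could look like: the event shape below (`ChangeEvent` with a commit timestamp, a key, and the new row or `None` for a delete) is an assumption for illustration, not the format of any particular database’s CDC feed. The `OrdersByCustomer` model is likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One change from a hypothetical CDC stream."""
    commit_ts: int           # commit timestamp assigned by the database
    key: str                 # primary key of the changed row
    value: Optional[dict]    # new row contents, or None for a delete


class OrdersByCustomer:
    """Alternative query model: order ids grouped by customer."""

    def __init__(self):
        self._rows = {}          # key -> latest row, needed to undo old index entries
        self._by_customer = {}   # customer -> set of order keys
        self.applied_ts = 0      # commit timestamp of the last applied event

    def apply(self, ev):
        """Apply one change event; events must arrive in commit order."""
        old = self._rows.pop(ev.key, None)
        if old is not None:
            self._by_customer[old["customer"]].discard(ev.key)
        if ev.value is not None:
            self._rows[ev.key] = ev.value
            self._by_customer.setdefault(ev.value["customer"], set()).add(ev.key)
        self.applied_ts = ev.commit_ts

    def orders_of(self, customer):
        """Return a copy of the set of order keys for `customer`."""
        return set(self._by_customer.get(customer, ()))


model = OrdersByCustomer()
stream = [
    ChangeEvent(1, "o1", {"customer": "ann"}),
    ChangeEvent(2, "o2", {"customer": "ann"}),
    ChangeEvent(3, "o1", {"customer": "bob"}),  # o1 is reassigned to bob
    ChangeEvent(4, "o2", None),                 # o2 is deleted
]
for ev in stream:
    model.apply(ev)
print(model.orders_of("ann"), model.orders_of("bob"), model.applied_ts)
```

The query model only ever consumes committed changes, so the write side keeps its transactions. Keeping old versions around would amount to storing the events instead of folding them away.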

In short, I think one can get the ES/CQRS positives without throwing transactions away.

(Or find yourself the ultimate database that offers all of this in a convenient package.)

What about eventual consistency of the separate query components?

That is indeed still a problem in the above approach.

I could try to argue here about whether you really need separate query components … and some people probably have them for the wrong reasons, but there are certainly some good reasons too, such as dumping everything into your data lake / analytical database, or pushing the data into a system that is good at full-text search.

So can we solve that problem too? Yes, I believe we can!

Using the watermarks concept from stream processing, it should be possible to know what the propagation delay towards this other system is. And once you can track it, you can also wait it out.
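A minimal sketch of the “track it, then wait it out” idea, under a few assumptions: the downstream system reports a watermark, meaning every change with a commit timestamp at or below it has been applied there, and a writer knows the commit timestamp of its own transaction. The `Watermark` class and its methods are hypothetical names, not part of streamy-db or any other library.

```python
import threading


class Watermark:
    """Tracks how far a downstream query component has caught up."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def advance(self, ts):
        """Called by the downstream consumer after applying all changes <= ts."""
        with self._cond:
            if ts > self._value:  # a watermark never moves backwards
                self._value = ts
                self._cond.notify_all()

    def lag(self, source_ts):
        """Propagation delay, in timestamp units, relative to the source."""
        with self._cond:
            return max(0, source_ts - self._value)

    def wait_for(self, ts, timeout=None):
        """Block until the watermark reaches ts; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= ts, timeout=timeout)


wm = Watermark()
commit_ts = 42  # timestamp of a transaction that was just committed
# Simulate the downstream system catching up a little later.
threading.Timer(0.05, wm.advance, args=(commit_ts,)).start()
caught_up = wm.wait_for(commit_ts, timeout=2.0)
print(caught_up, wm.lag(commit_ts))
```

A reader who needs to see its own write would call `wait_for` with the write’s commit timestamp before querying the other system. Everyone else can query right away and accept the lag.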

My hope is to eventually do some work towards this in my streamy-db project (introductory blog).