Multi-leader database replication. Is it all worth it?

Namrata
3 min read · Jan 18, 2024


Why do we require multi-leader databases?

  1. Accept writes in multiple data centers, improving write performance for users near each one.
  2. Tolerate network problems between data centers.
  3. Let every other data center keep operating independently even if one data center goes down.

But there are always tradeoffs.

How should we handle write conflicts across leaders? Collisions between auto-incrementing keys? Repairing data that has diverged across leaders because of a bug? And more.

The simplest way to solve a problem is to avoid having it in the first place.

How can we not have conflicts in multi-leaders?

If we ensure that all writes to a particular record always go through the same leader, conflicts cannot occur. This preserves the write-performance benefit across data centers, but if that leader's data center fails, writes must be rerouted to another leader, and we ultimately have to deal with conflicts anyway :(
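One way to sketch this routing idea: hash each record's key to a fixed "home" leader, so every writer sends writes for that record to the same data center. The names here (`LEADERS`, `home_leader`) are illustrative, not from any particular database.

```python
# Sketch: avoid write conflicts by routing all writes for a given
# record through the same "home" leader (hypothetical example).
import hashlib

LEADERS = ["dc-us-east", "dc-eu-west", "dc-ap-south"]

def home_leader(record_key: str) -> str:
    """Deterministically map a record key to one leader datacenter."""
    digest = hashlib.sha256(record_key.encode()).hexdigest()
    return LEADERS[int(digest, 16) % len(LEADERS)]

# Every client that writes "user:42" routes to the same leader,
# so no two leaders ever accept conflicting writes for that record.
assert home_leader("user:42") == home_leader("user:42")
```

The catch is exactly the one noted above: if a record's home data center fails, writes must be rerouted elsewhere, and conflicts become possible again.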

Ways to deal with conflicts

There are two main ways to resolve conflicts:

  1. The application resolves conflicts: the application (or the database itself) detects a conflict, and a snippet of code resolves it. One common technique is last-write-wins (LWW).
    Let me know in the comments if you want a blog discussing the various techniques. I would be happy to do that :)
  2. The user resolves conflicts: store all the conflicting versions, and the next time the user reads the data, prompt them to merge the versions and write the result back to the database.
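The LWW technique mentioned above can be sketched in a few lines: tag each write with a timestamp and, on conflict, keep the latest one. This is a minimal illustration, not a production implementation; real systems must also worry about clock skew between leaders.

```python
# Minimal last-write-wins (LWW) conflict resolution sketch.
from dataclasses import dataclass

@dataclass
class Write:
    value: str
    timestamp: float  # wall-clock time attached by the accepting leader

def resolve_lww(conflicting: list[Write]) -> Write:
    """Keep the write with the highest timestamp; the others are
    silently discarded -- the well-known downside of LWW."""
    return max(conflicting, key=lambda w: w.timestamp)

a = Write("title = Draft", timestamp=100.0)
b = Write("title = Final", timestamp=105.0)
assert resolve_lww([a, b]).value == "title = Final"
```

Note that the losing write (`title = Draft`) is dropped entirely, which is why LWW is simple but lossy.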

How do multiple leaders communicate among themselves?

Circular topology

Each node receives writes from one node and forwards those writes (along with its own) to the next node.

Star topology

One designated root node forwards writes to all the other nodes.

In such architectures, a write can end up in an infinite replication loop, circulating forever across the nodes.

How to avoid infinite replication loops?

Each node has a unique identifier, and every entry in the replication log is tagged with the identifiers of all the nodes it has passed through.
Whenever a node receives a change tagged with its own identifier, it ignores the change, since it knows it has already been processed; this prevents infinite looping.
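The tagging scheme above can be sketched as follows. Each message carries the set of node IDs it has visited; a node that sees its own ID drops the message instead of forwarding it. The class and field names are illustrative only.

```python
# Sketch: preventing infinite replication loops by tagging each
# change with the IDs of the nodes it has passed through.
from dataclasses import dataclass, field

@dataclass
class ReplicationMessage:
    change: str
    seen_by: set = field(default_factory=set)  # node IDs already visited

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.peers = []    # nodes this node forwards writes to
        self.applied = []  # changes applied locally

    def receive(self, msg):
        if self.node_id in msg.seen_by:
            return  # already processed here -- break the loop
        self.applied.append(msg.change)
        msg.seen_by.add(self.node_id)
        for peer in self.peers:
            peer.receive(msg)

# Circular topology: a -> b -> c -> a
a, b, c = Node("a"), Node("b"), Node("c")
a.peers, b.peers, c.peers = [b], [c], [a]

a.receive(ReplicationMessage("set x = 1"))
# Each node applies the change exactly once; when the message
# comes back around to "a", it is ignored and the loop terminates.
assert a.applied == b.applied == c.applied == ["set x = 1"]
```

Without the `seen_by` check, the same message would circle through a → b → c → a forever.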

But there is one more issue here: a single point of failure.

If one node fails, replication to the nodes that depend on it is interrupted as well.

How to prevent a single point of failure?

All-to-all topology

Here, fault tolerance is better because messages can travel along different paths, avoiding a single point of failure.

But it certainly comes with its own complexities: some network links may be faster than others, so some messages may overtake others, leading to data inconsistencies under concurrent writes. Version vectors can be used to handle this ordering problem.

Version vectors deserve a whole blog of their own, so they are not discussed here.

This should have given you a fair picture of multi-leader architecture. So, do you think it is all worth it? Or should we avoid multi-leader replication?

Reference:

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, by Martin Kleppmann
