Distributed systems topics
Replication, consensus, time, CAP, consistency models, delivery semantics, quorum, leases, failure detection, gossip, CRDTs, sharding, service discovery, snapshots, streaming. Twenty topics that every distributed system has to take a position on. Each sub-page below works through one: the algorithm, the proof when there is one, and what it means for code that runs on more than one machine.
Every sub-page is a long-form walkthrough with papers, code, and the references the field actually uses: Lamport, Lynch, Brewer, Abadi, Shapiro, Kleppmann.
The twenty topics
Replication
Single-leader, multi-leader, leaderless. Synchronous vs asynchronous, log shipping vs row-based, the read-your-writes guarantee. The shape every replicated database has to choose.
Consensus — Paxos & Raft
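Why agreement is hard, why the FLP result shows that no deterministic protocol can guarantee consensus in an asynchronous system with even one crash fault, and why we ship Paxos and Raft anyway. Single-decree Paxos, Multi-Paxos, the Raft state machine, joint consensus.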
Time & clocks
Wall clocks lie. Lamport clocks order events, vector clocks detect concurrency, hybrid logical clocks pair physical time with a logical counter, TrueTime bounds uncertainty. The arithmetic of "happens-before".
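As a rough illustration of how vector clocks detect concurrency, here is a minimal sketch. The function names (`tick`, `merge`, `happens_before`, `concurrent`) and the dict-based representation are assumptions made for this example, not taken from any particular library.

```python
from typing import Dict

VectorClock = Dict[str, int]  # node id -> number of events seen from that node


def tick(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s own counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(local: VectorClock, received: VectorClock, node: str) -> VectorClock:
    """On message receipt: element-wise max, then tick the receiving node."""
    keys = set(local) | set(received)
    joined = {k: max(local.get(k, 0), received.get(k, 0)) for k in keys}
    return tick(joined, node)


def happens_before(a: VectorClock, b: VectorClock) -> bool:
    """True if a <= b in every component and a < b in at least one."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and some_less


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if each clock is ahead of the other in some component."""
    keys = set(a) | set(b)
    a_ahead = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    b_ahead = any(b.get(k, 0) > a.get(k, 0) for k in keys)
    return a_ahead and b_ahead
```

Two events on different nodes with no message between them compare as concurrent; once one node receives the other's clock and merges it, the earlier event happens-before the merged one.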
CAP & PACELC
Brewer's theorem in plain words, Abadi's PACELC refinement, and why every database vendor's marketing copy gets this wrong. What you actually trade away — and when.
Consistency models
The hierarchy readers actually care about. Linearizability, sequential consistency, causal consistency, eventual consistency — what each one promises, what it forbids, and why "Serializable" in your SQL database is a different axis entirely.
Idempotence at scale
At-least-once delivery is the rule; exactly-once delivery is a marketing term. How idempotency keys, deduplication windows, and the outbox pattern give you exactly-once effects on top of at-least-once delivery.
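The idempotency-key idea can be sketched in a few lines. `PaymentService` and its fields are hypothetical names for this example, and the in-memory dict stands in for a durable store. A real service would need the lookup and the write to happen atomically (for instance in one database transaction) and would expire old keys after a deduplication window.

```python
from typing import Dict


class PaymentService:
    """Toy handler that applies each idempotency key at most once."""

    def __init__(self) -> None:
        # idempotency key -> result of the first successful attempt
        self._results: Dict[str, dict] = {}
        self.charges_applied = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A redelivered request carries the same key: replay the stored
        # result instead of applying the side effect a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]

        self.charges_applied += 1
        result = {"charge_id": self.charges_applied, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result
```

A client that times out and retries with the same key gets the original result back, so at-least-once delivery produces a single charge.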
Delivery semantics
At-most-once, at-least-once, exactly-once. Why true exactly-once delivery is impossible over a lossy network, and how producers, brokers, and consumers combine dedup, idempotent writes, and transactions to fake it convincingly.
Quorums & majorities
Why ⌊N/2⌋+1, why Dynamo-style W+R>N, why flexible Paxos lets you pick. The math behind how many nodes have to say yes before you can make progress.
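The quorum arithmetic is small enough to check by brute force. The sketch below uses hypothetical helper names: it computes the majority size and confirms, by enumerating every write set and read set, that all pairs intersect exactly when W+R>N.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest group size such that any two groups of that size overlap."""
    return n // 2 + 1


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Dynamo-style condition: every read quorum meets every write quorum."""
    return w + r > n


def every_pair_intersects(n: int, w: int, r: int) -> bool:
    """Brute-force check over all W-sized and R-sized subsets of n nodes."""
    nodes = range(n)
    return all(
        set(ws) & set(rs)
        for ws in combinations(nodes, w)
        for rs in combinations(nodes, r)
    )
```

For example, with N=3 the pair W=2, R=2 always overlaps, while W=1, R=2 leaves room for a read that misses the latest write.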
Leases & fencing
Locks across the network are a lie unless you fence them. Why leases beat distributed locks, and how monotonic fencing tokens save you from a client that resumes after a stop-the-world GC pause and still believes it holds the lease.
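A minimal sketch of the fencing check, assuming the lock service hands out a strictly increasing integer token with each lease and the storage layer remembers the highest token it has seen. `FencedStore` is a hypothetical name for this example.

```python
from typing import Dict


class FencedStore:
    """Storage that rejects writes carrying a token older than one already seen."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.data: Dict[str, str] = {}

    def write(self, token: int, key: str, value: str) -> bool:
        # A token below the high-water mark means a newer lease holder has
        # already written; the caller's lease must have expired.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.data[key] = value
        return True
```

If client 1 takes token 33 and then stalls, client 2 can take token 34 and write. When client 1 wakes up and writes with token 33, the store rejects it rather than letting it clobber the newer value.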
Leader election
How a cluster picks one node to make decisions. Bully, Raft elections, ZooKeeper sequential nodes, etcd leases. Split-brain, fencing tokens, and the subtleties of "I think I am still the leader".
Failure detectors
Heartbeats are the cheapest answer and rarely the right one. Chandra-Toueg classes, phi-accrual (Cassandra), SWIM (Serf, Consul), and the difference between "I think you are dead" and "the system has agreed you are dead".
Gossip protocols
Membership, failure detection, anti-entropy, SWIM. How clusters of thousands learn who is alive without a coordinator and without bringing the network down.
Anti-entropy & read repair
How leaderless stores heal divergent replicas. Read repair on the request path, hinted handoff while a node is down, and Merkle-tree anti-entropy that compares whole key ranges without shipping all the data.
CRDTs
Conflict-free replicated data types. The algebra of merge — join-semilattices where the merge is associative, commutative, and idempotent — so replicas converge without coordination. State-based vs op-based, counters, sets, sequences, and what collaborative editors run.
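The merge algebra is easiest to see on the simplest state-based CRDT, a grow-only counter. The `GCounter` class below is an illustrative sketch, not any library's API: each replica increments only its own slot, and merge takes the per-slot maximum, which is associative, commutative, and idempotent.

```python
from typing import Dict, Optional


class GCounter:
    """State-based grow-only counter: one monotonically increasing slot per replica."""

    def __init__(self, counts: Optional[Dict[str, int]] = None) -> None:
        self.counts: Dict[str, int] = dict(counts or {})

    def increment(self, replica: str, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[replica] = self.counts.get(replica, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Join of the semilattice: per-replica maximum.
        keys = set(self.counts) | set(other.counts)
        return GCounter(
            {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
        )
```

Replicas can exchange state in any order, any number of times, and still arrive at the same total.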
Sharding & partitioning
Range, hash, consistent-hash, rendezvous. The four ways to split data across machines, the trade-offs each makes, and why resharding is the hardest operations problem in this list.
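Of the four schemes, rendezvous (highest-random-weight) hashing is the shortest to write down. The sketch below assumes SHA-256 as the scoring hash and string node names; `score` and `owner` are hypothetical helpers. Each key goes to the node with the highest score for that key, so removing a node only relocates the keys that node owned.

```python
import hashlib
from typing import Sequence


def score(node: str, item_key: str) -> int:
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{item_key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(nodes: Sequence[str], item_key: str) -> str:
    """Pick the node with the highest weight for this key."""
    if not nodes:
        raise ValueError("need at least one node")
    return max(nodes, key=lambda node: score(node, item_key))
```

The cost is one hash per node per lookup, which is fine for small clusters and is the usual reason larger ones prefer a consistent-hash ring.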
Service discovery
How a service finds the live instances of another. Client-side vs server-side discovery, the registry (Consul, etcd, ZooKeeper, Eureka), health checks, DNS vs a control plane, and why stale registries cause more outages than dead nodes.
Two-phase commit & sagas
Atomic commit across services. The blocking problem in 2PC, the compensation problem in sagas, and why every microservices system eventually picks one. XA, the outbox pattern, choreography vs orchestration.
Distributed snapshots
A consistent cut across a running distributed system. Chandy-Lamport with marker tokens, Lai-Yang, Mattern, and the asynchronous-barrier variant Flink uses to checkpoint stateful streams without stopping the world.
Streaming systems
Why streaming is not just fast batch. Event time vs processing time, watermarks, the four windows, exactly-once as state semantics, and what Flink, Kafka Streams, and Spark Structured Streaming each chose when they sat down to ship.
Back-pressure, retries, hedging, deadlines
The four reliability primitives as a unit. Back-pressure for capacity, retries for transient failure, hedging for tail latency, deadlines for budget. The topic that most often sinks L5 system-design interviews, usually because candidates handle one of the four and forget the other three.
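To show how two of these primitives interact, here is a hedged sketch of retries bounded by a deadline. The names `TransientError`, `DeadlineExceeded`, and `call_with_retries` are hypothetical, and the backoff policy (exponential with full jitter) is one common choice among several. The clock, sleep, and random source are injectable so the behavior can be tested without real waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Raised by an operation that may succeed if tried again."""


class DeadlineExceeded(Exception):
    """Raised when the retry budget would outlive the caller's deadline."""


def call_with_retries(
    op: Callable[[], T],
    deadline_s: float,
    base_delay_s: float = 0.05,
    max_delay_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    start = clock()
    attempt = 0
    while True:
        try:
            return op()
        except TransientError as err:
            attempt += 1
            remaining = deadline_s - (clock() - start)
            # Exponential backoff, capped, with full jitter.
            backoff = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            delay = rng() * backoff
            # Give up rather than sleep past the deadline.
            if delay >= remaining:
                raise DeadlineExceeded(f"gave up after {attempt} attempts") from err
            sleep(delay)
```

The deadline is what keeps retries from amplifying load during an outage: once the budget is spent, the caller fails fast instead of queueing more work behind a struggling dependency.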