Raft leader election.
Five nodes, one job: agree on a single leader who handles all writes. When the leader dies, agree on a new one. Watch the algorithm pick a leader, replicate a command, lose the leader to a crash, and re-elect — all in 8 steps.
Cluster just booted, or the old leader just died. Every node is a follower in term 1. Each runs an election timeout — randomized between 150 ms and 300 ms. Whoever's timeout fires first will become a candidate.
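The timeout race can be sketched in a few lines of Python. This is a minimal illustration, not any library's implementation: the function names, the node labels, and the uniform distribution are assumptions made for the example, and only the 150 ms to 300 ms window comes from the text above.

```python
import random

# Window taken from the text above; the constant names are illustrative.
ELECTION_TIMEOUT_MIN_MS = 150
ELECTION_TIMEOUT_MAX_MS = 300


def draw_election_timeout(rng: random.Random) -> float:
    """Pick one timeout (in ms) uniformly from the allowed window."""
    return rng.uniform(ELECTION_TIMEOUT_MIN_MS, ELECTION_TIMEOUT_MAX_MS)


def first_candidate(node_ids, rng):
    """Give every node its own random timeout and return the node whose
    timer fires first, together with that timeout."""
    timeouts = {node_id: draw_election_timeout(rng) for node_id in node_ids}
    winner = min(timeouts, key=timeouts.get)
    return winner, timeouts[winner]


if __name__ == "__main__":
    rng = random.Random()  # hypothetical five-node cluster labelled A to E
    node, timeout_ms = first_candidate(["A", "B", "C", "D", "E"], rng)
    print(f"{node} times out first after {timeout_ms:.0f} ms and becomes a candidate")
```

Randomizing the timeouts makes it unlikely that two nodes time out at the same instant, so in most rounds a single candidate gets a head start on collecting votes.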
- Raft
- A consensus algorithm designed to be understandable. Picks a leader, replicates a log of operations through the leader, handles failures via re-election. Used by etcd, Consul, CockroachDB, TiKV.
- Term
- A logical clock that increments on each election attempt. Every message carries its sender's term, so a node can always tell a stale message from a current one and reject it. That is what stops a deposed leader's commands from corrupting a newer leader's.
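As a rough sketch of how terms act as a logical clock, the snippet below models a node that bumps its term when it starts an election and compares the term on every incoming message against its own. The `Node` class and its method names are hypothetical, chosen for this example. It deliberately leaves out one detail: a candidate that hears from a leader in its own term also steps down.

```python
from dataclasses import dataclass


@dataclass
class Node:
    current_term: int = 1
    role: str = "follower"  # "follower", "candidate", or "leader"

    def start_election(self) -> None:
        """An election attempt always moves the node to a new, higher term."""
        self.current_term += 1
        self.role = "candidate"

    def observe_term(self, message_term: int) -> bool:
        """Return True if a message with this term should be processed,
        False if it comes from an older term and must be rejected."""
        if message_term < self.current_term:
            return False  # stale sender, e.g. a deposed leader
        if message_term > self.current_term:
            # Someone is ahead of us: adopt their term and step down.
            self.current_term = message_term
            self.role = "follower"
        return True
```

A leader that was partitioned away during term 3 and comes back while the cluster is in term 5 has every message it sends rejected, and it steps down as soon as it sees a reply carrying the newer term.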
Why Raft beats Paxos in practice
Paxos is older, more general, and notoriously hard to implement correctly: the original paper left out details that practical systems need, and follow-up papers spent years filling them in. Raft decomposes the problem into three subproblems (leader election, log replication, safety) and pins down every state transition. The result is dozens of independent implementations across many languages, and a much smaller surface for bugs. For engineers building real systems, that understandability counts for more than mathematical elegance.
The replication invariants
Raft guarantees: (1) at most one leader per term; (2) a leader never overwrites or deletes entries in its own log, it only appends; (3) followers never accept entries from a stale leader (older term); (4) a committed entry stays in the log of all future leaders forever. The leader-completeness property (a new leader must have all committed entries) is enforced at election time: a voter grants its vote only if the candidate's log is at least as up-to-date as the voter's own.
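The voting rule behind leader completeness can be sketched as follows. The comparison (higher last log term wins, and on equal terms the longer log wins) follows the Raft paper's definition of "up-to-date". The `Voter` class, the function names, and the parameter names are assumptions made for this example, and persistence and networking are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Voter:
    current_term: int
    last_log_term: int   # term of the voter's last log entry
    last_log_index: int  # index of the voter's last log entry
    voted_for: Optional[str] = None


def log_is_up_to_date(cand_last_term: int, cand_last_index: int, voter: Voter) -> bool:
    """The candidate's log is at least as up-to-date as the voter's if its
    last entry has a higher term, or the same term and an index that is
    not smaller."""
    if cand_last_term != voter.last_log_term:
        return cand_last_term > voter.last_log_term
    return cand_last_index >= voter.last_log_index


def handle_request_vote(voter: Voter, candidate_id: str, term: int,
                        last_log_term: int, last_log_index: int) -> bool:
    """Return True if the voter grants its vote to the candidate."""
    if term < voter.current_term:
        return False  # candidate is from a stale term
    if term > voter.current_term:
        voter.current_term = term  # new term, so any earlier vote is void
        voter.voted_for = None
    if voter.voted_for not in (None, candidate_id):
        return False  # at most one vote per term
    if not log_is_up_to_date(last_log_term, last_log_index, voter):
        return False  # candidate might be missing committed entries
    voter.voted_for = candidate_id
    return True
```

Because a committed entry is stored on a majority of nodes and a candidate needs votes from a majority, at least one voter in any winning coalition holds every committed entry, and that voter refuses any candidate whose log is behind its own.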
Where you've been using Raft already
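etcd (Kubernetes' control plane storage), Consul (service discovery), CockroachDB (distributed SQL), TiKV (the distributed key-value store underneath TiDB), HashiCorp Vault (integrated storage), MongoDB's replica sets (Raft-inspired), Apache Kafka (KRaft mode, production-ready since 3.3, replacing ZooKeeper). When "leader election" appears in a modern system's docs, there is a good chance Raft or a close relative is underneath, though not always: ZooKeeper, for example, uses its own protocol, Zab. The algorithm has become a load-bearing pillar of the modern distributed-systems stack.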
Consensus internals →
Raft's safety arguments, log compaction, configuration changes (joint consensus), Multi-Paxos vs Raft, real implementations' tweaks (etcd's pre-vote, learner nodes).