Distributed Systems

Consistent hashing

Hash both keys and nodes onto a ring; nodes own arcs.


In plain terms

Karger et al, MIT 1997. Adding/removing a node only moves K/N keys, not the whole keyspace. Powers every distributed cache and CDN.

Origin

David Karger, Eric Lehman, Tom Leighton, Matt Levine, Daniel Lewin, and Rina Panigrahy, MIT 1997. The paper "Consistent Hashing and Random Trees" was written for a different problem (web caching) but the idea ended up under every distributed cache, every CDN, and every Dynamo-style store. Lewin co-founded Akamai a year later.

Where it shows up in production
  • Akamai CDN The original commercial application — assigns clients to edge caches.
  • Amazon Dynamo / Cassandra Ring with virtual nodes; each replica owns a contiguous range of the ring.
  • Memcached client libraries Maps cache keys to servers so removing one server only invalidates 1/N of keys.
On Semicolony
Sources & further reading
Found this useful?