Bloom filter
Probabilistic set: "definitely not" or "probably yes" in tiny space.
Origin
Burton Howard Bloom, MIT 1970. The original paper described it as a way to do spell-checking in 8 KB of RAM — a problem that sounds quaint until you remember the same trick now sits in front of every LSM-tree to decide whether to read from disk.
Where it shows up in production
- Cassandra & RocksDB A per-SSTable Bloom filter answers "could this key be here?" without touching disk.
- Chrome Safe Browsing A Bloom filter of malware URLs ships in the browser; full check only on a "maybe."
- Bitcoin SPV Light wallets send a Bloom filter to full nodes so they only see relevant transactions.
On Semicolony
Sources & further reading
Found this useful?