Storage & Databases

Wide-column store

Schema is dynamic per row; many sparse columns.


In plain terms

Bigtable, HBase, Cassandra, ScyllaDB. Optimised for time-series and event data where most rows have most columns null.

Origin

Bigtable (Chang et al., OSDI 2006) defined the category. Cassandra (2008) and HBase (2008) were the first OSS implementations.

Where it shows up in production
  • Apache Cassandra Default DB at Apple, Discord, Netflix for time-series-shaped workloads.
  • Bigtable / HBase Same shape, Google or Hadoop ecosystem respectively.
  • ScyllaDB C++ rewrite of Cassandra; same column-family model, 5-10× the per-node throughput.
On Semicolony
Sources & further reading
Found this useful?