Decision tree
Internals / Decision tree

Choosing a database

The most-asked question in any system-design round, and the one most candidates answer with "I'd use Postgres" without showing why. A decision tree that gets to a named system in five questions, plus an honest look at the alternatives, because in real life the wrong choice is a multi-quarter rewrite, not a five-second regret.


The first question — what's the access pattern?

Everything else hangs off this. Five access patterns that map cleanly onto storage families:

OLTP — when the answer is "Postgres"

Online transaction processing. Many small reads and writes, foreign keys, multi-statement transactions, joins. The default for almost any product database. Default to Postgres unless you can defend something else.

ChoiceWhenTrade-off
PostgresDefault. Up to ~50K writes/s on a single primary; vertical scale to ~10 TB without sharding.Single-region; vertical scaling has a ceiling.
MySQLExisting investment, MySQL tooling, or specific replication needs.Less rich SQL than Postgres; isolation defaults are weaker.
CockroachDB / YugabyteDBNeed horizontal scale + transactions + multi-region and Postgres-compatible SQL.~5× the cost of Postgres; harder to operate.
Spanner / AlloyDBYou're on GCP and need globally consistent transactions.Cloud lock-in; SQL surface less rich than Postgres.
RDS AuroraYou're on AWS and want Postgres-or-MySQL with managed HA.Same Postgres, more $; storage is faster than self-managed.
Don't reach for distributed SQL too early. Postgres handles 99% of OLTP workloads on a single node. The interesting threshold isn't transactions per second; it's "do we need cross-region writes". Until you cross that line, single-primary Postgres + read replicas is faster, cheaper, and easier to debug.

KV — when the answer is "DynamoDB" (or Cassandra, or Redis)

Pure key-value access: get, put, delete by key. No joins, no range scans, no transactions across keys. The shape that scales horizontally with the least fuss.

ChoiceWhenTrade-off
DynamoDBYou're on AWS, want managed, want serverless billing. The default cloud KV.Per-item cost adds up; transactions exist but are limited.
Cassandra / ScyllaDBSelf-host, multi-region, very high write throughput (millions/s).Operational complexity; tombstone GC; eventual consistency.
RedisIn-memory; sub-millisecond reads; ephemeral or cache-shaped data.RAM-bound; single-thread per shard means hot keys hurt.
FoundationDBNeed ACID transactions on KV at scale. Apple iCloud uses it.Smaller community; the layer model means apps rebuild conventions.
Bigtable / HBaseMassive, range-keyed (row-major) workloads on GCP / Hadoop.Old-school; not a default for new systems.

OLAP — when the answer is "BigQuery" (or Snowflake, or ClickHouse)

Analytical workloads: large scans, aggregations, joins over billions of rows, long-running queries. Columnar storage, parallel execution, and decoupled compute / storage.

ChoiceWhenTrade-off
BigQueryYou're on GCP and want zero-ops analytics. Pay per query.Per-query pricing surprises; cost discipline matters.
SnowflakeCloud-agnostic warehouse; multi-tenant compute; rich SQL.Expensive at high scale; vendor-managed.
ClickHouseYou want millisecond-latency aggregations, self-host, you're OK doing the operational work.SQL dialect is its own thing; less mature than the warehouses.
DuckDBLocal / embedded analytics, single-machine. Great for testing and notebooks.Single-node; not a production warehouse.
Druid / PinotReal-time analytics — sub-second queries on streaming data, faceted dashboards.Heavy operational footprint; specialised tool.
RedshiftYou're on AWS and want a warehouse with deep AWS integration.The "old AWS" tool; new projects usually pick Snowflake or BigQuery.
OLTP and OLAP almost never live in the same system. When the interview asks "where does the analytics dashboard read from?", the answer is "a separate OLAP store, fed by CDC from the OLTP database". Trying to query Postgres for a year of revenue at the row level is the failure mode.

Time-series — when the answer is "Prometheus" or "TimescaleDB"

A narrow but important case: each row keyed by (timestamp, label-set, value). Heavy writes, append-only, range reads dominate.

ChoiceWhen
PrometheusOperational metrics. Pull model, label-set indexed. Default for cloud-native infra.
VictoriaMetricsPrometheus-compatible at much higher scale. Drop-in for big fleets.
TimescaleDBPostgres extension. You want SQL plus time-series compression.
InfluxDBIoT, fewer host metrics, more event payloads. Flux is its own query language.
QuestDBNiche but fast. SQL-compatible, columnar, time-series-first.

Search — when the answer is "Elasticsearch" or a vector DB

Inverted-index lookups (full-text), faceted search, autocomplete, and the new class of vector / embedding retrieval. A different shape from any of the above: the index lives in RAM, and the query is a ranking function.

ChoiceWhen
Elasticsearch / OpenSearchFull-text search, log analytics. The default for both.
VespaRanked search at scale with structured features. Yahoo / Verizon stack.
Meilisearch / TypesenseEmbedded-style search for small-to-medium SaaS.
pgvectorVector search inside Postgres. Default for any team already on Postgres.
Pinecone / Weaviate / Milvus / QdrantSpecialised vector DBs. Pick when scale or features beyond pgvector are required.
Don't use Elasticsearch as a primary store. Use it as a denormalised view fed by your OLTP. The most common production failure: ES is the only place some piece of data lives, ES has a bug, the data is gone.

Graph — when the answer is "actually, you don't need a graph DB"

The contrarian answer that interviewers like. Most "graph" use cases (friend lists, follower graphs, permission inheritance) are recursive joins solved cleanly with adjacency tables in Postgres. A real graph database earns its keep only when:

  • Queries traverse many hops (5+) and the depth is variable.
  • The data has rich edge properties and edge types matter for the query.
  • The product needs Cypher / Gremlin / SPARQL ergonomics.
ChoiceWhen
Neo4jThe default. Cypher is the most-readable graph DSL.
TigerGraphMassive scale; commercial.
dgraphOpen-source; GraphQL-as-a-database.
JanusGraphOlder; backed by Cassandra/HBase. Last resort for "we're already on Cassandra".
Adjacency table in PostgresThe right answer for ~80% of graph-shaped problems. WITH RECURSIVE handles deep traversals; CTEs are well-optimised.

The decision tree, in five questions

Walk these in order. The first that doesn't fit narrows the choice.

  1. Is the workload analytical (large scans, aggregations) or transactional (small reads/writes)? Analytical → OLAP family. Transactional → next question.
  2. Do you need joins and multi-statement transactions? Yes → OLTP (Postgres / MySQL). No → next question.
  3. Is the data shape pure key-value, or do you need range scans / secondary indexes? Pure KV → DynamoDB / Cassandra / Redis. Range / indexes → Postgres anyway, or column-family stores like Bigtable.
  4. Is search (full-text, ranking, vector) the dominant access pattern? Yes → Elasticsearch / pgvector / Pinecone. No → previous answer holds.
  5. Do you need cross-region transactions or strict global ordering? Yes → Spanner / CockroachDB / YugabyteDB. No → single-region Postgres.

And for time-series specifically: heavy time-keyed writes + range reads = the time-series family. Don't try to retrofit Postgres for hundreds of millions of metrics-per-second; you'll lose.

Polyglot persistence — and the trap

Real systems use multiple databases. The OLTP store, the OLAP warehouse, the cache, the search index, the time-series metrics. The pattern is:

  1. OLTP is the source of truth. Everything else is a derived view.
  2. CDC (change-data-capture) feeds the derived stores. Debezium for Postgres / MySQL → Kafka → consumers populate ES, ClickHouse, Redis.
  3. One write, many reads. Application writes only to the OLTP. Reads go wherever the access pattern is best served.
  4. Reconciliation jobs catch drift. The derived stores eventually fall out of sync with reality; nightly or weekly jobs validate and repair.
The trap is having two systems both claim to be the source of truth. "OLTP and Elasticsearch both have the user record" is a recipe for two answers to "what's the user's email" and the inability to ever resolve which is right. Pick one source; everything else is a derived projection.

The rewrite cost — when the choice is wrong

Picking the wrong database is one of the most expensive mistakes a system can make. The cost isn't the database. It's the rewrite to escape it.

Wrong choiceSymptomRewrite shape
Postgres for OLAPDashboard queries take minutes; OLTP suffers.Add CDC + warehouse; rewrite reporting layer. ~1 quarter.
MongoDB for relational dataJoins via app-side aggregation; consistency bugs.Migrate to Postgres; rewrite all read paths. ~2 quarters.
Cassandra for transactionsLost writes; consistency anomalies.Migrate to Postgres or CockroachDB; rewrite the consistency-sensitive paths. ~2 quarters.
Single-region Postgres for global productCross-continent writes are slow; failover loses minutes.Spanner / CockroachDB migration. ~3+ quarters.
Elasticsearch as primaryIndex corruption loses data; no schema migration story.Move source-of-truth to Postgres + ES as derived. ~1 quarter.
Distributed SQL too early2× the cost; harder to debug; slower than single-node Postgres for current load.Often "live with it" — the cost of changing back is higher than the cost of running it.

The interview answer that lands

In a system-design round, "what database would you use" is the cue. The candidate who passes:

  1. States the access pattern explicitly. "This is read-heavy OLTP with occasional analytical queries."
  2. Picks a default and defends it. "I'd start with Postgres because I have ≤ 50K writes/s, Postgres handles that on a single primary, and the team knows it."
  3. Names what would change the answer. "If write throughput crossed 100K/s I'd shard with Citus or move to CockroachDB. If we needed cross-region, Spanner."
  4. Explains the analytics path separately. "Analytics queries go to a warehouse fed by CDC — BigQuery or ClickHouse depending on cloud."

And the closing thought: "The default is Postgres until I have a reason it doesn't work. The interesting question isn't 'which database' — it's 'which constraint forces me off Postgres'." That sentence is the senior framing.

Further reading

  • Kleppmann — Designing Data-Intensive Applications, Chapters 3–4 & 7. The textbook on storage engines and isolation. Read these chapters before any design round.
  • Petrov — Database Internals. Pairs with DDIA; goes deeper on B-trees, LSM, and replication primitives.
  • Helland — "Life Beyond Distributed Transactions" (2007). The canonical "stop trying to do this" essay. Aged extremely well.
  • Helland — "Immutability Changes Everything" (CIDR 2015). The CDC + derived-store architecture, articulated.
  • Spanner paper (Corbett et al, 2012). The reference for "yes you can do globally consistent transactions; here's the cost".
  • Dynamo paper (DeCandia et al, 2007). The reference for the KV family.
  • Adjacent: Databases study path. The full curriculum.
  • Adjacent: Distributed KV store. The KV-family system design walkthrough.
  • Adjacent: Replication. The replication-shape choice that's downstream of database choice.
Found this useful?