Managed databases.

Once you've decided to put your data on someone else's hardware, the next decision is which shape of database. Managed Postgres covers most needs. A DynamoDB-shaped key-value store covers most of the rest. Specialised engines (graph, time-series, vector) show up in narrower spots. The boring answer is usually the right one: pick managed Postgres unless you have a clear reason not to.

1 · The shapes

Relational (SQL). Tables, rows, joins, ACID transactions. The right default for anything with structured relationships. Postgres or MySQL underneath, almost always.
Key-value. Get / put by key. Massive scale, low latency, no joins. Sessions, rate limits, hot lookups.
Document. JSON-ish documents indexed by collection. Flexible schema, OK for nested data, poor for joins.
Wide-column. Sparse, partitioned tables (think Cassandra-shape). Very high write throughput; querying limited to the partition key plus secondary indexes you defined up front.
Graph. Nodes and edges, queries that traverse relationships. Niche: fraud detection, recommendations, knowledge graphs.
Time-series. Optimised for append-only time-indexed data. Metrics, IoT telemetry, financial ticks.
Vector. Embeddings + nearest-neighbour search. Newer category; pretty much every database now claims to do this.

2 · The AWS canonical version

Shape	AWS service	Notes
Relational (Postgres / MySQL)	RDS	Managed engine on EC2 underneath. Patches, backups, Multi-AZ failover. The boring, safe default.
Relational, cloud-native	Aurora (Postgres or MySQL compatible)	AWS-rewritten storage layer. AWS claims up to 5× the throughput of stock MySQL and up to 3× that of stock Postgres; more expensive than RDS, better failover. A common default for new builds.
Relational, serverless	Aurora Serverless v2	Scales capacity in fine-grained increments and bills per second. Good for spiky/dev workloads; not always cheaper than provisioned.
Key-value / document	DynamoDB	Fully managed, single-digit-ms latency at any scale, pay per request or provisioned. The right pick for the "I need a fast hash table at planet scale" problem.
Document (Mongo API)	DocumentDB	Mongo-compatible, AWS-managed. Use it if you want the Mongo programming model without running Mongo.
Wide-column	Keyspaces (Cassandra API)	Cassandra-compatible, serverless. Replaces self-managed Cassandra for the same workloads.
Search	OpenSearch	Fork of Elasticsearch. Logs, full-text search, dashboards.
Cache	ElastiCache (Redis / Memcached)	The fast in-front-of-DB layer. Redis for everything serious, Memcached for the rare cases where you only need a simple, non-persistent in-memory cache.
Time-series	Timestream	Append-only, time-partitioned. Mostly used in IoT pipelines.
Graph	Neptune	Property graph + RDF. Niche.
Vector / embeddings	pgvector on RDS / Aurora Postgres, OpenSearch k-NN, plus standalone (Pinecone / Weaviate)	Pick the one your existing DB already supports unless you have a serious vector workload.
Analytics	Redshift, Athena (serverless on S3)	Redshift for warehouse, Athena when "warehouse" is overkill.

3 · GCP and Azure equivalents

Shape	AWS	GCP	Azure
Managed Postgres / MySQL	RDS / Aurora	Cloud SQL / AlloyDB (Aurora-shape)	Azure DB for PostgreSQL / MySQL
Globally consistent SQL	Aurora Global / Aurora DSQL	Spanner	Cosmos DB (SQL API) with strong consistency
Key-value / doc, single-digit-ms	DynamoDB	Firestore (in Datastore mode) / Bigtable	Cosmos DB
Document (Mongo)	DocumentDB	Firestore (Native mode) / MongoDB Atlas (third-party)	Cosmos DB (Mongo API)
Wide-column	Keyspaces	Bigtable	Cosmos DB (Cassandra API)
Search	OpenSearch	Elasticsearch (3rd party) / Cloud Search	Azure AI Search
Cache	ElastiCache	Memorystore (Redis / Memcached)	Azure Cache for Redis
Warehouse	Redshift	BigQuery	Synapse Analytics / Fabric
Time-series	Timestream	Bigtable + tooling, or InfluxDB on GCE	Azure Data Explorer (ADX)
Graph	Neptune	(No first-party; use Neo4j on GKE)	Cosmos DB (Gremlin API)

Spanner and BigQuery are the GCP standouts. Spanner offers externally consistent (linearisable) transactions across regions as a managed service, and its first major internal customer was Google's advertising backend; Aurora DSQL and CockroachDB now target the same space. BigQuery is arguably the most ergonomic data warehouse on the market. Both are reasons to pick GCP for a specific workload even in an AWS-default shop.

4 · How to pick

Does the data have relationships you'll want to query (joins)? Managed Postgres. Almost always Aurora-shape for new builds.
Is the access pattern a key lookup at huge scale with sub-10ms P99? DynamoDB / Firestore / Cosmos. Plan your access patterns up front; you can't add ad-hoc queries later without a redesign.
Do you need ACID transactions across globally-distributed regions? Spanner, or Aurora DSQL (AWS's newer entry in the same space). Self-managed CockroachDB if you need multi-cloud.
Is it append-heavy time-indexed data? Timestream, ADX, or Postgres with TimescaleDB extension.
Is it search-shaped (full text, faceting, log analytics)? OpenSearch / Azure AI Search.
Is it a warehouse query (large scans, OLAP)? Redshift / BigQuery / Snowflake. Don't run OLAP on your transactional DB past a certain size.
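
The questions above can be read as a small decision function. The sketch below is only an illustration: the `Workload` fields, the order in which the checks run, and the returned labels are assumptions made for this example, not a rule from any vendor.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload; field names are illustrative."""
    needs_joins: bool = False
    key_lookup_at_scale: bool = False
    global_acid: bool = False
    time_series: bool = False
    search: bool = False
    olap: bool = False


def pick_shape(w: Workload) -> str:
    # Specialised needs are checked first; each branch mirrors one question
    # in the list. The ordering itself is an assumption of this sketch.
    if w.global_acid:
        return "globally consistent SQL (Spanner / Aurora DSQL / CockroachDB)"
    if w.olap:
        return "warehouse (Redshift / BigQuery / Snowflake)"
    if w.search:
        return "search (OpenSearch / Azure AI Search)"
    if w.time_series:
        return "time-series (Timestream / ADX / TimescaleDB)"
    if w.key_lookup_at_scale and not w.needs_joins:
        return "key-value (DynamoDB / Firestore / Cosmos DB)"
    # Everything else falls through to the boring default.
    return "managed Postgres"
```

Anything that does not trip a specialised branch falls through to managed Postgres, which is the point of the default.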

The decision worth defending. "Pick managed Postgres unless you have a specific reason not to." Postgres is the default at every scale up to billions of rows; it handles JSON, full-text, geospatial, and vector workloads via extensions. The interesting question in a design interview is which specific workload would not fit Postgres, and why.

5 · What breaks

RDS storage runs out. Disk fills up over a weekend; the instance goes into storage-full state; nobody can write. Mitigation: enable storage auto-scaling. (Aurora storage is decoupled from the instance and grows automatically with the data, so it doesn't have this problem.)
DynamoDB hot partition. If your partition key isn't well-distributed (e.g. all writes go to user_123), you'll see throttling. The fix is a better partition key, not more capacity.
Aurora connection limit. Aurora limits connections by instance size. A poorly-tuned connection pool (or no pool, looking at you Lambda) hits the ceiling first. Put RDS Proxy or PgBouncer in between.
DynamoDB scan. The escape hatch for "I forgot to design my access pattern." Cheap in dev, ruinous in production at scale. Real queries hit indexes; scans don't.
Aurora reader lag. Read replicas are eventually consistent (lag is typically well under 100 ms, but it spikes under load). Read-your-own-writes from a replica is the most-debugged bug in cloud-Postgres setups. Send reads that follow a recent write to the primary via the cluster (writer) endpoint.
BigQuery / Redshift cost spike. A single bad query can scan terabytes. Mitigations: BigQuery slot reservations, Redshift workload management, query review at code-review time.
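
Two of the failures above have fixes small enough to sketch. The first spreads writes for one hot logical key across several physical partition keys (write sharding); the second pins a session's reads to the writer for a short window after it writes. Everything here is hypothetical: the function and class names, the `#` separator, the shard count, and the one-second pin window are assumptions for illustration, and the router returns labels rather than real endpoints.

```python
import hashlib
import time


def sharded_partition_key(base_key: str, item_id: str, shards: int = 10) -> str:
    """Map one hot logical key onto `shards` physical partition keys.

    The suffix is derived from item_id, so a given item always lands on
    the same shard and can be read back without a fan-out.
    """
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    suffix = int.from_bytes(digest[:4], "big") % shards
    return f"{base_key}#{suffix}"


def all_shard_keys(base_key: str, shards: int = 10) -> list[str]:
    """Every physical key to query when reading the whole logical key."""
    return [f"{base_key}#{i}" for i in range(shards)]


class ReadRouter:
    """Route reads to the writer for a short window after a session writes."""

    def __init__(self, pin_seconds: float = 1.0, clock=time.monotonic):
        self._pin_seconds = pin_seconds
        self._clock = clock          # injectable so the window can be tested
        self._last_write = {}        # session id -> time of last write

    def record_write(self, session_id: str) -> None:
        self._last_write[session_id] = self._clock()

    def endpoint_for_read(self, session_id: str) -> str:
        last = self._last_write.get(session_id)
        if last is not None and self._clock() - last < self._pin_seconds:
            return "writer"          # replica may not have the write yet
        return "reader"
```

The trade-off with write sharding is on the read side: reading everything under `user_123` now means querying each key from `all_shard_keys` and merging the results. The pin window should be chosen with the observed replica lag in mind, including its spikes.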

6 · Cost note

The database is often the biggest line on the cloud bill after compute. Three things to watch:

RDS/Aurora reserved instances. Same 30–60% savings story as compute. Steady-state DB instances should be reserved, full stop.
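DynamoDB on-demand vs provisioned. On-demand is convenient but can cost several times more per request than well-tuned provisioned capacity; the exact multiple depends on current pricing and how fully you use what you provision. Tables with predictable traffic should be provisioned with auto-scaling.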
Snapshots, backups, point-in-time recovery. All cost money. PITR especially is a per-GB-month charge that adds up on big DBs. Set retention deliberately, not at "default forever."
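
The on-demand versus provisioned comparison comes down to utilisation, and the arithmetic is easy to sketch. The function names below are hypothetical and the prices are parameters, not real rates: plug in current figures from the pricing page for your region. The only fixed fact used is that one write capacity unit sustains one standard write per second for items up to 1 KB.

```python
HOURS_PER_MONTH = 730          # common billing approximation
WRITES_PER_WCU_HOUR = 3600     # one write per second for a full hour


def on_demand_monthly(writes_per_month: int, price_per_million_writes: float) -> float:
    """Monthly on-demand cost for a given number of standard writes."""
    return writes_per_month / 1_000_000 * price_per_million_writes


def provisioned_monthly(write_capacity_units: int, price_per_wcu_hour: float) -> float:
    """Monthly cost of holding a fixed number of WCUs, used or not."""
    return write_capacity_units * price_per_wcu_hour * HOURS_PER_MONTH


def breakeven_utilisation(price_per_million_writes: float, price_per_wcu_hour: float) -> float:
    """Fraction of provisioned capacity you must actually use for
    provisioned to cost the same as on-demand. Above this fraction
    provisioned is cheaper; below it on-demand is."""
    on_demand_cost_of_full_hour = WRITES_PER_WCU_HOUR / 1_000_000 * price_per_million_writes
    return price_per_wcu_hour / on_demand_cost_of_full_hour
```

A table whose average utilisation sits comfortably above the break-even fraction is a candidate for provisioned capacity with auto-scaling; a spiky table that idles most of the day usually is not.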
