Isolation levels
Transactions never run alone. They interleave, and the order they interleave in can produce surprises: rows that vanish, totals that don't balance, two doctors going off-call at the same time. An isolation level is a contract about which surprises the database is allowed to show you. Here is the ANSI ladder, what it actually means, and what every major vendor really ships under each name.
Transactions don't run alone —
they interleave.
A database that ran one transaction at a time would be easy to reason about and far too slow. So engines run many transactions at once, interleaving their reads and writes against shared rows. The trouble is that some interleavings produce results no single-threaded run ever could: a row that exists in one query and not the next, an update that silently disappears, a global invariant that quietly breaks.
The textbook ideal is serializable: every concurrent schedule must produce the same result as some serial order of the same transactions. Serializable is expensive, so SQL gives you a knob, the isolation level, to trade correctness for throughput. Picking a level means deciding, ahead of time, which classes of anomaly you're willing to let your application see.
The anomalies, with concrete examples.
There are five anomalies worth knowing by name. Four come from the ANSI standard. The fifth, write skew, was named by Berenson & Gray in 1995 because the original list missed it.
- Dirty read. T1 reads a row T2 has written but not yet committed. If T2 rolls back, T1 used a value that never existed. Example: a reporting job reads a balance of $0 because a transfer is mid-flight, the transfer aborts, and the report is wrong.
- Non-repeatable read. T1 reads row R, T2 updates R and commits, T1 reads R again and sees a different value. The two reads inside one transaction disagree.
- Phantom read. T1 reads a range (e.g.
WHERE age > 30), T2 inserts a new row inside that range and commits, T1 re-runs the same query and sees a row that wasn't there before. A non-repeatable read on a set instead of a single row. - Lost update. T1 and T2 both read x = 10, both compute x + 1, both write 11. One of the two writes is silently overwritten, so a +1 evaporates. Classic on counter columns updated with read-modify-write.
- Write skew. Two transactions each read a set of rows, each writes to a disjoint row, and the writes together break a constraint that neither write would break on its own. The canonical example below uses on-call doctors.
The ANSI SQL-92 ladder.
SQL-92 defines four isolation levels, each one preventing one more class of anomaly than the level below it. You set the level per transaction:
-- ANSI SQL-92 syntax. Most engines support some variant.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Postgres / MySQL: also accepted at session scope
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- ... your statements ...
COMMIT;The intended ladder, as the standard reads it:
| Level | Dirty read | Non-repeatable | Phantom |
|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible |
| Read Committed | Prevented | Possible | Possible |
| Repeatable Read | Prevented | Prevented | Possible |
| Serializable | Prevented | Prevented | Prevented |
What ANSI actually means is ambiguous.
Berenson, Bernstein, Gray, Melton, O'Neil & O'Neil's 1995 paper A Critique of ANSI SQL Isolation Levels showed that the ANSI definitions, taken literally, permit anomalies the standard's authors clearly didn't intend. The phenomena are defined as forbidden read/write sequences, not as the schedules they actually rule out.
Two consequences. First, the same level name on two vendors can prevent different anomalies. Second, snapshot isolation, the dominant implementation technique in MVCC engines, isn't on the ladder at all. By the ANSI text, SI looks like Serializable (it prevents the three named anomalies). By any reasonable definition, it isn't: it permits write skew.
What every major DB actually delivers.
The names on the tin don't map cleanly to the standard. The table below shows what each engine actually does when you ask for a level by name.
| Engine | Default level | What it really is |
|---|---|---|
| Postgres | Read Committed | RC via MVCC. "Repeatable Read" is snapshot isolation. "Serializable" is SSI (Cahill 2008). |
| MySQL / InnoDB | Repeatable Read | Closer to SI than to ANSI RR. Phantoms appear unless you take next-key locks with SELECT ... FOR UPDATE. |
| Oracle | Read Committed | RC via MVCC. "Serializable" is snapshot isolation — has been for 30 years. No true serializable level. |
| SQL Server | Read Committed (lock-based) | Full ANSI ladder via locks, plus optional Snapshot and Read Committed Snapshot. |
| CockroachDB | Serializable | Serializable by default, via its own MVCC + timestamp scheme (not Postgres-style SSI). READ COMMITTED added in v23.2. |
| Spanner | Serializable | Strict serializable — uses TrueTime to give external consistency. |
| FaunaDB | Serializable | Strict serializable by default. |
Three engines, three different things behind the word "Serializable." Worth knowing before your next migration.
Snapshot isolation, in depth.
Snapshot isolation gives each transaction a consistent view of the database as it was the instant the transaction began. Reads never block, never see uncommitted data, and never see writes that happened after the start timestamp. Writes use first-committer-wins: if two transactions modify the same row, the second to commit is aborted. Built on MVCC; cheap, popular, and almost-but-not-quite serializable.
The classic counterexample is the on-call doctors. Constraint: at least one doctor must stay on call. Two doctors, Alice and Bob, are both on call. Both decide to go off call at the same time:
-- Invariant: COUNT(*) WHERE on_call = true MUST be >= 1.
-- Initial state: Alice and Bob are both on_call = true.
-- Transaction A (Alice) -- Transaction B (Bob)
BEGIN; BEGIN;
SELECT COUNT(*) FROM doctors SELECT COUNT(*) FROM doctors
WHERE on_call = true; WHERE on_call = true;
-- returns 2 -- returns 2
UPDATE doctors SET on_call = false UPDATE doctors SET on_call = false
WHERE name = 'Alice'; WHERE name = 'Bob';
COMMIT; COMMIT;
-- Both commit under SI. Both writes touched disjoint rows, so
-- first-committer-wins does not fire. Invariant: violated. No doctor
-- is on call. Under true serializable, one transaction would abort.Serializable Snapshot Isolation (SSI).
Cahill, Röhm & Fekete's 2008 paper "Serializable Isolation for Snapshot Databases" showed how to make SI actually serializable at modest cost. The insight: serializability under SI breaks only when a transaction graph contains two consecutive rw-antidependencies in a cycle: a read by T1 that conflicts with a write by T2, followed by a read by T2 that conflicts with a write by T3.
SSI tracks read and write sets per transaction at commit time, flags transactions that
form a dangerous pattern, and aborts one of them. Postgres shipped this in 9.1 (2011)
as the SERIALIZABLE level. It costs roughly a hash-table of read-locks per
active transaction and an extra commit-time check — no lock contention, no row locks for
readers.
The trade-off is that aborts are now part of the contract. Code that uses Serializable
has to be ready to retry on a 40001 serialization_failure. Most ORMs don't
retry by default; you write the loop yourself.
The performance cost.
On Postgres, switching a write-heavy OLTP workload from Read Committed to Serializable typically costs 10–30% throughput, depending on contention. Cahill's original paper measured ~10% on TPC-C. Real workloads with hot rows can see 2–10× degradation because the abort-and-retry loop eats CPU when contention is high.
Lock-based serializable (SQL Server's default SERIALIZABLE) is usually
worse, since range locks block both readers and writers. MVCC-based SSI is the right baseline
to compare against if you're considering escalation.
WHERE version = ?. Save the strong level for the cases where you
really do encode an invariant across multiple rows.Choosing a level.
A practical rule that holds up:
- Default to Read Committed. It's what Postgres, Oracle, and SQL Server default to, it's fast, and it's enough for most application code.
- Use
SELECT … FOR UPDATEon hot rows where two transactions might race on the same row (counters, inventory, balances). This puts a single row on pessimistic locking without changing the level globally. - Escalate to Serializable when business logic encodes invariants across multiple rows — bank-account-style transfers, on-call rotations, scheduling, double-entry bookkeeping. Wrap the call site in a retry loop for 40001 errors.
- Snapshot isolation is fine for analytics: long-running read transactions that need a consistent point-in-time view of the database.
Common production mistakes.
- Assuming the level name means the same thing across vendors. Migrating from MySQL Repeatable Read to Postgres Repeatable Read changes the semantics. Migrating to Postgres Serializable changes them again.
- Forgetting that SI permits write skew. If your code reads a set of rows and updates a row based on that read, SI can let two such transactions both commit. Check whether the invariant survives concurrent runs, not whether it survives in isolation.
- Long-running transactions under MVCC. A transaction that stays open for
hours holds back vacuum / cleanup, so old row versions pile up. The Postgres bloat
and the InnoDB undo-log explosion are the same disease: a forgotten
BEGINin a worker. - Counting on SERIALIZABLE in Oracle. Oracle's Serializable is snapshot isolation, not real serializability. Code that depends on serializability for correctness will silently break.
- No retry loop around Serializable. Aborts are part of the contract; if your app surfaces 40001 to the user, you've shipped a bug, not a feature.
Further reading.
- Berenson, Bernstein, Gray, Melton, O'Neil & O'Neil (1995) — A Critique of ANSI SQL Isolation Levels — the paper that named write skew and forced snapshot isolation onto the map.
- Semicolony — MVCC — how snapshot isolation actually works in the storage engine. Read this before SSI.
- Ports & Grittner (2012) — Serializable Snapshot Isolation in PostgreSQL — how Postgres implemented SSI, building on Cahill, Röhm & Fekete (2008).
- Bernstein, Hadzilacos & Goodman — Concurrency Control and Recovery in Database Systems — the canonical textbook. Chapters 2–4 are the theory you'll keep coming back to.
- MySQL — InnoDB transaction isolation levels — vendor doc for what InnoDB's Repeatable Read actually does, including next-key locking.
- Postgres — Transaction isolation — short, dense, and clear about the difference between Repeatable Read and SI.
- Jepsen — Consistency models — Kyle Kingsbury's map of every level from read-uncommitted to strict serializable, with the implication arrows drawn.