A critique of ANSI SQL isolation, the levels nobody quite means.

The ANSI SQL standard names four isolation levels — READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE — and defines them by which "phenomena" they prevent. Berenson and Gray took the definitions apart in this 1995 paper and showed they're ambiguous, incomplete, and don't match what databases actually do. The paper also names snapshot isolation for the first time and shows it doesn't fit on the ANSI ladder.

Authors Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O'Neil, Patrick O'Neil

Year 1995

Venue SIGMOD

PDF →

TL;DR

ANSI SQL-92 defined isolation in terms of three phenomena: dirty reads, non-repeatable reads, and phantom reads. The paper shows these definitions are interpreted differently across vendors, that the strict interpretation produces classes nobody implements, and that real-world isolation levels — especially Oracle's and SQL Server's — provide weaker guarantees than the strict reading of ANSI. The paper also introduces a richer phenomena taxonomy and defines snapshot isolation — a level used by Oracle and (later) PostgreSQL's MVCC that is strictly weaker than SERIALIZABLE but stronger than REPEATABLE READ for some phenomena. The paper's definitions are still the canonical vocabulary for talking about isolation, three decades later.

The problem

By the mid-1990s every major database vendor advertised "SERIALIZABLE isolation" but each meant something different by it. Oracle's default isolation level produced strange behaviour under certain workloads. SQL Server's "REPEATABLE READ" allowed phantom reads that the ANSI spec prohibited. Customers writing portable applications were surprised when the same SQL produced different results on different engines.

The ANSI SQL-92 specification was the source of the confusion. It defined the levels in terms of three "phenomena" — types of anomalies the level was supposed to prevent. But the prose was loose. Was a "non-repeatable read" defined in terms of a specific transaction's view, or any transaction's view? Was a "phantom read" only about INSERT, or also DELETE? Vendors interpreted these differently. The paper set out to clean up the vocabulary and show what was actually implementable.

The key idea

The paper introduces a precise history-based definition of isolation. A history is a sequence of read/write operations across multiple transactions. An isolation level is defined by which histories it permits. A phenomenon is a class of forbidden history pattern — e.g., P1 (Dirty Read) is "T1 reads a value written by T2 before T2 commits."

They show that ANSI's prose definitions correspond to broad interpretations of the phenomena that don't match real-world databases. Under the broad interpretation, Oracle's SERIALIZABLE is actually a weaker level (snapshot isolation), SQL Server's REPEATABLE READ is incorrect, and so on.

They then introduce extra phenomena (P0 Dirty Write, P4 Lost Update, P4C Cursor Lost Update, A5A Read Skew, A5B Write Skew) to characterise levels not in the ANSI ladder. The most important addition is snapshot isolation: each transaction reads as of a fixed snapshot, and writes are checked at commit for write-write conflicts. SI prevents lost updates and dirty reads but permits write skew.

Write skew, the SI hole. Imagine two doctors on call. A constraint says at least one must remain on call. Both run "if there's another on-call doctor, take myself off" in parallel. Under SERIALIZABLE, one would run first, see the other still on call, and remove themselves; the second would see no other on call and refuse. Under snapshot isolation, both run against the same snapshot, both see the other on call, both remove themselves — and the constraint is violated. This is write skew. It's the anomaly that distinguishes SI from true serializability, and the reason CockroachDB and FoundationDB took the trouble to implement true SERIALIZABLE rather than the cheaper SI.

Contributions

History-based definitions. Isolation levels defined by which read/write histories they permit, not by informal English. This is now the standard vocabulary for the area.

Naming snapshot isolation. The paper formally defines SI and shows it doesn't fit on the ANSI ladder. SI is what Oracle's "Read Committed" and "Serializable" actually provide, and what Postgres's MVCC engines (default isolation) provide.

Extended phenomena. The P0–P5 phenomena are now standard. Most modern database documentation enumerates which phenomena each level prevents, using terminology from this paper.

The exposure of vendor inconsistencies. The paper effectively named-and-shamed every major vendor for advertising stronger isolation than they delivered. The next 15 years of database research and product work were spent either tightening the definitions or providing explicit "SERIALIZABLE" levels (as opposed to the de-facto SI).

Criticisms and limitations

The paper's phenomena are still not exhaustive. Subsequent work (Adya 1999, Cerone et al 2015) shows there are more anomalies in non-strict-2PL implementations that don't fit the Berenson taxonomy.

Snapshot isolation's practical importance is downplayed in the paper. SI turned out to be much more useful than the paper anticipated — MVCC databases default to it, and the write-skew anomaly is rare in practice for many workloads. The paper treats SI as a curiosity rather than as the dominant isolation level it became.

The history-based framework is hard to reason about for practitioners. Application developers usually want to know "can this anomaly happen in my workload?" not "what histories does this level permit?" — a gap that "A Critique of Snapshot Isolation" (Cahill, Röhm, Fekete, SIGMOD 2008) and several Jepsen reports tried to fill.

Where it shows up today

PostgreSQL — uses MVCC and provides Read Committed (default) and Repeatable Read (which is actually SI). True SERIALIZABLE is opt-in via Serializable Snapshot Isolation (Cahill et al, 2008).

Oracle — its "SERIALIZABLE" is SI. Has been for thirty years. The paper is the reason customers understood the gap.

SQL Server — supports the full ANSI ladder, plus Snapshot Isolation since 2005. Microsoft's docs cite this paper.

CockroachDB, FoundationDB, Spanner — all chose to implement true SERIALIZABLE (strict serializability for the first two, external consistency for the third) precisely to avoid the write-skew hole the paper identified.

Every Jepsen report on isolation levels — Aphyr's tests use Berenson's phenomena as the vocabulary for what each database's "Serializable" actually provides.

Follow-up reading

Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions — Adya · 1999 · MIT PhD thesis. The follow-up that extends Berenson's framework with more anomalies.
Serializable Snapshot Isolation in PostgreSQL — Ports & Grittner · 2012 · VLDB. How Postgres made SI strictly serializable by detecting dangerous rw-antidependency cycles at commit time.
A Critique of Snapshot Isolation — Cahill, Röhm, Fekete · 2008 · SIGMOD. The follow-up paper that proposed SSI and characterised the anomaly that distinguishes SI from serializable.
ARIES — Mohan et al · 1992 · TODS. The recovery layer underneath. Concurrency control and recovery interact heavily. Annotated.
Building on Quicksand — Helland · 2009 · CIDR. Why nothing is exactly-once and how to design around it. Annotated.

More annotated papers

Back to the papers index

Foundational distributed-systems and database papers, read and annotated.

← All papers

Found this useful?