How QR codes work.
Two characters in, 441 modules out. A QR code is one of those things that looks like noise but follows strict rules: finders in three corners, timing lines connecting them, a snake of data bits filling what's left, then a mask flipping about half of them to avoid awkward runs. Watch "HI" become a grid, scratch 12 modules off, and watch Reed-Solomon put them back.
"HI" → Version-1 QR, level L · 19 data + 7 ECC codewords · 21 × 21 = 441 modules · 208 free for data/ECC · scratch 12 → recover
We'll encode the two characters H and I into a Version-1 QR code, the smallest standard size at 21 × 21 modules. The first step is just to know what we're encoding. A real scanner will recover this string from a printed grid, even if part of the grid is dirty or scratched.
- QR code. A 2D barcode invented by Denso Wave in 1994. Version 1 is 21×21 modules; each version adds 4 modules per side, up to 177×177 (Version 40). Stores up to about 3 KB of binary data (2,953 bytes in Version 40 at the lowest error-correction level).
- Module. One cell in the grid. Either black (1) or white (0). The smallest unit of information.
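The module counts quoted above can be checked with a little arithmetic. The sketch below assumes the standard Version-1 layout (three 7×7 finders with one-module separators, two timing strips, two copies of the 15-bit format word, and one always-dark module); the function names are made up for illustration.

```python
def side_length(version: int) -> int:
    """Modules per side for a QR version in the range 1..40."""
    return 17 + 4 * version


def v1_free_modules() -> int:
    """Count the Version-1 modules left over for data and ECC bits."""
    side = side_length(1)            # 21
    total = side * side              # 441
    finders = 3 * 8 * 8              # three 7x7 finders, each with a 1-module separator
    timing = 2 * (side - 16)         # two timing strips running between the finders
    format_info = 2 * 15             # two copies of the 15-bit format word
    dark_module = 1                  # one module that is always dark
    return total - (finders + timing + format_info + dark_module)


print(side_length(1), side_length(40))   # 21 177
print(v1_free_modules())                 # 208
print(v1_free_modules() // 8)            # 26 codewords = 19 data + 7 ECC
```

The 208 free modules divide exactly into 26 eight-bit codewords, which is where the "19 data + 7 ECC" split comes from.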
Why the three corners look weird
The 1:1:3:1:1 ratio of black-white-black-white-black in the finder patterns is deliberately rare. If you scan a single line through the centre of one, at any angle, you get those specific run lengths. The scanner sweeps the image looking for exactly that pattern; when it finds three of them in roughly an L shape, it knows where the code is and which way is up. The reason QR codes work upside down is that those three corners pin the orientation.
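As a rough illustration of that sweep, here is a minimal sketch that checks one row of already-binarised pixels for the 1:1:3:1:1 run pattern. The function names and the `tolerance` parameter are assumptions for this example; a real scanner also checks columns, cross-validates candidates, and works on noisy camera images.

```python
from itertools import groupby


def run_lengths(row):
    """Collapse a row of 0/1 pixels into (value, length) runs."""
    return [(value, len(list(group))) for value, group in groupby(row)]


def find_finder_candidates(row, tolerance=0.5):
    """Return start indices where five runs match dark-light-dark-light-dark at 1:1:3:1:1."""
    runs = run_lengths(row)
    starts, position = [], 0
    for _, length in runs:
        starts.append(position)
        position += length

    hits = []
    for i in range(len(runs) - 4):
        window = runs[i:i + 5]
        if [value for value, _ in window] != [1, 0, 1, 0, 1]:
            continue
        lengths = [length for _, length in window]
        unit = sum(lengths) / 7.0              # estimated width of one module
        expected = [1, 1, 3, 1, 1]
        if all(abs(n - e * unit) <= tolerance * unit
               for n, e in zip(lengths, expected)):
            hits.append(starts[i])
    return hits


row = [0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0]
print(find_finder_candidates(row))                          # [2]
scaled = [pixel for pixel in row for _ in range(3)]         # same row, 3 pixels per module
print(find_finder_candidates(scaled))                       # [6]
```

Because the test is a ratio rather than a fixed width, the same check fires whether a module is one pixel wide or three.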
Why Reed-Solomon and not just a checksum
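A checksum tells you something is wrong; it can't tell you what. Reed-Solomon, with the right amount of redundancy, detects errors, locates them, and then corrects them. That's why a torn corner doesn't kill a QR code: the decoder reads all 208 data and ECC modules, regroups them into 26 eight-bit codewords, notices that the codewords touched by the 12 scratched modules don't agree with the polynomial check, computes where they went wrong, and fixes them. Correction works per codeword, not per module: seven ECC codewords can locate and repair at most ⌊7/2⌋ = 3 bad codewords, so the scratch is recoverable only if those 12 modules fall within a few codewords.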
The same code family powers CDs (handling scratches), DVDs, Blu-ray, satellite communications, and DNA storage prototypes. The structure stays the same; only the field sizes and ECC fractions change.
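To make "detect, locate, correct" concrete, here is a minimal sketch over GF(256) with the field polynomial QR uses (0x11D). It computes the 7 check codewords for the "HI" payload, then repairs one corrupted codeword using only the first two syndromes. This single-error shortcut is a deliberate simplification chosen for brevity; a full decoder handles several errors with more machinery. All function names are illustrative.

```python
# GF(256) log/antilog tables: primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D), alpha = 2.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def generator_poly(n_ecc):
    """Product of (x - alpha^i) for i in 0..n_ecc-1, highest degree first."""
    g = [1]
    for i in range(n_ecc):
        nxt = g + [0]                                # g(x) * x
        for j, coeff in enumerate(g):
            nxt[j + 1] ^= gf_mul(coeff, EXP[i])      # plus g(x) * alpha^i
        g = nxt
    return g


def rs_encode(data, n_ecc):
    """Append the remainder of data(x) * x^n_ecc divided by the generator."""
    g = generator_poly(n_ecc)
    rem = list(data) + [0] * n_ecc
    for i in range(len(data)):
        factor = rem[i]
        if factor:
            for j in range(1, len(g)):
                rem[i + j] ^= gf_mul(g[j], factor)
    return list(data) + rem[len(data):]


def syndromes(codeword, n_ecc):
    """Evaluate the codeword at alpha^0..alpha^(n_ecc-1); all zeros means no error seen."""
    out = []
    for i in range(n_ecc):
        acc = 0
        for c in codeword:                           # Horner's rule
            acc = gf_mul(acc, EXP[i]) ^ c
        out.append(acc)
    return out


def correct_single_error(codeword, n_ecc):
    """Fix one bad codeword: S0 is the error value e, and S1 / S0 = alpha^degree locates it."""
    s = syndromes(codeword, n_ecc)
    if not any(s):
        return list(codeword)
    if s[0] == 0 or s[1] == 0:
        raise ValueError("more than one codeword is damaged")
    degree = (LOG[s[1]] - LOG[s[0]]) % 255
    if degree >= len(codeword):
        raise ValueError("more than one codeword is damaged")
    fixed = list(codeword)
    fixed[len(codeword) - 1 - degree] ^= s[0]
    return fixed


# The 19 data codewords for "HI" at Version 1, level L (payload, terminator, pad bytes).
HI_DATA = [0x20, 0x13, 0x0F, 0x00] + [0xEC, 0x11] * 7 + [0xEC]
clean = rs_encode(HI_DATA, 7)
damaged = list(clean)
damaged[5] ^= 0x5A                                   # corrupt one codeword
print(any(syndromes(damaged, 7)))                    # True: the damage is detected
print(correct_single_error(damaged, 7) == clean)     # True: located and repaired
```

A plain checksum would stop at the first `print`: it knows something is wrong. The ratio of the two syndromes is what turns "something is wrong" into "codeword 5 is wrong by 0x5A".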
What the mask is actually doing
Without a mask, certain payloads would produce QR codes with big runs of one colour, or even patterns resembling the finder bullseyes elsewhere in the grid — and the scanner would get confused. The eight standard masks are simple deterministic functions (e.g. (row + col) % 2 == 0) that XOR with the data. The encoder generates all eight, scores each on four penalty criteria (run length, 2×2 blocks, finder-like sequences, light/dark ratio), and picks the lowest score. The mask choice goes into the format info so the decoder knows which one to undo.
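The masking step can be sketched in a few lines. The eight conditions below are the standard mask formulas; the `is_data` callback is a hypothetical hook for skipping finder, timing, and format modules, and `run_penalty` implements only the first of the four penalty criteria (runs of five or more), so the scores are not those of a complete encoder.

```python
# The eight standard mask conditions: a module is flipped where the condition is true.
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(grid, mask_id, is_data=lambda r, c: True):
    """XOR the mask into every data module; is_data (hypothetical) protects function patterns."""
    mask = MASKS[mask_id]
    return [
        [bit ^ 1 if is_data(r, c) and mask(r, c) else bit
         for c, bit in enumerate(row)]
        for r, row in enumerate(grid)
    ]


def run_penalty(grid):
    """Run-length criterion only: a run of n >= 5 equal modules scores 3 + (n - 5)."""
    score = 0
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    for line in lines:
        run = 1
        for prev, cur in zip(line, line[1:]):
            if cur == prev:
                run += 1
            else:
                if run >= 5:
                    score += run - 2
                run = 1
        if run >= 5:
            score += run - 2
    return score


blank = [[0] * 6 for _ in range(6)]                  # worst case: one solid colour
print(run_penalty(blank))                            # 48
print(run_penalty(apply_mask(blank, 0)))             # 0, mask 0 makes a checkerboard
best = min(range(8), key=lambda m: run_penalty(apply_mask(blank, m)))
print(best)                                          # 0
print(apply_mask(apply_mask(blank, 0), 0) == blank)  # True: XOR undoes itself
```

The last line is why the decoder only needs the mask number: applying the same mask a second time restores the original bits.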
What this visualisation simplifies
- Real bit values. Our data bits are pseudo-random for visual variety. The real encoded payload for "HI" is 0010 (alphanumeric mode) + 000000010 (character count, 2) + 01100001111 (the pair HI: 17 × 45 + 18 = 783) + terminator + pad bytes. The visual structure is faithful.
- Format info bits. We grey-fill the format region instead of computing the real 15-bit BCH-encoded format word. Real format = ECC level (2 bits) + mask number (3 bits) + 10 BCH bits, then XORed with a fixed mask 101010000010010.
- Alignment patterns. V1 is the only version without them. From V2 up there are smaller 5×5 alignment patterns scattered to help scanners cope with curvature.
- Mask selection. We applied mask 0 directly. A real encoder generates all 8 masks, scores them, and picks the best.
- Reed-Solomon details. We say "decoder fixes 12 errors" without going through syndrome computation, error-locator polynomial, Chien search, and Forney's algorithm. The full math is its own page.
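For readers who want the real values behind the first two simplifications, the sketch below builds the actual data codewords for "HI" and the actual 15-bit format word for level L with mask 0. It assumes alphanumeric mode at Version 1, level L (19 data codewords); the function names are illustrative, not taken from any library.

```python
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric_v1l(text):
    """Return the 19 data codewords for a short alphanumeric string (Version 1, level L)."""
    capacity = 19 * 8
    bits = "0010"                                    # mode indicator: alphanumeric
    bits += format(len(text), "09b")                 # 9-bit character count
    for i in range(0, len(text) - 1, 2):             # each pair of characters -> 11 bits
        pair = ALPHANUMERIC.index(text[i]) * 45 + ALPHANUMERIC.index(text[i + 1])
        bits += format(pair, "011b")
    if len(text) % 2:                                # odd trailing character -> 6 bits
        bits += format(ALPHANUMERIC.index(text[-1]), "06b")
    if len(bits) > capacity:
        raise ValueError("text does not fit in Version 1, level L")
    bits += "0" * min(4, capacity - len(bits))       # terminator, up to 4 zero bits
    bits += "0" * (-len(bits) % 8)                   # pad to a byte boundary
    codewords = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    n_pad = 19 - len(codewords)
    return codewords + [(0xEC, 0x11)[k % 2] for k in range(n_pad)]


def format_word(ecc_bits, mask_id):
    """5 data bits + 10 BCH bits (generator 0x537), XORed with 101010000010010."""
    data = (ecc_bits << 3) | mask_id
    rem = data << 10
    for shift in range(4, -1, -1):                   # binary long division
        if rem & (1 << (shift + 10)):
            rem ^= 0x537 << shift
    return ((data << 10) | rem) ^ 0x5412


print([f"{cw:02X}" for cw in encode_alphanumeric_v1l("HI")[:6]])
# ['20', '13', '0F', '00', 'EC', '11']
print(format(format_word(0b01, 0), "015b"))          # level L is 01, mask 0
# 111011111000100
```

The first three codewords are the 24 payload bits from the list above, the fourth is the terminator plus byte padding, and the rest alternate the two pad bytes until all 19 slots are full.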
Why QR codes spread when others didn't
Data Matrix, PDF417, Aztec, MaxiCode: there are dozens of 2D barcode formats. QR won because (a) Denso Wave didn't enforce its patent, (b) the finder pattern is fast enough to detect on cheap CPUs, (c) error correction means smudged prints still scan, and (d) phone cameras + on-device decoders made the read free. The combination of "easy to print, hard to ruin, free to scan" beat every technically nicer alternative.