Virtual memory and the TLB.
Every load and store your program does goes through a translation. The address you wrote in the source code isn't the address RAM sees. The MMU first checks the TLB, a small cache of recent translations, and falls back to walking a tree of page-table entries when the cache misses. When the page isn't in RAM at all, the OS jumps in, evicts somebody, reads from disk, and lets you try again. Four reads, one page fault: see the whole machinery move.
arr[0], arr[1], arr[1024], arr[2048]: 4 reads across 3 pages
4 KB pages · 8 virtual pages · 6 physical frames · LRU eviction · TLB has 4 slots
The process has eight virtual pages numbered 0–7. Five of them currently live in physical RAM frames; the other three are on swap (slow disk). The page table on the right says where each one is. The TLB is empty: the CPU hasn't cached any translations yet. Page 5 in frame 0 was last touched a long time ago and is marked dirty (modified). It's our future eviction target.
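To see why those four reads land on three pages, here is a minimal sketch of the address split. It assumes 4-byte ints and that arr starts at virtual address 0 on a page boundary; neither is stated on the page, and the names PAGE_SIZE, INT_SIZE, ARR_BASE, and page_of are made up for illustration.

```python
PAGE_SIZE = 4096   # 4 KB pages, as in the demo
INT_SIZE = 4       # assumption: 4-byte ints
ARR_BASE = 0       # assumption: arr starts at virtual address 0


def page_of(index):
    """Return (virtual page number, offset in page) for arr[index]."""
    vaddr = ARR_BASE + index * INT_SIZE
    return vaddr // PAGE_SIZE, vaddr % PAGE_SIZE


reads = [0, 1, 1024, 2048]
pages = [page_of(i) for i in reads]
distinct_pages = sorted({vpn for vpn, _ in pages})

for i, (vpn, offset) in zip(reads, pages):
    print(f"arr[{i}] -> page {vpn}, offset {offset}")
```

arr[0] and arr[1] share a page, so the second read can reuse the translation the first one loaded; arr[1024] and arr[2048] each start a new page.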
- Virtual address
- The address a program uses. Means nothing to RAM directly — the MMU translates it to a physical address using the page table.
- Page / frame
- A page is a fixed-size chunk of virtual address space (usually 4 KB). A frame is the matching chunk of physical RAM. Page table maps page → frame.
- TLB · Translation Lookaside Buffer
- A small hardware cache (typically tens to a couple of thousand entries) of recent page → frame translations. A TLB hit costs about one cycle. A miss costs a page-table walk, tens of cycles at minimum.
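The three terms above fit together as in the toy model below. It is a sketch, not how any real MMU is built: the TLB class, the translate function, and the page → frame mapping are all hypothetical, every page is assumed resident (so no page fault occurs), and a plain dict lookup stands in for the page-table walk.

```python
from collections import OrderedDict

PAGE_SIZE = 4096


class TLB:
    """Toy fully associative TLB with LRU replacement."""

    def __init__(self, slots=4):
        self.slots = slots
        self.entries = OrderedDict()  # vpn -> frame, least recent first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            self.hits += 1
            return self.entries[vpn]
        self.misses += 1
        return None

    def insert(self, vpn, frame):
        if vpn not in self.entries and len(self.entries) >= self.slots:
            self.entries.popitem(last=False)  # drop the least recent entry
        self.entries[vpn] = frame
        self.entries.move_to_end(vpn)


def translate(vaddr, tlb, page_table):
    """Virtual address -> physical address, filling the TLB on a miss."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    frame = tlb.lookup(vpn)
    if frame is None:
        frame = page_table[vpn]  # stand-in for the page-table walk
        tlb.insert(vpn, frame)
    return frame * PAGE_SIZE + offset


# Hypothetical mapping; all three pages are assumed resident.
page_table = {0: 3, 1: 4, 2: 5}
tlb = TLB(slots=4)
physical = [translate(a, tlb, page_table) for a in (0, 4, 4096, 8192)]
print(physical, "hits:", tlb.hits, "misses:", tlb.misses)
```

Only the second access hits, because it touches the same page as the first; the other three each touch a page for the first time.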
Why bother with virtual memory at all
Three reasons that compound. First, isolation: each process sees its own private address space, so a bug in one program can't scribble on another's memory. Second, simplicity: every process gets to pretend it owns memory starting at address zero, no matter what else is running. Third, oversubscription: virtual address space can be much bigger than physical RAM. You can mmap a 50 GB file on a laptop with 16 GB of RAM and the OS only loads the pages you actually touch.
The flip side is the page fault, where the abstraction leaks. A program with a bad working-set fit can crawl, and you won't see the cause anywhere in your code: it's the kernel paging stuff in and out. The "si" and "so" columns in vmstat are the giveaway: sustained high swap-in / swap-out rates mean you're thrashing.
What a real x86-64 page table looks like
Not a flat array. With 48-bit virtual addresses and 4 KB pages you'd need 2³⁶ entries, which at 8 bytes each is 512 GB of page table per process. So instead it's a four-level radix tree: PML4 → PDPT → PD → PT, 9 bits per level, indexed by chunks of the virtual address. The CPU register CR3 holds the physical address of the top-level table; a context switch updates CR3 to point at the next process's tree.
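The 9-9-9-9-12 split can be written out as a few shifts and masks. This is a sketch of the arithmetic only; split_va and the variable names are invented here, and the flat-table size assumes 8-byte entries.

```python
def split_va(va):
    """Split a 48-bit virtual address into four 9-bit indices and a 12-bit offset."""
    offset = va & 0xFFF          # low 12 bits: byte within the 4 KB page
    pt = (va >> 12) & 0x1FF      # next 9 bits: index into the page table
    pd = (va >> 21) & 0x1FF      # index into the page directory
    pdpt = (va >> 30) & 0x1FF    # index into the page-directory-pointer table
    pml4 = (va >> 39) & 0x1FF    # top 9 bits: index into the PML4
    return pml4, pdpt, pd, pt, offset


# Size of a hypothetical flat table: one 8-byte entry per 4 KB page.
flat_entries = 2 ** (48 - 12)
flat_bytes = flat_entries * 8

example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_va(example))
print(flat_bytes // 2**30, "GiB for a flat table")
```

Each level holds 2⁹ = 512 entries, so one table of 8-byte entries fills exactly one 4 KB page.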
Each TLB miss walks four entries in this tree, reading from physical RAM each time. Without the TLB this would be unbearable. With it, the vast majority of memory accesses hit the TLB and the walk happens only on cold pages and on context switches. Newer CPUs also include a "paging structure cache" that caches the upper levels of the tree to make even the misses cheaper.
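A toy walker makes the cost difference visible. Nested dicts stand in for the four table levels, and a small dict keyed by the upper three indices stands in for a paging-structure cache. The Walker class and the tree contents are hypothetical; real caches are organised differently from CPU to CPU.

```python
class Walker:
    """Toy four-level walk that counts table reads per translation."""

    def __init__(self, root):
        self.root = root
        self.upper_cache = {}  # (pml4, pdpt, pd) -> leaf page table

    def translate(self, pml4, pdpt, pd, pt):
        key = (pml4, pdpt, pd)
        reads = 0
        table = self.upper_cache.get(key)
        if table is None:
            table = self.root
            for idx in key:          # one read per upper level
                table = table[idx]
                reads += 1
            self.upper_cache[key] = table
        frame = table[pt]            # final read: the page-table entry itself
        reads += 1
        return frame, reads


# Hypothetical tree mapping two neighbouring pages to made-up frame numbers.
root = {1: {2: {3: {4: 0x1234, 5: 0x1235}}}}
walker = Walker(root)
first = walker.translate(1, 2, 3, 4)    # cold: walks all four levels
second = walker.translate(1, 2, 3, 5)   # neighbour: upper levels are cached
print(first, second)
```

The second translation shares its upper three indices with the first, so only the leaf entry has to be read.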
Tricks that virtual memory enables
- Copy-on-write (COW). When fork() creates a child process, the kernel doesn't copy the parent's memory. It points the child's page table at the same physical frames and marks them read-only. The first time either side writes, the MMU faults, the kernel makes a copy, and execution continues. Fork is essentially free until somebody actually mutates something.
- Demand paging. Allocation doesn't physically reserve RAM. malloc(1 GB) just enlarges your virtual address space and updates the page table to say "these pages don't map to anything yet." You only pay for what you touch.
- mmap. Maps a file into your address space. Reads happen via page faults that bring file blocks in via the page cache. Many databases (LMDB, the BoltDB family) lean on this: they let the kernel do the caching.
- Guard pages. An unmapped page sits just past the end of the stack. Run off the end of your stack and you get SIGSEGV instantly instead of silently scribbling on whatever is mapped next, such as the heap.
- ASLR. Address Space Layout Randomization randomises where the stack, heap, and shared libraries land in virtual memory at process start. Makes return-into-libc attacks dramatically harder.
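The copy-on-write trick from the list above can be modelled in a few lines. This is a toy simulation, not kernel code: Frame, Process, and their methods are invented for illustration, a reference count stands in for the kernel's frame bookkeeping, and a "fault" is just a branch in write().

```python
class Frame:
    """A physical frame with a count of page tables pointing at it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class Process:
    def __init__(self):
        self.table = {}  # vpn -> [frame, writable]

    def map(self, vpn, data):
        self.table[vpn] = [Frame(data), True]

    def fork(self):
        """Share every frame with the child and mark both sides read-only."""
        child = Process()
        for vpn, entry in self.table.items():
            entry[0].refs += 1
            entry[1] = False
            child.table[vpn] = [entry[0], False]
        return child

    def read(self, vpn, offset):
        return self.table[vpn][0].data[offset]

    def write(self, vpn, offset, value):
        entry = self.table[vpn]
        frame, writable = entry
        if not writable:                 # simulated write fault
            if frame.refs > 1:           # still shared: copy before writing
                frame.refs -= 1
                entry[0] = Frame(frame.data)
            entry[1] = True              # now private, so writable again
        entry[0].data[offset] = value


parent = Process()
parent.map(0, b"hello")
child = parent.fork()
shared_after_fork = child.table[0][0] is parent.table[0][0]
child.write(0, 0, ord("J"))
shared_after_write = child.table[0][0] is parent.table[0][0]
print(bytes(parent.table[0][0].data), bytes(child.table[0][0].data))
```

After the child's write the parent is the sole owner of the original frame, so a later write by the parent only flips the page back to writable without copying.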
When the abstraction hurts
Tight inner loops over a working set larger than the TLB can cover will TLB-miss constantly. A 4 KB page covers 1024 4-byte ints; a working set of 16 GB needs about 4 million page-table entries; a TLB only holds ~1500 of them, enough to cover roughly 6 MB. Solutions exist: huge pages (2 MB or 1 GB) cover 512x or 262144x more memory per TLB entry, at the cost of internal fragmentation. Databases and JVMs often opt into them on Linux via transparent_hugepage or explicit madvise(MADV_HUGEPAGE).
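The numbers in that paragraph are quick to check. The sketch below takes the 1500-entry TLB as a round figure for illustration; tlb_reach and pages_needed are made-up helper names.

```python
KB, MB, GB = 2**10, 2**20, 2**30


def tlb_reach(entries, page_size):
    """Bytes of memory a full TLB can translate without missing."""
    return entries * page_size


def pages_needed(working_set, page_size):
    """Number of pages (and so translations) covering a working set."""
    return -(-working_set // page_size)  # ceiling division


entries = 1500  # illustrative TLB size
reach_4k = tlb_reach(entries, 4 * KB)
reach_2m = tlb_reach(entries, 2 * MB)
pages_4k = pages_needed(16 * GB, 4 * KB)
pages_2m = pages_needed(16 * GB, 2 * MB)

print(f"4 KB pages: reach {reach_4k / MB:.1f} MiB, {pages_4k} pages for 16 GiB")
print(f"2 MB pages: reach {reach_2m / GB:.1f} GiB, {pages_2m} pages for 16 GiB")
print("ratios:", (2 * MB) // (4 * KB), GB // (4 * KB))
```

With 4 KB pages the TLB covers a few megabytes of a 16 GB working set; with 2 MB pages the same number of entries covers a few gigabytes.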
Memory-mapped files can also surprise you. A read that "looks like" a load instruction can actually trigger a disk read if the file page isn't resident. If you map a 100 GB file and iterate sequentially, your throughput is bounded by disk bandwidth, not memory bandwidth. mlock can pin pages in RAM if you really need them not to fault.
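Here is a small sketch of what "looks like a load" means in practice, using Python's standard mmap module on a throwaway temporary file. The file contents and variable names are made up for the example. Whether a given index actually faults depends on what the kernel already has in its page cache, which this code cannot observe.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE

# Create a two-page scratch file: one page of "A", one page of "B".
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"A" * PAGE + b"B" * PAGE)
    path = f.name

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # Plain indexing, no read() call: any file I/O happens behind
        # the scenes if the page is not already resident.
        first = m[0]
        second = m[PAGE]
        size = len(m)

os.remove(path)
print(first, second, size)
```

Nothing in the indexing expressions hints at I/O, which is exactly why mapped-file latency is easy to misattribute.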
What this visualisation simplifies
- Flat page table. Real ones are multi-level radix trees as described above. We collapsed it into a single 8-entry table for clarity.
- Single-threaded. Multi-core systems have per-core TLBs that need invalidation messages (TLB shootdowns) when a page is unmapped or remapped. These can be the dominant cost in some workloads.
- No protection bits. Real PTEs also carry RWX bits, user/kernel, cacheability, accessed bits. We hid all of them.
- Pure LRU. Real eviction uses the clock algorithm (and variants like clock-pro) because true LRU needs hardware support nobody implements.
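For contrast with the pure LRU used in the visualisation, here is a minimal sketch of the basic clock (second-chance) algorithm. The Clock class is hypothetical and leaves out everything a real kernel adds, such as dirty-page handling and multiple lists.

```python
class Clock:
    """Toy clock replacement: one reference bit per frame, one sweeping hand."""

    def __init__(self, nframes):
        self.pages = [None] * nframes   # page held by each frame
        self.ref = [False] * nframes    # reference bit per frame
        self.hand = 0

    def access(self, page):
        """Touch a page; return the page evicted to make room, or None."""
        if page in self.pages:
            self.ref[self.pages.index(page)] = True  # hit: set the bit
            return None
        # Miss: sweep, clearing reference bits, until a frame is empty
        # or has not been referenced since the hand last passed it.
        while self.pages[self.hand] is not None and self.ref[self.hand]:
            self.ref[self.hand] = False
            self.hand = (self.hand + 1) % len(self.pages)
        victim = self.pages[self.hand]
        self.pages[self.hand] = page
        self.ref[self.hand] = True
        self.hand = (self.hand + 1) % len(self.pages)
        return victim


clock = Clock(3)
evicted = [clock.access(p) for p in (1, 2, 3, 4, 2, 5)]
print(evicted, clock.pages)
```

The only per-page state is a single bit, which hardware can set on access. That is far cheaper than maintaining the exact recency order that true LRU requires.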