Internals

Fifteen deep dives, one per layer of the bare metal. Each page is anchored to real microarchitectures — Apple M4, Intel Raptor Lake, AMD Zen 5, RISC-V SiFive U74 — with real cycle counts, named systems, and the cache-and-coherence footnotes that turn textbook diagrams into production performance work.

All 15 pages are live: the 9 pages of Waves 1 and 2, plus the 6-page peripherals batch.


01
Live

Transistors, gates, and the ALU

NMOS to NAND to flip-flop to adder. The abstraction stack the rest of the path stands on.

02
Live

Clocks, latches, flip-flops

Synchronous logic, setup and hold, why frequency stalled near 5 GHz.

03
Live

The instruction cycle

Fetch, decode, execute, writeback. A RISC-V addi traced through the datapath.

04
Live

Pipelining

The five-stage pipeline, and what changes when your laptop runs 14–20 stages.

05
Live

Branch prediction and speculation

Two-bit counters to TAGE, the 20-cycle mispredict penalty, Spectre as the cost.

06
Live

Out-of-order execution

Tomasulo, the reorder buffer, register renaming. ~512-entry ROB on Apple M4.

07
Live

SIMD and vector throughput

SSE, AVX, AVX-512, AMX, SVE. AVX-512 license-based downclocking on Skylake-X. Auto-vectorisation versus intrinsics.

08
Live

The memory hierarchy

Eight orders of magnitude. Norvig’s table, modernised. Bandwidth versus latency.

09
Live

Caches and MESI

Direct-mapped to set-associative, replacement, write-back, MESI/MOESI/MESIF, false sharing.

10
Live

Virtual memory and the TLB

Every load is two loads. Page walks, TLB shootdowns, huge pages, Meltdown.

11
Live

NUMA and memory bandwidth

Same-socket 80 ns, far-corner 400 ns. DDR5 channels, Linux NUMA balancing.

12
Live

PCIe, DMA, interrupts

Lanes, generations, root complex. MSI/MSI-X. IOMMUs and GPU passthrough.

13
Live

SSDs and the FTL

NAND cells, pages versus erase blocks, the Flash Translation Layer, NVMe queues.

14
Live

GPUs and accelerators

SIMT versus SIMD, warps, HBM. TPUs and AMX as the post-Moore turn.

15
Live

Power-on, firmware, boot

Reset to UEFI to bootloader to kernel. Microcode. Secure Boot, TPM, Intel ME.

Reading order

You can take these in any order, since each page stands on its own, but the sequence that builds the most intuition follows the abstraction stack from bottom to top: gates, then state, then the instruction cycle, then the optimisations (pipelining, branch prediction, out-of-order execution, SIMD), then the memory hierarchy with its coherence and translation layers, then the peripherals (PCIe, SSD, GPU), and finally how the whole thing turns on (boot).

If you're here for one specific topic — almost everyone is, the first time — start with that page. The cross-links inside each page handle the rest.

Canonical sources this directory leans on

  • Patterson & Hennessy — Computer Organization and Design (RISC-V Edition). The textbook spine. Concepts pinned to RISC-V so abstractions map to real instructions.
  • Hennessy & Patterson — Computer Architecture: A Quantitative Approach. The graduate sequel, heavier on coherence, NUMA, and the post-Moore turn.
  • Bryant & O'Hallaron — Computer Systems: A Programmer's Perspective. The most engineer-facing of the textbooks; the reference for how this study path is voiced.
  • Sorin, Hill & Wood — A Primer on Memory Consistency and Cache Coherence. Free, definitive, the textbook on coherence protocols.
  • Agner Fog's microarchitecture and optimization manuals and instruction tables. Free, exhaustive, the practitioner's manual for actual chip behaviour.
  • Chips and Cheese, Real World Tech, WikiChip. Where the latency and bandwidth numbers in these pages get cross-checked against fresh measurements.