V — The Systems That Run Them → Chapter 22
FROM SYSTEMS TO FRONTIER ML

The hardware substrate

GPU memory hierarchy (HBM↔SRAM, the FlashAttention motivation generalized), Tensor Cores, TPUs, Apple Silicon, the roofline model.

§1 GPU memory hierarchy + the roofline model
§2 Tensor Cores, fp8, NVLink
§3 TPUs, Apple Silicon, AMD MI300X: the non-NVIDIA landscape
