Matrices as transformations

§1 Matrices as functions
A matrix is a linear function, not a 2D array. Its columns are the images of the standard basis vectors. The 2D array is just one way to write the function down.
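The claim about columns can be checked directly. The sketch below is illustrative only: the helper `apply` and the example matrix `A` are assumptions, not part of the original text. It applies a 2×2 matrix to each standard basis vector and confirms that the results are the matrix's columns, and that linearity then determines the image of every other vector.

```python
def apply(A, x):
    """Apply the linear function represented by A (a list of rows) to x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Hypothetical example matrix, chosen only for illustration.
A = [[2.0, -1.0],
     [0.0,  3.0]]

e1 = [1.0, 0.0]
e2 = [0.0, 1.0]

col0 = apply(A, e1)  # image of e1: the first column of A
col1 = apply(A, e2)  # image of e2: the second column of A

# Linearity: A(3*e1 + 4*e2) = 3*A(e1) + 4*A(e2)
x = [3.0, 4.0]
by_columns = [3.0 * c0 + 4.0 * c1 for c0, c1 in zip(col0, col1)]
direct = apply(A, x)
```

Here `direct` and `by_columns` agree, so knowing the two columns is enough to evaluate the function anywhere.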
§2 Matmul as composition; the three axes
Matrix multiplication is function composition: the product AB is the function "apply B, then apply A". A matmul has three independent loop dimensions (M, N, and K), and each can be tiled independently of the other two. That structural fact is what makes FlashAttention possible.
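A minimal sketch of both ideas, assuming plain Python lists as the matrix representation (the helpers `matmul` and `apply` are hypothetical names introduced for this example). The triple loop exposes the three axes explicitly: `i` runs over M, `j` over N, and `k` over the shared dimension K.

```python
def apply(A, x):
    """Apply the linear function represented by A (a list of rows) to x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matmul(A, B):
    """Textbook matmul: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    M, K = len(A), len(A[0])
    N = len(B[0])
    assert len(B) == K, "inner dimensions must match"
    C = [[0.0] * N for _ in range(M)]
    for i in range(M):          # M axis: rows of A and C
        for j in range(N):      # N axis: columns of B and C
            for k in range(K):  # K axis: the reduction dimension
                C[i][j] += A[i][k] * B[k][j]
    return C

# Hypothetical example data.
A = [[1.0, 2.0],
     [0.0, 1.0]]
B = [[0.0, -1.0],
     [1.0,  0.0]]
x = [5.0, 7.0]

composed = apply(A, apply(B, x))  # apply B first, then A
product = apply(matmul(A, B), x)  # apply the single matrix AB
```

Both routes give the same vector, which is the sense in which the product matrix represents the composed function.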
§3 Orthogonal & rotation matrices
Orthogonal matrices are the linear functions that preserve every length and every angle. Two structural invariants — ‖Qx‖ = ‖x‖ and ⟨Qx, Qy⟩ = ⟨x, y⟩ — make rotation-based quantization (TurboQuant) work.
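The two invariants can be verified numerically for a 2D rotation, which is one example of an orthogonal matrix. This is a sketch under assumptions: the angle, the test vectors, and the helper names are chosen for illustration, and nothing here reflects how any particular quantization scheme is implemented.

```python
import math

def rotation(theta):
    """2x2 rotation matrix for angle theta (radians), as a list of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]

def apply(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

Q = rotation(0.7)   # arbitrary illustrative angle
x = [3.0, 4.0]
y = [-1.0, 2.0]
Qx, Qy = apply(Q, x), apply(Q, y)

# Compare with a tolerance, since floating-point rounding makes
# exact equality unreliable.
length_preserved = math.isclose(norm(Qx), norm(x))
inner_product_preserved = math.isclose(dot(Qx, Qy), dot(x, y))
```

Both flags come out true. Preserving inner products implies preserving angles, because the angle between two vectors is determined by their inner product and their lengths.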
§4 Tiled GEMM microkernel
Naive matmul is memory-bound, not compute-bound. The fix is tiling — keep a small working set in fast memory, reuse it before moving on. The microkernel pattern is the architectural ancestor of FlashAttention.
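The loop structure of tiling can be sketched as follows. This is not a real microkernel: pure Python shows only the order of iteration, not the performance effect, and the function names and default tile size are assumptions made for the example. In an optimized kernel the inner three loops would operate on a tile held in registers or cache.

```python
def matmul_naive(A, B):
    M, K, N = len(A), len(A[0]), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            for k in range(K):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_tiled(A, B, tile=2):
    """Same arithmetic as matmul_naive, visited one tile at a time."""
    M, K, N = len(A), len(A[0]), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    # Outer loops walk over tiles along each of the three axes.
    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            for k0 in range(0, K, tile):
                # Inner loops stay inside one tile, so the same small
                # pieces of A, B, and C are reused before moving on.
                for i in range(i0, min(i0 + tile, M)):
                    for j in range(j0, min(j0 + tile, N)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + tile, K)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C

# Hypothetical inputs whose sizes are not multiples of the tile size,
# to exercise the edge handling.
A = [[float(i + k) for k in range(5)] for i in range(3)]  # 3 x 5
B = [[float(k - j) for j in range(4)] for k in range(5)]  # 5 x 4

same_result = matmul_tiled(A, B, tile=2) == matmul_naive(A, B)
```

The `min(...)` bounds handle partial tiles at the edges. The result matches the naive version because each output element still accumulates its products in the same order of `k`.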