GPUs Go Brrr (via) Fascinating, detailed low-level notes on how to get the most out of NVIDIA's H100 GPUs (currently selling for around $40,000 apiece) from the research team at Stanford who created FlashAttention, among other things.
> The swizzled memory layouts are flat-out incorrectly documented, which took considerable time for us to figure out.