GPUs Go Brrr. Fascinating, detailed low-level notes on how to get the most out of NVIDIA's H100 GPUs (currently selling for around $40,000 apiece) from the research team at Stanford who created FlashAttention, among other things.
The swizzled memory layouts are flat-out incorrectly documented, which took considerable time for us to figure out.