The most dramatic optimization to nanoGPT so far (~25% speedup) is to simply increase vocab size from 50257 to 50304 (the next multiple of 64 above it). This computes extra, unused dimensions but goes down a different kernel path with much higher occupancy. Careful with your powers of 2.
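The arithmetic behind the padded size can be sketched in a few lines of Python. This is only an illustration of rounding a vocabulary size up to a multiple of 64; the helper name `pad_to_multiple` is hypothetical and is not taken from nanoGPT.

```python
def pad_to_multiple(n: int, multiple: int = 64) -> int:
    """Round n up to the nearest multiple of `multiple` (hypothetical helper)."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Integer ceiling division, then scale back up.
    return ((n + multiple - 1) // multiple) * multiple


gpt2_vocab_size = 50257  # vocab size quoted in the text
padded_vocab_size = pad_to_multiple(gpt2_vocab_size, 64)
extra_rows = padded_vocab_size - gpt2_vocab_size  # unused embedding/logit rows

print(padded_vocab_size, extra_rows)
```

The extra rows correspond to token ids that never occur in the data, which is why the text calls them useless dimensions: they cost a little compute and memory, and the speedup comes from the matrix shapes, not from the added entries.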