llm.c (via) Andrej Karpathy implements LLM training—initially for GPT-2, other architectures to follow—in just over 1,000 lines of C on top of CUDA. Includes a tutorial about implementing LayerNorm by porting an implementation from Python.
Recent articles
- Claude Skills are awesome, maybe a bigger deal than MCP - 16th October 2025
- NVIDIA DGX Spark: great hardware, early days for the ecosystem - 14th October 2025
- Claude can write complete Datasette plugins now - 8th October 2025