A quote from Andrej Karpathy

The most dramatic optimization to nanoGPT so far (~25% speedup) is to simply increase vocab size from 50257 to 50304 (nearest multiple of 64). This calculates added useless dimensions but goes down a different kernel path with much higher occupancy. Careful with your Powers of 2.

— Andrej Karpathy
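The arithmetic behind the quote can be checked with a minimal sketch. The helper name `round_up_to_multiple` is hypothetical and not taken from nanoGPT. The padding has to round up because the vocabulary cannot shrink below the tokenizer's 50257 entries.

```python
def round_up_to_multiple(n: int, multiple: int = 64) -> int:
    """Return the smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


vocab_size = 50257  # vocab size quoted above
padded_vocab_size = round_up_to_multiple(vocab_size, 64)
extra_rows = padded_vocab_size - vocab_size  # padding entries that are never used as tokens

print(padded_vocab_size)  # 50304
print(extra_rows)         # 47
```

The 47 extra entries correspond to the "useless dimensions" mentioned in the quote: they add rows to the embedding and output projection that no real token maps to, in exchange for a matrix shape that is a multiple of 64.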

Posted 4th February 2023 at 12:08 am

Simon Willison’s Weblog
