Simon Willison’s Weblog

Subscribe

The One Billion Row Challenge in Go: from 1m45s to 4s in nine solutions (via) How fast can you read a billion semicolon delimited (name;float) lines and output a min/max/mean summary for each distinct name—13GB total?

Ben Hoyt describes his 9 incrementally improved versions written in Go in detail. The key optimizations involved custom hashmaps, optimized line parsing and splitting the work across multiple CPU cores.