Simon Willison’s Weblog

Reexamining Python 3 Text I/O. Python 3.1’s IO performance is a huge improvement over 3.0, but still considerably slower than 2.6. It turns out it’s all to do with Python 3’s unicode support: When you read a file in to a string, you’re asking Python to decode the bytes in to UTF-8 (the new default encoding) at the same time. If you open the file in binary mode Python 3 will read raw bytes in to a bytestring instead, avoiding the conversion overhead and performing only 4% slower than the equivalent code in Python 2.6.4.