
Simon Willison’s Weblog

Fun with Unicode

Hixie has submerged himself in Unicode. Stuart muses that the reason Unicode is so (potentially) huge is a legacy of the Y2K problem. I prefer the explanation given in XML in a Nutshell (my current reading matter of choice for three-and-a-half-hour-train-journeys-from-hell):

Unicode can potentially hold more than a million characters, but no one is willing to say in public where they think most of the remaining million characters will come from. *

* Footnote: Privately, some developers are willing to admit that they’re preparing for the day when we’re part of a Galactic Federation of thousands of intelligent species
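The "more than a million" figure can be checked with a little arithmetic. A minimal sketch, assuming the modern definition of the Unicode code space as 17 planes of 65,536 code points each (U+0000 through U+10FFFF); the constant and function names are just illustrative:

```python
import sys

# Assumption: the code space is 17 planes of 0x10000 (65,536) code points.
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000


def code_space_size() -> int:
    """Total number of code points available, assigned or not."""
    return PLANES * CODE_POINTS_PER_PLANE


if __name__ == "__main__":
    print(code_space_size())                        # 1114112
    # Python 3 exposes the highest code point as sys.maxunicode (0x10FFFF).
    print(sys.maxunicode + 1 == code_space_size())  # True
```

This counts code points rather than assigned characters, so it is an upper bound on what the quote calls "characters".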

This is Fun with Unicode by Simon Willison, posted on 13th September 2002.



1 comment

  1. My PingBack client detects Stuart's link now. Embarrassingly, my auto-detection routine was getting to his Content-Type header "Content-Type: text/html; charset=iso-8859-1" and deciding the document wasn't HTML :/ It knows how to cut off the charset bit now.

    Simon Willison - 13th September 2002 00:46 - #
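The fix described in the comment above can be sketched roughly as follows. This is not the PingBack client's actual code; the function names `media_type` and `is_html` are hypothetical, and the sketch assumes that only `text/html` should count as HTML:

```python
def media_type(content_type: str) -> str:
    """Return the bare media type, dropping parameters such as charset.

    Everything after the first ';' is a parameter list, so it is cut off
    before comparing.
    """
    return content_type.split(";", 1)[0].strip().lower()


def is_html(content_type: str) -> bool:
    # Assumption: only text/html is treated as HTML in this sketch.
    return media_type(content_type) == "text/html"


if __name__ == "__main__":
    header_value = "text/html; charset=iso-8859-1"
    print(media_type(header_value))  # text/html
    print(is_html(header_value))     # True
    print(is_html("text/plain"))     # False
```

Comparing the whole header value against "text/html" is what trips up on a charset parameter; splitting on the first semicolon avoids that without needing a full header parser.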


Previously hosted at http://simon.incutio.com/archive/2002/09/13/funWithUnicode
