Practical Unicode, please!
13th October 2003
Joel Spolsky has joined Tim Bray in the quest to educate the masses about the importance of Unicode. Dan Sugalski kicks in as well with “What the heck is: A string”, a lengthy essay about string handling and why it really is a lot more complicated than you think it is.
These should all be required reading for anyone involved in programming and web development. Unfortunately, they all lack one critical aspect: practical advice. Having read all three, I feel like I could lecture for an hour on code points, glyphs, ASCII, byte order and a whole bunch of other topics. Yet when it comes to updating my blogging system to support comments written in Japanese, I’m still almost as clueless as I was before I read any of them.
Enough of the theory: the web needs practical advice on developing Unicode-enabled web pages and web applications. Is it just a case of ensuring my text editor is “saving as Unicode”? What about storage: can I throw Unicode at MySQL and expect it to come out again? If I serve up a page with Japanese characters in it, what will my users have to do to be able to read them? It’s a big, confusing world out there.