Simon Willison’s Weblog

XHTML is still great for content

In response to Mark Pilgrim’s Poisoning the envelope, Brian Donovan has expanded upon his opinion that long-term, web-facing content should not be stored as (X)HTML:

Do everything “right” (proper DTD’s, validating all of your HTML, etc.) and, assuming that browser makers don’t chuck backwards compatibility (about as reasonable as deciding not to pay for medical insurance because of your past track record of good health) and you will still either (1.) be locked into circa 2002 XHTML forever / until you find yourself with an extra month or two (or more) of free time and get the itch to go through several years of accumulated content to bring it up to spec or (2.) find yourself building a patchwork site because you’ve been incorporating recent developments as they’ve come along (i.e. all of the entries after 2006 use XForms where appropriate after MSIE 9, Opera 11, Moz3/NS 10 support finally solidified, but earlier entries using plain old (X)HTML forms).

Patching your cms (or getting/paying a someone to do it for you) from time to time could be (by far) preferable to and cheaper than periodically hand-editing several years’ worth of articles stored in HTML format.

Again, I agree with Brian’s points with respect to HTML. His argument fails, however, when you consider XHTML. The beautiful thing about (valid) XHTML is that it can be processed by any tool capable of processing XML. No hand editing is required: if you later need to convert your content to a newer standard (and personally I see XHTML 1.0 as a pretty stable horse), it takes a simple XSLT stylesheet, or possibly a short Python script. You have created future-proof content without having to reinvent the wheel.
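
To give a rough idea of how short such a Python script could be, here is a minimal sketch using the standard library's xml.etree.ElementTree. The tag mapping (b to strong, i to em), the upgrade function name and the sample entry are all invented for illustration; a real migration would substitute whatever the newer standard actually calls for, and would likely want to preserve the doctype and entities as well.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
PREFIX = "{" + XHTML_NS + "}"

# Hypothetical mapping of old element names to their replacements.
RENAMES = {"b": "strong", "i": "em"}


def upgrade(xhtml_source: str) -> str:
    """Parse valid XHTML as XML, rename elements, and serialise it again."""
    # Ask ElementTree to write XHTML elements using a default namespace
    # (xmlns="...") rather than an ns0: prefix.
    ET.register_namespace("", XHTML_NS)
    root = ET.fromstring(xhtml_source)
    for element in root.iter():
        # ElementTree spells namespaced tags as "{namespace}localname".
        if isinstance(element.tag, str) and element.tag.startswith(PREFIX):
            local = element.tag[len(PREFIX):]
            if local in RENAMES:
                element.tag = PREFIX + RENAMES[local]
    return ET.tostring(root, encoding="unicode")


# An assumed sample entry, standing in for stored weblog content.
entry = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    "<p>XHTML is <b>still</b> great for <i>content</i>.</p>"
    "</div>"
)
print(upgrade(entry))
```

Because the content is well-formed XML, the script never has to guess at the markup: the parser either accepts the entry or rejects it, and the same loop could be run over every stored entry in turn.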

This is XHTML is still great for content by Simon Willison, posted on 8th January 2003.

Previously hosted at http://simon.incutio.com/archive/2003/01/08/xhtmlIsStillGreatForContent