XHTML for future-proof content
Don Park questions the benefits of emitting XHTML. In one sense, Don is right; publishing a whole site using XHTML in this day and age brings very little benefit and can cause a great deal of grief. But just because XHTML doesn’t provide advantages when publishing whole sites does not mean it should be written off entirely. As I’ve said on this blog many times before, XHTML offers an excellent format for future-proofing site content, especially chunks of content kept in a database. Keith D. Robinson makes some excellent points along the same lines in his latest essay, Standards, Semantic Markup, Distributed Authorship and Knowledge Management:
XHTML is, at its most basic, much simpler and easier to learn than traditional HTML 4.0. With a simple style guide, standard markup and CSS styles you can accomplish almost all the formatting a content author would need, just by knowing a handful of markup tags. Instead of trusting the CMS to sort out code from Word, for example, you can hand a content owner a cheat sheet with the basic tags outlined and trust that they can code their own content. I mean, really, how hard is it to learn 10 or so tags? Team this technique with a tool like Contribute and you’ve got a nice, simple and cheap process that, while it doesn’t store your content in a database, keeps it in a clean, standard form you can repurpose down the road.
As for ensuring entered XHTML is valid, I think this site’s comment system does a pretty good job of showing how that can be achieved with only a small amount of server-side effort.
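To give a rough idea of what that server-side effort might look like, here is a minimal Python sketch. It is not the code this site runs. The function name check_comment and the ALLOWED_TAGS whitelist are made up for illustration, and the check only covers well-formedness plus a tag whitelist, not full validation against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist: the tags a real comment system allows are an assumption here.
ALLOWED_TAGS = {"p", "a", "em", "strong", "blockquote", "code", "ul", "ol", "li"}


def check_comment(fragment):
    """Return (True, "") if the fragment is well-formed XML that uses only
    whitelisted tags, otherwise (False, reason)."""
    try:
        # Wrap the fragment so that several top-level elements still parse.
        root = ET.fromstring("<div>" + fragment + "</div>")
    except ET.ParseError as err:
        return False, "not well-formed: %s" % err
    for element in root.iter():
        if element is root:
            continue  # skip the wrapper we added ourselves
        if element.tag not in ALLOWED_TAGS:
            return False, "tag not allowed: <%s>" % element.tag
    return True, ""


if __name__ == "__main__":
    print(check_comment("<p>Hello <em>world</em></p>"))
    print(check_comment("<p>unclosed <em>tag</p>"))
```

A comment that fails the check can be bounced back to its author with the reason attached, so nothing malformed ever reaches the stored page. A real system would also need to vet attributes and handle named entities such as &nbsp;, which this sketch rejects as not well-formed.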
"XHTML is, at it's most basic, much simpler and easier to learn that traditional HTML 4.0."
I've gotta take issue with this. If you are using CSS in conjunction with HTML it is no more difficult than XHTML...the arguments for XHTML vs HTML are all so ridiculous to me...can't we all just admit that there really is no real world benefit in 2003 for using XHTML...and realistically there never really will be any *true* benefit to using it for the simple reason that authors will be just as sloppy with XHTML as they are with HTML because no browser-maker in their right mind will release a product that doesn't gobble up all the existing and future tag soup on the web.
Standards are bullshit. XHTML is a crock. The W3C is irrelevant.
→ Mark Pilgrim
MikeyC - 3rd August 2003 22:35 - #
Simon Willison - 3rd August 2003 23:09 - #
"I think you've missed my point. Sure, XHTML provides no real benefit when it comes to serving content up to browsers"
You're right, I probably did miss your point, but if you are admitting that there is no real benefit in serving content up to browsers as "true" XHTML (e.g. with the correct MIME type) then why exactly do you do so?
MikeyC - 3rd August 2003 23:22 - #
It's a learning experiment. I think I've successfully demonstrated that with a bit of support from software it's possible to maintain a frequently updated XHTML site with a comment system that accepts input from users. Nothing makes you fix a validation error faster than a page refusing to render in your browser!
I was going to switch back to HTML 4 because of JavaScript problems with Firebird in application/xhtml+xml (most importantly the missing document.cookie property) but since that's been fixed in 0.6.1 I may well stick with XHTML for the moment, for no reason other than I've invested the effort now.
Simon Willison - 3rd August 2003 23:39 - #
There certainly are benefits to XHTML 1.1 when "serving content up to browsers". They come from the modularized nature of XHTML 1.1 (and the ability to embed XML content). I couldn't do my blog without it.
Generating your own (valid) XHTML content is neither easier nor harder than generating valid HTML 4. The vast majority of web authors can't seem to do either. But Simon does identify the interesting challenge in bullet-proofing "foreign" content -- user comments (and, in my case, trackbacks, syndicated RSS feeds and Technorati cosmos data).
Jacques Distler - 4th August 2003 01:05 - #
Well, I'm not anywhere near the stage of development of many people who work in this medium, and I just do sites to present material for groups with a point of view, but as a person who jumped on the bandwagon in the days of html 2 and came from a pre-press orientation, I appreciated html 3.2 and its expansion, but 4 just left me cold.
Now when I write in xhtml 1.0 strict, it forces me to think in planning documents structurally, rejecting presentation in my html, and I appreciate that, for all the repurposing capability.
CW Petersen - 5th August 2003 04:24 - #
"I appreciated html 3.2...but 4 just left me cold...xhtml 1.0 strict...forces me to think in planning documents structurally...and I appreciate that"
HTML 4 leaves you "cold" but XHTML 1.0 doesn't? It's the same exact thing except it's been reformulated into XML from SGML...I don't follow...
MikeyC - 5th August 2003 05:44 - #
XHTML 1.0 Strict removes several of the cruftier aspects of HTML4. I presume that enforces a certain mental discipline on Mr. Petersen which would otherwise be lacking. He could, of course, eschew those elements and write the same document in HTML4.
Indeed, there are a few sites I've seen which use XSLT to serve up either an XHTML 1.0 or an HTML4 version of the same page, depending on the User-Agent being used.
Of course, nifty technologies like XSLT are unavailable if you author your documents in HTML4. Which is why folks concerned with accessibility issues (for instance) are pushing XHTML.
Jacques Distler - 5th August 2003 07:40 - #
I believe it is more a case that some of the user-agents are lagging behind with regard to correctly processing XML and XHTML.
As a consequence, fewer authors are willing to use the eXtensible aspect of XHTML for fear of the page breaking in user Y's browser.
Robert Wellock - 5th August 2003 12:09 - #