Simon Willison’s Weblog

Fixed validation again

The road to validity is fraught with peril. I’ve just fixed another small group of errors that were preventing this page from validating (after spotting the ominous W3C validator in today’s user-agent logs). This time it was a couple of forgotten </p> tags and an unescaped ampersand.

There has to be a technological way of helping avoid these errors. Originally I wanted to be able to edit my entries in some kind of specialised markup language (such as WikiText or UBBCode) that the blogging software could convert into valid XHTML, but I quickly realised that the most flexible markup language for blog entries is XHTML itself, thanks to the built-in support for everything from quotes to lists and code samples.

Thinking about it, almost all of the common errors I am experiencing are XML well-formedness errors (the kind an XML parser catches) rather than violations of the rules governing XHTML. I need an XML parser that examines each post as (or before) it is added to the blog and checks for well-formedness. Expat (used in PHP for event-based XML parsing) does not validate documents against a DTD but it DOES die with an error if an XML document is malformed. It looks like it could be just what I need.
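
As a rough sketch of that idea, here is what such a check might look like. It uses Python's standard-library binding to Expat rather than the PHP one, purely for illustration. The function name check_well_formed and the <entry> wrapper element are assumptions made for this example, not part of any blogging software.

```python
import xml.parsers.expat


def check_well_formed(fragment):
    """Return None if the fragment is well-formed XML, else an error message.

    Hypothetical helper: the name and the <entry> wrapper are illustrative.
    """
    parser = xml.parsers.expat.ParserCreate()
    # A blog entry is usually several sibling elements, so wrap it in a
    # single root element to make it a complete XML document. The wrapper
    # sits on the same line as the start of the entry, so reported line
    # numbers still match the entry text.
    wrapped = "<entry>" + fragment + "</entry>"
    try:
        parser.Parse(wrapped, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return "line %d: %s" % (err.lineno, message)
    return None
```

The blogging software could call this before saving an entry and refuse to publish when it returns a message. Both of today's mistakes (a forgotten </p> and an unescaped ampersand) are the kind of thing it should catch. One caveat: with no DTD loaded, named entities such as &nbsp; are not declared, so a check like this would likely reject them too.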

The ideal alternative would be for the W3C to create a web service back end for their validator so blogging software can check the validity of new entries automatically.

This is Fixed validation again by Simon Willison, posted on 16th June 2002.

Previously hosted at http://simon.incutio.com/archive/2002/06/16/fixedValidationAgain