Simon Willison’s Weblog

Validation on the fly

Douglas Bowman’s weblog is making very interesting reading at the moment. Douglas is responsible for Wired’s exciting new design, and since the launch he has been posting observations and lessons learnt from the new look. On Friday he described how fixing a problem with a design element took less than 60 seconds (thanks to global CSS files), but the post that caught my attention was this one:

However, daily editorial additions continue to allow XHTML validation errors to sneak into the Wired News markup. The most frequent culprits are the ampersands (&) which separate name/value pairs in URL query strings, or which commonly appear in our English language in company names like AT&T or slang acronyms like R&D.

[snip]

Somehow, we have to avoid the constant manual check of pages and retroactive fixes of existing errors. This method is unreliable and time consuming. I’m sure the engineers will be making modifications to our content insertion tool, so that validation errors like naked ampersands can be automatically detected and corrected as they’re entered.

I had exactly the same problem with this blog. My solution was to run every entry through PHP’s XML parser when it is added: if the parser reports an error, a warning message is displayed to encourage me to validate the page and re-check the entry. I imagine Wired’s content management system requires a slightly more elaborate solution than that, but for my small-scale needs it has been working a treat.
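For illustration, here is a minimal sketch of that kind of check. It is written in Python rather than PHP, with the standard library's expat bindings standing in for PHP's XML parser; the function name `check_entry` and the dummy `<entry>` wrapper are assumptions made for this example, not the code this blog actually runs.

```python
import xml.parsers.expat


def check_entry(fragment):
    """Return None if the fragment is well-formed XML, else a warning string.

    Hypothetical helper: the name and the wrapper element are illustrative.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # Wrap the entry in a dummy root element so that an entry made of
        # several sibling paragraphs still parses as a single document.
        parser.Parse("<entry>" + fragment + "</entry>", True)
    except xml.parsers.expat.ExpatError as err:
        return "Entry is not well-formed XML: %s" % err
    return None


# A naked ampersand is caught; the escaped form passes.
print(check_entry("<p>Notes from AT&T on R&D</p>"))
print(check_entry("<p>Notes from AT&amp;T on R&amp;D</p>"))
```

A check like this only tests well-formedness, so it catches naked ampersands and unclosed tags but not markup that is well-formed yet invalid XHTML, which is why the warning still points towards a full validator.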

This is Validation on the fly by Simon Willison, posted on 21st October 2002.

Previously hosted at http://simon.incutio.com/archive/2002/10/21/validationOnTheFly