Simon Willison’s Weblog

W3C validator web service

Earlier today I mentioned how useful a web service interface to the new W3C validator would be. Tom Gilder pointed out in the comments that the validator now has an XML interface:

http://validator.w3.org:8001/check?uri=http://simon.incutio.com/&output=xml

I had a play around and the XML interface works pretty well, although it still has a few quirks (hardly surprising for a beta product): the information on whether or not the page is valid is passed back in an HTTP header (X-W3C-Validator-Status), and if the page is unreachable or forbidden the interface returns an XHTML document instead of XML. Still, it’s enough to play with, and as a demonstration of the flexibility of this new tool I’ve put together an XML-RPC proxy for the service:

Server : scripts.incutio.com
Port   : 80
Path   : /xmlrpc/validator/validate.php
Method : w3c.validate(url)
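
As a rough illustration, calling the method from Python's standard xmlrpc.client module might look like the sketch below. The endpoint is simply assembled from the server, port and path listed above, assuming plain HTTP. The status field is described in the next paragraph; the "errors" and "warnings" key names are guesses for illustration only and would need checking against a real response. The service was a beta and may no longer respond.

```python
import xmlrpc.client

# Connection details as listed in the post; plain HTTP is an assumption.
SERVER = "scripts.incutio.com"
PORT = 80
PATH = "/xmlrpc/validator/validate.php"
ENDPOINT = f"http://{SERVER}:{PORT}{PATH}"


def validate(url, endpoint=ENDPOINT):
    """Call w3c.validate(url) on the proxy; the XML-RPC struct arrives as a dict."""
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.w3c.validate(url)


def summarize(result):
    """Turn the returned struct into a one-line summary."""
    status = result.get("status", "Failed")
    # "errors" and "warnings" are hypothetical key names, not confirmed by the post.
    errors = result.get("errors") or []
    warnings = result.get("warnings") or []
    return f"{status}: {len(errors)} error(s), {len(warnings)} warning(s)"
```

Something like summarize(validate("http://simon.incutio.com/")) would then give a quick verdict, wrapped in a try/except for xmlrpc.client.Error and OSError since the service may be unreachable.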

The web service accepts a URL and returns an XML-RPC struct containing the results of validation. The most important field of the struct is status, which will be set to Valid, Invalid or Failed depending on whether or not the page passed the test (Failed means the validator threw back an XHTML page rather than XML, and can generally be assumed to mean the page could not be validated for some reason, for example because it was unreachable). The other fields of the struct contain information returned by the validator, including an array of warnings and an array of error messages if any were returned. The layout of the struct is easiest to understand by comparing it with the XML returned by the standard XML interface.
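
To make the three status values concrete, here is a minimal sketch of how a proxy could derive them from the raw XML interface. It is not the actual proxy source. It assumes the header is named X-W3C-Validator-Status (the name the W3C validator uses), that its value is literally "Valid" or "Invalid", and that an XHTML fallback page can be recognised by "html" appearing in the Content-Type. The function names are made up for this example.

```python
import urllib.parse
import urllib.request

# Beta validator endpoint as given in the post; it may no longer exist.
VALIDATOR = "http://validator.w3.org:8001/check"
STATUS_HEADER = "X-W3C-Validator-Status"


def classify(status_header, content_type):
    """Map a validator response to 'Valid', 'Invalid' or 'Failed'."""
    # Assumption: an (X)HTML body means the validator could not check the page.
    if "html" in (content_type or "").lower():
        return "Failed"
    # Assumption: the header value is exactly 'Valid' or 'Invalid'.
    if status_header in ("Valid", "Invalid"):
        return status_header
    return "Failed"


def check(page_url, timeout=10):
    """Query the XML interface for page_url and classify the response."""
    query = urllib.parse.urlencode({"uri": page_url, "output": "xml"})
    try:
        with urllib.request.urlopen(f"{VALIDATOR}?{query}", timeout=timeout) as resp:
            return classify(
                resp.headers.get(STATUS_HEADER),
                resp.headers.get("Content-Type"),
            )
    except OSError:
        # Covers URLError, HTTPError and socket timeouts.
        return "Failed"
```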

The source code for the web service is available in the following files:

I’ve also coded in a limit of 100 queries per IP address per hour, in the unlikely event that the service gets a large amount of traffic. I should stress that this is a beta web service built on top of a beta validator: it may stop working at any time, so it should not be considered suitable for production use. If you want to use it heavily, feel free to download the source and set it up on your own server, but remember that the W3C beta validator may well change its XML output, rendering the web service useless.
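
For anyone setting up their own copy, the per-IP hourly limit could be approximated along these lines. This is an in-memory sliding-window sketch written for illustration, not the code the service actually uses; a real deployment would need to store the counts somewhere that survives between requests.

```python
import time
from collections import defaultdict, deque

LIMIT = 100        # queries allowed per IP per window, as stated in the post
WINDOW = 60 * 60   # one hour, in seconds

# IP address -> timestamps of that IP's recent queries (hypothetical store).
_history = defaultdict(deque)


def allow(ip, now=None):
    """Return True if this IP may make another query, recording it if so."""
    now = time.time() if now is None else now
    stamps = _history[ip]
    # Forget queries that have aged out of the one-hour window.
    while stamps and now - stamps[0] >= WINDOW:
        stamps.popleft()
    if len(stamps) >= LIMIT:
        return False
    stamps.append(now)
    return True
```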

This is W3C validator web service by Simon Willison, posted on 28th October 2002.

Previously hosted at http://simon.incutio.com/archive/2002/10/28/w3cValidatorWebService