W3C validator web service
28th October 2002
Earlier today I mentioned how useful a web service interface to the new W3C validator would be. Tom Gilder pointed out in the comments that the validator now has an XML interface:
http://validator.w3.org:8001/check?uri=http://simon.incutio.com/&output=xml
I had a play around and the XML interface works pretty well, although it still has a few quirks (hardly surprising for a beta product): whether or not the page is valid is passed back in an HTTP header (X-W3C-Validator-Status), and if the page is unreachable or forbidden the interface returns an XHTML document instead of XML. Still, it’s enough to play with, and as a demonstration of the flexibility of this new tool I’ve put together an XML-RPC proxy for the service:
Server: scripts.incutio.com
Port: 80
Path: /xmlrpc/validator/validate.php
Method: w3c.validate(url)
The web service accepts a URL and returns an XML-RPC struct containing the results of validation. The most important field of the struct is status, which will be set to Valid, Invalid or Failed depending on whether or not the page passed the test (Failed means the validator threw back an XHTML page rather than XML, and can generally be assumed to mean the page has failed validation for some reason). The other fields of the struct contain information returned by the validator, including an array of warnings and an array of error messages if any were returned. The layout of the struct is best understood by comparing it to the XML returned by the standard XML interface.
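Calling the service from a client should only take a few lines. Here is a rough Python sketch using the standard library's xmlrpc.client. The endpoint URL is simply assembled from the server, port and path listed above, and the "errors" and "warnings" key names are my guesses for illustration; only the status field is named in this post, so check the real struct before relying on them.

```python
import xmlrpc.client

# Endpoint assembled from the server, port and path given in the post.
ENDPOINT = "http://scripts.incutio.com:80/xmlrpc/validator/validate.php"


def validate(url, endpoint=ENDPOINT):
    """Call w3c.validate(url) and return the result struct as a dict."""
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.w3c.validate(url)


def describe(result):
    """Turn a result struct into a one-line summary."""
    # "status" is documented in the post: Valid, Invalid or Failed.
    status = result.get("status", "Unknown")
    # ASSUMPTION: "errors" and "warnings" are hypothetical key names.
    errors = result.get("errors", [])
    warnings = result.get("warnings", [])
    return f"{status}: {len(errors)} error(s), {len(warnings)} warning(s)"
```

Something like describe(validate("http://simon.incutio.com/")) would then give a quick summary, provided the service is still up.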
The source code for the web service is available in the following files:
- validate.php—the web service, implemented using IXR.
- SnoopyPlus.class.php—an extension of the Snoopy web client class
- classes.inc.php—various support classes, including the main XML parsing class
- W3cValidator.class.php—a class implementing the main logic of the web service (can be reused on its own)
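For anyone who doesn't want to read the PHP, the heart of the logic is deciding between Valid, Invalid and Failed. The following is a loose Python sketch of that decision, not a port of W3cValidator.class.php. It assumes the status header is spelled X-W3C-Validator-Status and that an XHTML fallback page can be spotted by an "<html" tag near the start of the body; both are assumptions on my part.

```python
# ASSUMPTION: header spelling; compared case-insensitively.
STATUS_HEADER = "x-w3c-validator-status"


def classify(headers, body):
    """Map a validator response to 'Valid', 'Invalid' or 'Failed'."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # ASSUMPTION: an XHTML page (unreachable or forbidden target)
    # shows an <html> tag early on, unlike the XML results document.
    if "<html" in body[:1024].lower():
        return "Failed"
    status = lowered.get(STATUS_HEADER, "").strip()
    if status in ("Valid", "Invalid"):
        return status
    # Anything unexpected is treated as a failure.
    return "Failed"
```

Treating every unexpected response as Failed keeps the client side simple: callers only ever see one of three values.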
I’ve also coded in a limit of 100 queries per IP address per hour, in the unlikely event that the service gets a large amount of traffic. I should stress that this is a beta web service built on top of a beta validator: it may stop working at any time, so it should not be considered suitable for production use. If you want to use it heavily, feel free to download the source and set it up on your own server, but remember that the W3C beta validator may well change its XML output, rendering the web service useless.
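If you do set up your own copy, a per-IP limit like the one mentioned above is easy to add. This is one possible approach in Python, an in-memory sliding window, and not necessarily how the PHP code does it; the class and method names are made up for this example.

```python
import time
from collections import defaultdict, deque


class HourlyLimiter:
    """Allow at most `limit` requests per IP in any `window` seconds."""

    def __init__(self, limit=100, window=3600.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

The state lives in memory, so it resets whenever the process restarts; a real deployment would probably want to keep the counters in a file or database.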