Simon Willison’s Weblog

Show less errors

The W3C Validator team are seeking help with the latest version of their validator, dubbed the “Zeldman Made Us Do It!” release. They want people to play with the beta and submit suggestions for error messages that would make more sense to the average user. They also have a new feature called “fussy mode”, which acts a bit like a lint tool for checking code, highlighting problems that aren’t necessarily illegal markup but may not be best practice.

It’s great to see improvements to error messages being made (a classic example is the head-scratch-inducing “NET-enabling start-tag requires SHORTTAG YES”, which means you used <br /> or <img /> in a normal HTML document) but in my opinion the best thing the validator could possibly do is display less errors. Let’s take CNN.com as a classic example of an invalid page. Feed it through the new validator and you get a list of 206 errors that scrolls for pages and pages. The average web designer who isn’t clued up on standards is going to take one look at that list and give up on the spot: the site works in all the browsers they have tested, and fixing 206 errors is just going to be a waste of their time. I can distinctly remember thinking that exact thing the first time I tried the validator, and consequently ignoring it for well over a year afterwards.

Anyone who’s managed to fix up a page using the validator before will know that errors frequently cascade: one missing tag can cause a dozen or so related errors on the page, which all vanish when the initial missing tag is re-added. Further, a lot of errors boil down to exactly the same concept. If a designer has forgotten to escape the &s in the URLs on a page, it could add a hundred or so extra errors to the validation results. They only need to be told once. If the validator came back with a condensed list of 6 or 7 errors, each with a human-readable explanation and a note that the error occurred X times on the page, it would be far less likely to send people recoiling in horror from information overload. Such a condensed report would not need to be the only interface to the validator, although I would recommend it as the default interface simply because advanced users can work out where the “verbose” option is themselves; it’s the newbies who need a helping hand and a condensed, easily understood report.
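To make the idea concrete, here is a minimal sketch of how a condensed report could be built from a flat list of errors. It is only an illustration: the function names `condense` and `format_report`, the sample messages, and the assumption that errors arrive as (line number, message) pairs are all hypothetical and are not taken from the validator's own code.

```python
from collections import defaultdict


def condense(errors):
    """Group (line, message) pairs by message.

    Returns a list of (message, count, lines) tuples in order of each
    message's first appearance. The input shape is an assumption made
    for this sketch, not the validator's real output format.
    """
    grouped = defaultdict(list)
    for line, message in errors:
        grouped[message].append(line)
    # Dicts preserve insertion order, so messages come out in the
    # order they were first seen in the document.
    return [(message, len(lines), lines) for message, lines in grouped.items()]


def format_report(errors):
    """Render one summary row per distinct message."""
    rows = []
    for message, count, lines in condense(errors):
        times = "once" if count == 1 else f"{count} times"
        where = ", ".join(str(n) for n in lines)
        rows.append(f"{message} (occurred {times}; line(s) {where})")
    return "\n".join(rows)


# Hypothetical sample data: four raw errors, two distinct problems.
sample = [
    (12, "unescaped & in URL"),
    (40, "unescaped & in URL"),
    (41, "missing alt attribute on img"),
    (57, "unescaped & in URL"),
]

print(format_report(sample))
```

The line numbers are kept alongside each count, so a “verbose” view could still expand any summary row into its individual occurrences. Note that this only collapses identical messages; it does nothing about cascading errors, which would need knowledge of how one error triggers another.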

I submitted this suggestion to the validator mailing list a few days ago, but as I haven’t had any replies there I thought I’d throw it open to the blogging community to see what people think.

This is Show less errors by Simon Willison, posted on 2nd September 2003.

14 comments

  1. Yea, I think it's really great that they're working to improve the validator. I just hope that they follow through with it. Something I'd like to see is a friendlier design. The ugly tech look always turns me away.

    Adam Polselli - 2nd September 2003 03:44 - #

  2. That would be "show fewer errors", but we're above such linguistic pedantry, I hope. :) I don't agree with the idea that you should collapse distinct errors; if I've not escaped some ampersands then I want to know about all of them and where they are, not just that there are some, because I need knowledge of where they are to fix them. What we really need is to stop a cascade of errors, but since every compiler and validator I've ever seen does that I assume it's not a solvable problem...

    sil - 2nd September 2003 08:00 - #

  3. Totally with you there, Simon. I have exactly the same feeling about Bobby validation. In the team I work with, it's very difficult to get people to use validation services like the W3C's HTML validator and Bobby. When things are wrong it's a real double-whammy, and it usually leaves people cold ("heck, I'm never going to do that [validate] again"). Anything that can be done to simplify the error reporting by default must be a good thing.

    Ian Lloyd - 2nd September 2003 09:10 - #

  4. In fact the problem of cascading errors is surprising. Removing the erroneous text from the page being validated when an error is detected should solve the problem.

    The "show fewer errors" feature should have its own toggle and report the lines where the error occurred.

    Mathieu 'P01' HENRI - 2nd September 2003 09:13 - #

  5. Excellent suggestion Simon. It has my vote.

    Lars Holst - 2nd September 2003 13:07 - #

  6. Stuart, wouldn't it be better to collapse similar errors together (to avoid scaring people off) and then provide the option to expand a particular error to provide all the locations and details?

    GaryF - 2nd September 2003 13:45 - #

  7. Giving more user-friendly error messages in simple terms is a step in the right direction; I've lost count of the number of people who have asked me for help with interpreting those error messages...

    When I began teaching myself HTML 4.01 in early 1999 I probably found some of the warning results rather similar to technical jargon, although it wasn't long until one became used to the W3C Markup Validator terminology.

    Robert Wellock - 2nd September 2003 14:30 - #

  8. Stuart: I agree, the validator should never stop reporting every error. I just think it should default to only showing a summary. If you look at my mockup it provides a 'more details' link next to each error summary. Even better, the current validator behaviour should be available for advanced users, preferably through a separate URL so they can bookmark it and skip the new users mode.

    Simon Willison - 2nd September 2003 15:38 - #

  9. I agree with you Simon. Just the other day a friend told me: What's up? Don't you practice what you preach regarding standards? There's 41 errors on your page! It was only a missing / in a closing tag. On the other hand, I've started serving my pages as application/xhtml+xml so this kind of mistake will be immediately obvious :-)

    @sil: Couldn't we have a condensed list with links to the verbose listing? I think that would be the best solution.

    Ben de Groot - 2nd September 2003 16:26 - #

  10. I wonder if classifying the validation results, and organizing them by classification, might be helpful.

    As it is, results are organized by where the error falls in the document. This isn't a bad thing in a text editor, but it makes a lot less sense in a remote-validation environment.

    Obviously any classification we do is going to be problematic, but a broad brush to start: could we separate out character-encoding and entity problems from structural problems?

    Dorothea Salo - 2nd September 2003 19:36 - #

  11. I completely agree with your suggestion, Simon. I was reading this post very carefully because we (Czech bloggers) are discussing the same problem (multiple error messages for one mistake) here in the Czech Republic. What's more, we have found several cases of pages which are valid (as HTML 4.01 Transitional) in the official W3C HTML validator but not in the new beta version. Anyway, it would be nice to implement your suggestion, but I'm not sure that it could be done easily, because we are speaking about 'understanding' the meaning of the code.

    Lukas Oborsky - 2nd September 2003 20:29 - #

  12. I completely agree. I went to huge trouble to make sure my page validated (I was being very good!) and then I installed a news script that someone else had written. Oh dear. Next time I think I'll stick to my own, because it introduced absolutely loads of errors, mostly to do with the ampersands you mentioned, and I really couldn't be bothered to fix it. Next time I'll use a good news script :P *groans* *is a little bit scared by the people who post here* :) Hope the placement is going well.

    Laura - 2nd September 2003 22:54 - #

  13. Simon,

    Ah, right, I get it, from looking at the mockup. You'd have to have all the stuff in the page anyway, and just display:none it, because you couldn't have "all TD errors" on a separate page without session IDs, but I see your plan.

    One minor quibble, which is not your fault at all but a moan about the validator: it should not say "I found a load of unknown &foo entities, which are probably unescaped ampersands in URLs", it should bloody well look and see if they are in a URL and then tell you that. I hate it when the machine says "This is probably because of X" when it could look and work out for itself whether it's because of X and then give me a properly useful error report.

    The validator code is downloadable, but in Perl, so my plan to hack it and do all this has gone by the wayside a bit because my Perl's rubbish.

    sil - 3rd September 2003 02:37 - #

  14. The Feed Validator does something like this, although I'm not sure it's a completely analogous situation. Feeds are generally template-based and generally contain the same structure repeated across many items, so errors almost *never* occur alone. But anyway, if you have 10 dates and they are all in an invalid format, the Feed Validator will just say "element 'dc:date' is an invalid date (10 occurrences)". Each of the occurrences is highlighted in the source code below though.

    Again, not completely analogous because errors in feeds tend to be independent, not cascading. It's XML, so errors like unmatched tags or unescaped ampersands just throw a SAXError and we just give up altogether. I'm not sure I like the idea of collapsing cascading errors, but I'd certainly like to see a prototype so I could think about it some more.

    Mark - 5th September 2003 04:40 - #

Previously hosted at http://simon.incutio.com/archive/2003/09/02/showLessErrors
