I love it! Thanks Simon!
I opened a ticket after the original discussion and even coded the doctype part here:
http://code.djangoproject.com/ticket/7281
But didn't have the genius to do the field tag.
Is there any reason to omit the /> except for some validator warnings?
People have been asking about this with WebHelpers, and the stock answer we've been giving is that for empty tags in HTML (like input) it's not considered an error (at least by the W3C validator, and as far as I know by anything that parses HTML). It's just a mild warning, and the simplicity of creating XHTML/HTML neutral tags keeps everything simpler.
Ian Bicking - 10th September 2008 01:55 - #
Django is the framework for perfectionists with deadlines, so while no one would die if the tags were self-closing, I think Simon's approach is a nice way of giving people the choice to do pure HTML versus arbitrarily mixing HTML and XHTML.
huxley - 10th September 2008 02:09 - #
Is this really an issue? Surely XHTML is the strict one where things _have_ to be XML and plain HTML is much more forgiving, so closed tags should be fine?
I always thought that writing XHTML while still serving it as HTML or transitional with text/html, so that things won't completely crash on the tiniest validation error, is a good middle ground for people who do care but aren't too pedantic about the tiniest warnings (read: perfectionists with deadlines).
Btw, I wonder what's happening with these things in HTML5?
I like the laws of interoperability. As I see it, applied here they would go something like: strive for XML when sending, but try to handle small errors when receiving.
Le Roux Bodenstein - 10th September 2008 11:04 - #
In HTML 5 the slash is optional (but only on the void elements; everywhere else it is an error). See the syntax section of the spec.
jgraham - 10th September 2008 11:15 - #
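A minimal sketch of the rule described in the comment above, for anyone who wants to check tag names in code. The void-element list is written from the current HTML standard as best I recall it (the 2008 draft's list differed slightly), so treat it as an assumption and check it against the syntax section. `VOID_ELEMENTS` and `trailing_slash_allowed` are hypothetical names, not part of any library.

```python
# Void elements per the HTML syntax section, as recalled; verify before relying on it.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})


def trailing_slash_allowed(tag_name: str) -> bool:
    """Return True if writing `<tag ... />` is permitted for this HTML element."""
    # HTML tag names are case-insensitive, so normalise before the lookup.
    return tag_name.lower() in VOID_ELEMENTS


print(trailing_slash_allowed("input"))
print(trailing_slash_allowed("div"))
```

This only covers HTML elements; embedded SVG or MathML content follows different rules and is out of scope for the sketch.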
"Is this really an issue?"
It is for me. Self closing tags in HTML make me unhappy. Like huxley said, "perfectionists with deadlines".
I also remain entirely unconvinced that XHTML is a good idea for the Web at large, hence my desire to serve valid HTML 4.01 instead.
We have put in place a beta version of Henri Sivonen's HTML 5 validator on the W3C Web site. That should help to check the output of your module.
"the stock answer we've been giving is that for empty tags in HTML (like input) it's not considered an error (at least by the W3C validator, and as far as I know by anything that parses HTML)."
You're wrong. It's an error in HTML 4.01 and below (HTML 5 may change this, but it is in draft state at the moment). The validator can't always catch it, because the error doesn't always result in an invalid document, it just means something very different to what you intend.
For example, if you have <img src="..." alt="..." />, in HTML, that is equivalent to <img src="..." alt="..." >/>. Since that isn't what you intend, the code is incorrect. But because character data following an IMG element is not invalid, a validator won't flag it as an error. Do the same with an element that can't have character data following it (e.g. <meta>), and the validator can catch that, because although you are making the same mistake, it is unambiguously wrong, a syntax error rather than a logical error.
Think of it as akin to unintentional assignment in an if statement. In languages like C, if (foo = 1) is almost certainly an error. But a compiler won't usually flag it as an error because it's valid syntax. It just does the wrong thing. The same is true of XHTML-style empty elements in HTML - it's (usually) valid syntax, it just means the wrong thing.
In summary: it's incorrect code. The validator giving you a warning instead of an error does not change this and isn't a sign you should ignore it or tell people it's not wrong.
Jim - 11th September 2008 05:06 - #
Jim: I don't think the validator (or any browser) is parsing it like you describe. The W3C validator gives a very specific warning, and does mention the problem you state (for HTML 4.01 Strict only), but says that it is not widely implemented that way. And certainly no browser interprets the markup like you describe.
So... maybe you are describing how a spec indicates this should be parsed, but you aren't speaking to how real and useful parsers actually work. (Which, I suppose, is one of the criticisms that led to HTML 5 -- the situation with the HTML 4 spec that is either buggy or misinformed or perhaps just misguided)
Ian Bicking - 15th September 2008 20:49 - #
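The disagreement above is about what a parser does with `/>` in practice. As one data point, here is a small sketch using Python's standard `html.parser`. It is neither a browser nor a strict SGML parser, so it says nothing about what the HTML 4.01 spec requires; it only shows how one lenient parser reports the markup. `Recorder` and `parse` are hypothetical helper names.

```python
from html.parser import HTMLParser


class Recorder(HTMLParser):
    """Record the start tags and character data the parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Also reached for `<img ... />`: the default handle_startendtag
        # calls handle_starttag and then handle_endtag.
        self.tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        # A stray ">" or "/>" after the tag would show up here.
        self.text.append(data)


def parse(markup):
    """Return (start tags, concatenated character data) for a markup string."""
    recorder = Recorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.tags, "".join(recorder.text)


xhtml_style = parse('<img src="a.png" alt="x" />')
html_style = parse('<img src="a.png" alt="x">')
print(xhtml_style)
print(html_style)
print(xhtml_style == html_style)
```

With this parser both spellings produce the same single `img` start tag and no character data, which matches the observation that common parsers do not apply the SGML reading.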
I just stumbled upon your project. I have a case which I think it's not going to solve, unless I'm mistaken.
I'm assembling my pages in smaller pieces. Instead of including all the stuff into one master piece, I render the smaller pieces (e.g. for the sidebar) one by one, and then insert the result into the page template.
This has turned out to make the code much simpler, since each template rendering only needs the context parameters for a small piece, instead of one huge render_to_response with parameters for, say, 3 completely different content boxes and 5 sidebar boxes.
But as far as I can tell, this is incompatible with your idea, unless I put the doctype tag into every template using a form.
I think it would be simpler with a global setting for this kind of stuff, something like HTML_OUTPUT="html4" or HTML_OUTPUT="xhtml".
Ole Laursen - 24th September 2008 14:20 - #
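A rough sketch of what the global-setting idea above could look like, kept free of Django so it runs on its own. `HTML_OUTPUT` is the name proposed in the comment; Django has no such setting as far as I know, and `render_input` is a hypothetical helper, not the project's API.

```python
from html import escape

# Stand-in for a project-wide setting; "html4" or "xhtml" (assumed values).
HTML_OUTPUT = "html4"


def render_input(name, value="", input_type="text", output=None):
    """Render an <input> tag, closing it according to the output flavour."""
    flavour = output or HTML_OUTPUT
    if flavour not in ("html4", "xhtml"):
        raise ValueError("unknown HTML_OUTPUT value: %r" % (flavour,))
    # XHTML needs the self-closing slash; HTML 4 output leaves it off.
    closing = " />" if flavour == "xhtml" else ">"
    return '<input type="%s" name="%s" value="%s"%s' % (
        escape(input_type),
        escape(name),
        escape(value),
        closing,
    )


print(render_input("q"))
print(render_input("q", output="xhtml"))
```

Because the flavour comes from one module-level value rather than from a tag in each template, separately rendered fragments such as sidebar boxes would all agree without repeating a doctype declaration.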
This makes me wonder whether there are any good XHTML/HTML w3c-type validation middlewares out there for Django. Simon, do you use anything? Does the django_debug toolbar offer that?
palewire - 5th October 2008 09:54 - #