HTML: Include, or exclude, optional closing tags?

I am adding some links here to help you with the history of HTML, for you to understand the various contradictions. This is not the answer to your question, but you will know more after reading these various digests.

  • How Did We Get Here? – Dive Into HTML5
  • The History of the Web
  • Brief History of HTML
  • HTML’s History – HTML WG Wiki

Some excerpts from Dive Into HTML5:

[T]he fact that “broken” HTML markup still worked in web browsers led authors to create broken HTML pages. A lot of broken pages. By some estimates, over 99% of HTML pages on the web today have at least one error in them. But because these errors don’t cause browsers to display visible error messages, nobody ever fixes them.

The W3C saw this as a fundamental problem with the web, and they set out to correct it. XML, published in 1997, broke from the tradition of forgiving clients and mandated that all programs that consumed XML must treat so-called “well-formedness” errors as fatal. This concept of failing on the first error became known as “draconian error handling,” after the Greek leader Draco who instituted the death penalty for relatively minor infractions of his laws. When the W3C reformulated HTML as an XML vocabulary, they mandated that all documents served with the new application/xhtml+xml MIME type would be subject to draconian error handling. If there was even a single well-formedness error in your XHTML page […] web browsers would have no choice but to stop processing and display an error message to the end user.

This idea was not universally popular. With an estimated error rate of 99% on existing pages, the ever-present possibility of displaying errors to the end user, and the dearth of new features in XHTML 1.0 and 1.1 to justify the cost, web authors basically ignored application/xhtml+xml. But that doesn’t mean they ignored XHTML altogether. Oh, most definitely not. Appendix C of the XHTML 1.0 specification gave the web authors of the world a loophole: “Use something that looks kind of like XHTML syntax, but keep serving it with the text/html MIME type.” And that’s exactly what thousands of web developers did: they “upgraded” to XHTML syntax but kept serving it with a text/html MIME type.

Even today, millions of web pages claim to be XHTML. They start with the XHTML doctype on the first line, use lowercase tag names, use quotes around attribute values, and add a trailing slash after empty elements like <br /> and <hr />. But only a tiny fraction of these pages are served with the application/xhtml+xml MIME type that would trigger XML’s draconian error handling. Any page served with a MIME type of text/html — regardless of doctype, syntax, or coding style — will be parsed using a “forgiving” HTML parser, silently ignoring any markup errors, and never alerting end users (or anyone else) even if the pages are technically broken.

XHTML 1.0 included this loophole, but XHTML 1.1 closed it, and the never-finalized XHTML 2.0 continued the tradition of requiring draconian error handling. And that’s why there are billions of pages that claim to be XHTML 1.0, and only a handful that claim to be XHTML 1.1 (or XHTML 2.0). So are you really using XHTML? Check your MIME type. (Actually, if you don’t know what MIME type you’re using, I can pretty much guarantee that you’re still using text/html.) Unless you’re serving your pages with a MIME type of application/xhtml+xml, your so-called “XHTML” is XML in name only.

[T]he people who had proposed evolving HTML and HTML forms were faced with two choices: give up, or continue their work outside of the W3C. They chose the latter, registered the whatwg.org domain, and in June 2004, the WHAT Working Group was born.

[T]he WHAT working group was quietly working on a few other things, too. One of them was a specification, initially dubbed Web Forms 2.0, which added new types of controls to HTML forms. (You’ll learn more about web forms in A Form of Madness.) Another was a draft specification called “Web Applications 1.0,” which included major new features like a direct-mode drawing canvas and native support for audio and video without plugins.

In October 2009, the W3C shut down the XHTML 2 Working Group and issued this statement to explain their decision:

When W3C announced the HTML and XHTML 2 Working Groups in March 2007, we indicated that we would continue to monitor the market for XHTML 2. W3C recognizes the importance of a clear signal to the community about the future of HTML.

While we recognize the value of the XHTML 2 Working Group’s contributions over the years, after discussion with the participants, W3C management has decided to allow the Working Group’s charter to expire at the end of 2009 and not to renew it.

The ones that win are the ones that ship.


The optional ones are all ones that should be semantically clear where they end, without needing the end tag. E.G. each <li> implies a </li> if there isn't one right before it.

The forbidden end tags all would be immediately followed by their end tag so it would be kind of redundant to have to type <img src="blah" alt="blah"></img> every time.

I almost always use the optional tags (unless I have a very good reason not to) because it lends to more readable and updateable code.


The reason i ask is because you just know it's optional for compatibility reasons; and they would have made them (mandatory | forbidden) if they could have.

That's an interesting inference. My reading of it is that just about any time a tag could be reliably inferred, the tag is optional. The design suggests that the intention was to make it quick and easy to write.

What did HTML 1, 2, 3 do with regards to these, now optional, closing tags.

The DTD for HTML 2 is embedded in the RFC which, along with the original HTML DTD, has optional start and end tags all over the place.

HTML 3 was abandoned (thanks to the browser wars) and replaced with HTML 3.2 (which was designed to describe the then current state of the web).

What does HTML 5 do?

HTML 5 was geared towards "paving the cowpaths" from the outset.

And what should i do?

Ah, now that is subjective and argumentative :)

Some people think that explicit tags are better for readability and maintainability by virtue of being in front of the readers eyes.

Some people think that inferred tags are better for readability and maintainability by virtue of not cluttering up the editor.


There are cases where explicit tags help, but sometimes it's needless pedantry.

Note that the HTML spec clearly specifies when it's valid to omit tags, so it's not always an error.

For example you never need </body></html>. Nobody ever remembers to put <tbody> explicitly (to the point that XHTML made exceptions for it).

You don't need </head><body> unless you have DOM-manipulating scripts that actually search <head> (then it's better to close it explicitly, because rules for implied end of <head> could surprise you).

Nested lists are actually better off without </li>, because then it's harder to create erroneous ul > ul tree.

Valid:

<ul>
  <li>item
  <ul>
    <li>item
  </ul>
</ul>

Invalid:

<ul>
  <li>item</li>
  <ul>
    <li>item</li>
  </ul>
</ul>

And keep in mind that end tags are implied whether you try to close all elements or not. Putting end tags won't automatically make parsing more robust:

<p>foo <p>bar</p> baz</p>

will parse as:

<p>foo</p><p>bar</p> baz

It can only help when you validate documents.

Tags:

Html

Html4