
Accessify Forum - Discuss Website Accessibility


Emil Stenström: Why XHTML is a bad idea

Torsten wrote:
Cerbera wrote:
The additional strictness moves the burden of error correction to every author using markup on the planet, making the web impossible for people without an unhealthy amount of markup knowledge to participate.
I don't agree. The difference between HTML and XML compliant XHTML is not a huge barrier in my opinion. What is a major pain is content negotiation, especially when there are no discernible benefits to the individual producing the content. Blame Microsoft, not XHTML.
I've often had to help friends whose blogs have become scrambled because they missed an end tag, or copy-pasted too many end tags. Because their pages were being sent as text/html, their readers can at least still see the content; in an XML web all they'd get would be an error message.

XML would be a huge barrier for most people publishing on the web because well-formedness and other XML requirements are not intuitive concepts for Joe Public.
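To make the contrast concrete, here is a minimal sketch using only Python's standard library. The fragment, the `TextCollector` class and the two helper functions are made up for illustration. `html.parser` is a tolerant tokenizer rather than a browser-grade HTML parser, so it only approximates what a text/html user agent does, but it shows the same split: the lenient path still yields the text, while the XML path yields nothing except an error.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical blog fragment with a missing </em> end tag.
BROKEN = "<p>Hello <em>world</p>"


class TextCollector(HTMLParser):
    """Collects character data and does not care whether tags balance."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def lenient_text(markup):
    """Tolerant path: return whatever text can be found."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


def strict_text(markup):
    """XML path: return the text, or None if the markup is not well-formed."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return None
    return "".join(root.itertext())


print(lenient_text(BROKEN))  # the reader still sees the words
print(strict_text(BROKEN))   # the strict parser gives up entirely
```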

Torsten wrote:
Cerbera wrote:
It's much more practical for errors to be handled by a few well-funded browser manufacturers, especially since they already have sophisticated error handling.
Why not move this burden on to the authoring tools? That seems to me a much more logical approach and, I would've thought, a great deal easier.
A lot of pages are authored through a variety of homebrew systems, open-source news scripts and custom CMSs. Moving all error correction to each of the small teams and individuals for these projects wouldn't work. They don't have the time or expertise to develop the sophisticated error handling to sanitise erroneous user input with the 100% reliability an XML web would require.

In contrast, there are very few browser manufacturers and they already have the expertise of developing sophisticated error handling. They also have the time (and funding) to continue this. It makes sense to keep this most difficult aspect where the talent and resources for solving it already exist.

Better compliance from the major commercial authoring tools would be a good thing as well, especially those made by companies who also make browsers (such as Microsoft).

Torsten wrote:
Cerbera wrote:
Interestingly, it's the lack of standards compliance in CSS which causes almost all interoperability problems. It's rare to find a website break because of its markup, unless you use a 20th century mobile phone.
Could you point me in the direction of any supporting research, or is this purely anecdotal?
It's anecdotal, but widely reported throughout the web design community. You've probably experienced it yourself. The only significant HTML interoperability problem I know of is the lack of <abbr> support in IE, but the CSS interoperability problems between browsers are numerous (from what I've experienced and seen reported).

Torsten wrote:
I know next to nothing about HTML parsers, but it seems to me that you can't completely separate the application of style rules from parsing the underlying mark-up i.e. broken mark-up makes for broken CSS.
The two are tied together, you're right, but tied loosely. HTML browsers apply styling to an error-corrected version of the document. That softens the blow of erroneous markup, sometimes making its effect imperceivable.

Remember that tidy, browser-friendly markup can be written in HTML. It ain't got to be soup!

Torsten wrote:
It's always been my assumption that XHTML, in addition to allowing for greater efficiency and accuracy in parsing (with all the benefits that this entails), would also make the application of CSS less error prone. I'd welcome any corrections on that score from those in the know.
XML compliant XHTML documents will almost always be at least a little larger than the equivalent HTML due to the more verbose syntax. The time lost in data transfer is likely to be greater than that gained through the marginally faster parsing it allows, resulting in an overall performance loss. (Data transfer rates around the internals of a PC are very much faster than those across the web.)
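The size claim is easy to check for any given page. The sketch below is only an illustration: both fragments are invented, and the HTML version relies on the optional end tags for `<li>` and `<p>`. Because the sample is almost entirely markup, it overstates the difference a typical text-heavy page would show.

```python
# Hypothetical fragment written two ways; neither comes from a real page.
HTML_FRAGMENT = "<ul><li>One<li>Two<li>Three</ul><p>Line<br>break<hr>"
XHTML_FRAGMENT = (
    "<ul><li>One</li><li>Two</li><li>Three</li></ul>"
    "<p>Line<br />break</p><hr />"
)


def size_overhead(html, xhtml):
    """Return (html_bytes, xhtml_bytes, extra_bytes) for UTF-8 encoded markup."""
    html_bytes = len(html.encode("utf-8"))
    xhtml_bytes = len(xhtml.encode("utf-8"))
    return html_bytes, xhtml_bytes, xhtml_bytes - html_bytes


html_size, xhtml_size, extra = size_overhead(HTML_FRAGMENT, XHTML_FRAGMENT)
print(f"HTML: {html_size} bytes, XHTML: {xhtml_size} bytes, extra: {extra}")
```

Running the same function over a real page in both syntaxes would show how large the overhead is in practice.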

Last edited by Ben Millard on 08 Sep 2006 02:13 pm; edited 1 time in total
Cerbera wrote:
The only significant HTML interoperability problem I know of is the lack of <abbr> support in IE, but the CSS interoperability problems between browsers are numerous (from what I've experienced and seen reported).
I'd say lack of proper <button> support in IE is much more significant, and if we take parser issues into account there are lots of interop problems in text/html UAs (probably mostly because HTML never defined error handling).

Simon Pieters
Cerbera wrote:
I've often had to help friends whose blogs have become scrambled because they missed an end tag, or copy-pasted too many end tags. Because their pages were being sent as text/html, their readers can at least still see the content; in an XML web all they'd get would be an error message.


...telling them exactly what the problem is. So they have to add or remove one or two closing tags, is that really an enormous hurdle? I don't think so.

Cerbera wrote:
XML would be a huge barrier for most people publishing on the web because well-formedness and other XML requirements are not intuitive concepts for Joe Public.


I simply do not agree. To the contrary, having a strictly defined and simple set of rules (far simpler than those governing SGML and by extension HTML), should create a stronger conceptual model. There's nothing inherently complicated about the concept of 'well formed-ness'. It really couldn't be simpler.

Cerbera wrote:
A lot of pages are authored through a variety of homebrew systems, open-source news scripts and custom CMSs. Moving all error correction to each of the small teams and individuals for these projects wouldn't work. They don't have the time or expertise to develop the sophisticated error handling to sanitise erroneous user input with the 100% reliability an XML web would require.


You misunderstand me. I wasn't talking about error correction, I meant that authoring tools should create valid mark-up in the first place. If you're referring to authoring tools that allow authors to manipulate the mark-up itself, then the simple answer is for them not to accept malformed mark-up, and thereby force the authors to correct their errors. Again, I don't see this as a major hurdle for content authors. Anybody who's prepared to author their mark-up 'by hand' is, it seems to me, unlikely to have any difficulty grasping the requirements of XML.
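As a rough sketch of what "not accepting malformed mark-up" could look like in an authoring tool, the hypothetical `check_submission` function below refuses input that is not well-formed and reports where the parser stopped. It assumes the tool can run the submission through an XML parser (here Python's standard `xml.etree.ElementTree`); it checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def check_submission(markup):
    """Return None if the markup is well-formed XML.

    Otherwise return (line, column, message) describing where parsing failed,
    so the author can be pointed at the problem instead of having it
    silently corrected.
    """
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


# Hypothetical post body with a missing </em> on the second line.
post = "<div>\n<p>Hello <em>world</p>\n</div>"
problem = check_submission(post)
if problem is not None:
    line, column, message = problem
    print(f"Rejected: problem near line {line}: {message}")
```

A tool built this way never has to guess what the author meant; it only has to decide whether to save.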

Cerbera wrote:
The two are tied together, you're right, but tied loosely. HTML browsers apply styling to an error-corrected version of the document. That softens the blow of erroneous markup, sometimes making its effect imperceivable.


(my emphasis)

That's precisely my point. Two different browsers, by virtue of their independent error detection/correction heuristics, may ultimately be looking at two different documents. These differences may often be small, but they may also be significant. As an advocate of the importance of semantics, I'm surprised that you're happy with the idea of User Agents having to guess what a content author intended to convey. XHTML resolves these ambiguities.
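A toy example shows how independent recovery rules can produce different documents from the same input. The two policies below, `close_through` and `ignore_mismatch`, are invented for illustration and are not the algorithm of any real browser; the `ToyRecovery` class and `recover` function are equally hypothetical. Both are fed the same mis-nested fragment and build different trees.

```python
from html.parser import HTMLParser


class ToyRecovery(HTMLParser):
    """Builds a nested-list tree using one of two made-up recovery policies."""

    def __init__(self, policy):
        super().__init__()
        self.policy = policy  # "close_through" or "ignore_mismatch"
        self.root = ["#root"]
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = [tag]
        self.stack[-1].append(node)
        self.stack.append(node)

    def handle_data(self, data):
        self.stack[-1].append(data)

    def handle_endtag(self, tag):
        open_names = [node[0] for node in self.stack[1:]]
        if tag not in open_names:
            return  # stray end tag: both policies drop it
        if self.policy == "close_through":
            # Implicitly close everything opened inside the matching element.
            while self.stack.pop()[0] != tag:
                pass
        elif self.stack[-1][0] == tag:
            # ignore_mismatch: only an end tag for the innermost element counts.
            self.stack.pop()


def recover(markup, policy):
    parser = ToyRecovery(policy)
    parser.feed(markup)
    parser.close()
    return parser.root


MISNESTED = "<b>one<i>two</b>three</i>"
print(recover(MISNESTED, "close_through"))
print(recover(MISNESTED, "ignore_mismatch"))
```

Under the first policy "three" ends up outside both elements; under the second it ends up inside the `<i>`, so any style or script that depends on that position would see a different document.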

Cerbera wrote:
Remember that tidy, browser-friendly markup can be written in HTML. It ain't got to be soup!


If you're in favour of promoting 'well formed-ness' in HTML, why not go that little bit further and make it a requirement? To my knowledge nobody, myself included, is suggesting that such a requirement be enforced tomorrow. I think everybody recognises that it's likely to be a slow and gradual process, but that doesn't mean we shouldn't strive to attain it.

Cerbera wrote:
XML compliant XHTML documents will almost always be at least a little larger than the equivalent HTML due to the more verbose syntax. The time lost in data transfer is likely to be greater than that gained through the marginally faster parsing it allows, resulting in an overall performance loss. (Data transfer rates around the internals of a PC are very much faster than those across the web.)


I said “efficiency and accuracy”, not speed. We're not just talking about PCs here, but potentially a wide range of resource-limited devices. Implementing a complex HTML parser, with all the error detection and correction that it would require, may not even be possible on such devices.

There's simply no mileage in the 'larger file sizes' argument as far as I'm concerned. With one or two notable exceptions (tables for example), the difference amounts to nothing. Furthermore, the level of knowledge required to actually benefit from HTML's comparatively slack rules is surely a far greater cognitive challenge than that required by XML.

Turn the clock forwards - maybe 30 years - a pure XML web is not inconceivable as a long-term ambition.

But probably these details are irrelevant, and what will ultimately emerge will be based on something entirely new that we haven't thought of yet. But the idea that underlies it all is to try to create a semantic web - a web of structured, semantic content, rather than a chaotic mash-up of unstructured noise.

Maybe it will turn out that the W3C becomes irrelevant in this process - maybe microformats shows a way for us to do it ourselves by the back door ;)


All times are GMT
