July 2005
Recently, this site went through a validation iteration of the documents hosted here. Validation is done periodically (as in once or twice a year) to make sure as many documents as possible conform to a W3C specification.
Many writers who use the online medium have, in the past at least, completely and totally (brutally) abused how markup works versus how different rendering engines handle it. The advent (unmitigated disaster?) of inline formatting was the first step in the wrong direction of using markup languages for a lot of reasons - but the number one reason that using inline formatting can be a problem is quite simple:
It breaks scalability
It really does not matter who you are: once you read that phrase, you know you are in for a world of hurt.
Not only do Cascading Style Sheets give the author an immense amount
of power over formatting controls, but they make scalability
simpler. In a programmer's terms, they are the header file that goes
everywhere
(wouldn't we all love that? :)
Once CSS is incorporated and working, all inline formatting has to be
removed, and that means all of it, including body tag
data etc. Luckily, the site used CSS and was HTML 4 compliant already.
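Hunting down leftover inline formatting by hand gets tedious, so a small script can help. What follows is only a minimal sketch using Python's standard html.parser module; the attribute and element lists, and the names InlineFormattingFinder and find_inline_formatting, are assumptions made for illustration and are not what was actually used on this site.

```python
from html.parser import HTMLParser

# Assumed lists of inline formatting to flag; extend them to suit your pages.
PRESENTATIONAL_ATTRS = {"bgcolor", "text", "link", "vlink", "alink", "align", "style"}
PRESENTATIONAL_TAGS = {"font", "center", "u"}


class InlineFormattingFinder(HTMLParser):
    """Collects (line, tag, detail) for each piece of inline formatting seen."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        line, _column = self.getpos()
        if tag in PRESENTATIONAL_TAGS:
            self.findings.append((line, tag, "presentational element"))
        for name, _value in attrs:
            if name in PRESENTATIONAL_ATTRS:
                self.findings.append((line, tag, f"attribute {name}"))


def find_inline_formatting(markup):
    """Return a list of findings for the given markup string."""
    finder = InlineFormattingFinder()
    finder.feed(markup)
    finder.close()
    return finder.findings


if __name__ == "__main__":
    sample = '<body bgcolor="#ffffff">\n<p><font size="2">hi</font></p>\n</body>'
    for line, tag, detail in find_inline_formatting(sample):
        print(f"line {line}: <{tag}> {detail}")
```

Anything the script reports is a candidate for moving into the stylesheet; it does not change the file itself.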
The first step that was taken here was to remove known unfriendly XHTML
objects - tables for starters. The site has few tables - for a reason -
they are friendly neither to browsers nor to validators.
The next one was to remove any of the old hr tags. Not only
are they not needed at the site (scan the page, you will see why), but
in their old unclosed form they are invalid - even in transitional DTDs.
Next came the real stuff that, well, just was not foreseen. Following is a quick list of violations found:
<br /> not <br>
table, image and meta tags with /> not just >
img tag better have an alt field.
So what does all of that mean? Basically you have to have closing
tags or - as in the case of br or img tags -
you have to either have a closing tag per the XML standard or the
XHTML /> which indicates that it is a short tag.
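The short tag rule above is mechanical enough to script. The sketch below is a hedged illustration in Python: the VOID_TAGS tuple and the close_void_tags name are assumptions, and the regular expression is deliberately simple, so it will trip over attribute values that contain a > character.

```python
import re

# Elements assumed to be "short tags" that never wrap content.
VOID_TAGS = ("br", "hr", "img", "input", "meta", "link")

# Matches <br>, <br/>, <br /> and the same forms with attributes.
_VOID_RE = re.compile(
    r"<(" + "|".join(VOID_TAGS) + r")\b([^<>]*?)\s*/?>",
    re.IGNORECASE,
)


def close_void_tags(markup):
    """Rewrite every short tag in XHTML form: lowercase name, ending in ' />'."""

    def fix(match):
        name = match.group(1).lower()
        attrs = match.group(2).rstrip()
        return f"<{name}{attrs} />"

    return _VOID_RE.sub(fix, markup)


if __name__ == "__main__":
    print(close_void_tags('line one<br>line two<IMG src="a.png" alt="x">'))
```

Running it twice gives the same result as running it once, so it is safe to apply to pages that are already partly converted.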
Strict can be achieved; however, that was not the goal, since there is a certain amount of breakage involved. For starters, XML strict will not allow the use of forms, only XForms, a technology that has not yet matured. Actually, it has been the experience of the author that using transitional is the only way to keep sanity unless one is dealing with a very limited number of raw data types.
The site has problems with formatting. The best example is the
overuse and abuse of the br tag. That, in and of itself, is
likely the worst offender. It also - still - uses tables (something
frowned upon for layout, though the strict DTD itself still permits
them), not to mention some hackneyed spacing
such as the old classic
<p> </p> - there are others to
be sure, but like the text says, go not gently... which is
probably your best bet too. In addition to all of that, a lot
of bold, italic, quotes and other holdovers (which will likely
be supported forever) are in these docs as well.
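To get a feel for how bad the br and empty paragraph habit is on a given page, the offenders can simply be counted. This is a rough sketch with Python's html.parser; the SpacingHackCounter class and the "spacer_p" label are made up for the example, and "empty" is assumed to mean a paragraph holding nothing but whitespace or a non-breaking space.

```python
from collections import Counter
from html.parser import HTMLParser


class SpacingHackCounter(HTMLParser):
    """Counts br tags and paragraphs that exist only to add vertical space."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # &nbsp; arrives as "\xa0"
        self.counts = Counter()
        self._in_p = False
        self._p_text = []
        self._p_children = 0

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.counts["br"] += 1
        if tag == "p":
            self._in_p = True
            self._p_text = []
            self._p_children = 0
        elif self._in_p:
            self._p_children += 1

    def handle_data(self, data):
        if self._in_p:
            self._p_text.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            text = "".join(self._p_text).replace("\xa0", " ")
            if self._p_children == 0 and text.strip() == "":
                self.counts["spacer_p"] += 1
            self._in_p = False


def count_spacing_hacks(markup):
    """Return a Counter with 'br' and 'spacer_p' totals for the markup."""
    counter = SpacingHackCounter()
    counter.feed(markup)
    counter.close()
    return counter.counts


if __name__ == "__main__":
    page = "<p>text<br><br /></p>\n<p>&nbsp;</p>\n<p> </p><p>real</p>"
    print(dict(count_spacing_hacks(page)))
```

The counts say nothing about validity; they only show where margins in the stylesheet could replace markup used as spacing.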
To say that the site's author was pushed to the edge of his patience
is like saying Darth Vader occasionally lost his temper. The first
thing to be done is to start small: find the smallest corner of a site
and change the DTD declaration and html tag to the following:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
Next, redo all short tags such as (but not limited to) the following:
br img hr input meta
Close all other tags, each and every one - no exceptions.
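Before pasting a page into the validator, a quick local check can catch tags that were never closed. The sketch below leans on Python's standard xml.parsers.expat module; the function name first_wellformedness_error is an assumption for the example. It only tests XML well-formedness (every tag closed and properly nested), not whether the page is valid against the XHTML DTD, so it is no substitute for the validator.

```python
import xml.parsers.expat


def first_wellformedness_error(markup):
    """Return (line, column, message) for the first XML error, or None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(markup, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (err.lineno, err.offset, message)
    return None


if __name__ == "__main__":
    page = "<html>\n<body>\n<p>never closed\n</body>\n</html>"
    problem = first_wellformedness_error(page)
    if problem is None:
        print("well formed")
    else:
        line, column, message = problem
        print(f"line {line}, column {column}: {message}")
```

The parser stops at the first problem, so fix it and run the check again until nothing is reported.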
Finally, pay a visit here: the W3C HTML Validator. Yes, copy and paste is your friend (thanks to X mouse properties...). If there is something in the validator output that does not make sense, make sure to try out the advanced options (these should be selectable in the menu above the results if a page fails) such as showing the output. The validator's line numbers and your file's line numbers will not match, so using their output generator will go a long way in tracking down the problem area.
The best, saved for last. Why bother? What's the use? No one else does.
Yes they do. Actually, a great many web authors pride themselves
on their validation - although being validated
is not, in and of itself,
the makings of a good author. This site has neither the XHTML 1
nor the CSS images or links to go with it [1].
There are several good reasons to get validated other than saying
you are.
Believe it or not, a validated site is compatible with most web browsers - as shocking as that may sound. If CSS is being employed to its fullest, then the site - generally - looks the same or at least the layout is comparable. That is not to say that it is not a good idea to break CSS every now and again to see how the site renders. A real cheap way to break CSS is to use Firefox or Mozilla and disable stylesheets.
Clients that are not visually based, such as aural ones, tend to use standards to the hilt. By using standards you give the disabled an edge when they come to your site - yes actually there is more you can do such as aural stylesheets - but the least amount of work to make sure that your site works for the disabled is to comply with the lowest contemporary standard of the time.
Earlier in the article it was mentioned that the site still has a lot of problems, even within the scope of XHTML 1 Transitional. Those faults most likely never would have bubbled to the surface unless a validation check had been run. In addition, several syntax errors were discovered.
Validation of anything, whether it be POSIX code or environmental
sensors, is an essential part of the computing profession, and of many
others. Making web pages
validate with XHTML is a step
in the right direction for any online author. It makes the material
scalable, importable and, frankly, most useful.