Best Practices of Browser Tags and Information Structuring?

JAB Creations

11:06 pm on Aug 20, 2005 (gmt 0)

Dealing with just the most basic HTML tags, I'm curious what a well-structured HTML document would look like. I'm open to all suggestions!

I'm interested in seeing what other tags would be considered good structure, including what search engines would consider good structure too!

<?xml ...
<!DOCTYPE ...
<html>
<head>
<title></title>
</head>
<body>
<h1></h1>
<p></p>
<h2></h2>
<p></p>
</body>
</html>

jessejump

12:12 am on Aug 21, 2005 (gmt 0)

It looks like HTML was created for academic, military type documents.
What web page resembles an h1, p p p p h2, p p p p list h3 p p p p p blockquote p p p p h3.?
I think HTML is being used and content is being re-written to optimize for search engines.

tedster

12:30 am on Aug 21, 2005 (gmt 0)

What web page resembles...

Some of mine do, at least within the content div. The nav div is more like

And the header div may just be a set of images. I seldom have need for xml, so that's one place where my documents differ from JAB's outline above. 4.01 strict is all that most of my clients need.

...content is being re-written to optimize for search engines

That's even true for search WITHIN a company, not just the Yahoos and Googles. With so much information all around us, knowledge management has become extremely important -- and that includes being able to search for AND FIND the particulars that are important for you in the moment. There's nothing like relatively standardized structures to make your company's information (or your personal information) easy to dig up.

One of my clients has some important literature in the area of 100 to 200 years old. I can barely read those originals - it seems to me that even the flow of ideas was more chaotic back in those days.

[edited by: tedster at 12:32 am (utc) on Aug. 21, 2005]

encyclo

12:31 am on Aug 21, 2005 (gmt 0)

A well-structured HTML document is one which uses the appropriate semantically-rich elements to mark up the contents of the page in the most appropriate way. So header elements and paragraphs, certainly, but also ordered, unordered and definition lists, tables (for tabular data), links (the heart of what a web page is), blockquotes, citations, addresses...

The whole panoply of available elements in HTML, in particular those retained in the strict DTDs, should be used where necessary.
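
As a rough illustration of the kind of markup described above, here is a minimal sketch of a page body that uses several of those elements for what they mean rather than how they look. The page topic, URLs and text are all hypothetical placeholders.

```html
<h1>Widget Care Guide</h1>
<p>This guide covers the models listed in the
  <a href="/widgets/">widget catalogue</a>.</p>

<h2>What you need</h2>
<!-- Unordered list: the order of the items does not matter -->
<ul>
  <li>A dry cloth</li>
  <li>A small brush</li>
</ul>

<h2>Sizes</h2>
<!-- Table used for tabular data only, not for layout -->
<table summary="Widget models and their weights">
  <tr><th>Model</th><th>Weight</th></tr>
  <tr><td>Small</td><td>1 kg</td></tr>
  <tr><td>Large</td><td>3 kg</td></tr>
</table>

<h2>What others say</h2>
<!-- Quotation marked up as a quotation, with its source named -->
<blockquote cite="http://www.example.com/review">
  <p>The best widget I have owned.</p>
</blockquote>
<p><cite>Example Review Magazine</cite></p>

<!-- Contact details for the page author -->
<address>Questions? Write to
  <a href="mailto:info@example.com">info@example.com</a>.</address>
```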

Farix

12:38 am on Aug 21, 2005 (gmt 0)

Get rid of the <?xml> prolog because it will throw IE6 into quirks mode.
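
To illustrate the point: IE6 chooses its rendering mode from the doctype, and anything placed before the doctype (including an XML declaration) drops it into quirks mode. A minimal sketch of a page that starts with the doctype, here assuming HTML 4.01 Strict with placeholder title and content, might look like this:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<!-- The doctype is the very first thing in the file: no XML declaration above it -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Page title goes here</title>
</head>
<body>
<h1>Main heading</h1>
<p>First paragraph.</p>
</body>
</html>
```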

JAB Creations

2:00 am on Aug 21, 2005 (gmt 0)

I've decided to go over all of the HTML tags and see how best I can transition my work to include them as much as possible. Validation really only checks the structure of the markup, not its accuracy of representation. Back to HTML 101, something I skipped when I was "learning" back in the day with Frontpage 98. ;-)

PS I know about IE and XML declaration.

tedster

2:25 am on Aug 21, 2005 (gmt 0)

A great move, JAB. HTML is all about the document and its meaning - whereas wysiwyg editors like FrontPage, Dreamweaver, GoLive and so on, are all about how it looks.

I made a similar transition a while back and I find that the sites I've designed since then are much more effective. And to a large degree, that's because my emphasis has shifted from design to content. The "M" in HTML is the big deal - we START with a document -- that is, with content -- and then we "mark it up" for the web (that means for any number of potential user agents.)

I used to design print ads, so I was very focused on grabbing the eyeballs. That seemed especially important when you're buying 1/4 of a page that is otherwise filled with distractions. The thing I wasn't getting about the web is that once someone has your site on their monitor, then at least for the moment, you have no one competing for those eyeballs. So giving your visitors the content they came for, rather than some eye candy, makes a big difference in business success.

I eventually evolved a saying for myself: "slick ain't sticky". I even wrote a post about it in New To Web Development [webmasterworld.com], because I feel it is such a valuable paradigm shift. I really do wish I had started out thinking this way.

tedster

2:52 am on Aug 21, 2005 (gmt 0)

...including what search engines would consider good structure too!

One thing search engines thrive on is mark-up that is "well formed". Some people feel that this means mark-up that validates according to the W3C. Well, being valid is one sure way to guarantee that your mark-up is well-formed, but being well-formed can be a bit less rigorous than being valid (especially in HTML 4.01).

Search engines really don't care if your documents use some deprecated attribute, or even something that's proprietary, as long as the code itself is well formed and they can parse it without having to go into an error recovery routine that may or may not succeed. When error recovery fails, then a chunk of the content may well be skipped (been there, done that!)

So closing tags is extremely important - and doing this in the order they were opened is very helpful too. Spelling errors in tags are an awful mistake: <spam> or </spam> is an error I've made more than once, and it orphans the partner tag! It's a good practice, even in writing html, to use closing tags even where they are optional (li, p, td and so on).

Copy/paste errors that accidentally take out an angle bracket are much too easy to create, and this is exactly the kind of error I've made that caused search engines to miss a chunk of a page because the code isn't well formed.
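
The difference can be sketched with a small before-and-after fragment. The list text is made up; the mistakes in the first half are the ones described above (an optional closer left out, tags closed out of order, and a misspelled tag name).

```html
<!-- Fragile: </li> omitted, <b> and <i> closed in the wrong order,
     and a misspelled <spam> that orphans its closing </span> -->
<ul>
  <li><b><i>First item</b></i>
  <li>Second <spam>item</span>
</ul>

<!-- Sturdier: every element is closed explicitly,
     in the reverse of the order it was opened -->
<ul>
  <li><b><i>First item</i></b></li>
  <li>Second <span>item</span></li>
</ul>
```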

As encyclo said, any valid tag that accurately conveys the semantic values of the document is a good practice. This means that marking up menu links with <li> is a great idea. Divs, even nested divs, convey the manner in which parts of a document relate to each other semantically -- whereas table cells often have a way of splitting related content into relatively dissociated parts of the html.
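
A minimal sketch of that idea, assuming hypothetical id names and URLs: the menu is a list of links inside its own div, and the content sits in a separate div, so related material stays together in the source.

```html
<!-- Navigation: a menu is a list of links, so mark it up as one -->
<div id="nav">
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/products/">Products</a></li>
    <li><a href="/contact/">Contact</a></li>
  </ul>
</div>

<!-- Content: heading and paragraphs kept together in one block -->
<div id="content">
  <h1>Products</h1>
  <p>Introductory paragraph about the products.</p>
</div>
```

How the list is displayed (horizontal bar, vertical column, no bullets) can then be handled entirely in CSS.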

Lots more to say on this - but I hope I've given some sense of what I'm talking about.

willybfriendly

4:34 am on Aug 21, 2005 (gmt 0)

"slick ain't sticky"

Great saying, that is.

Have you found any effective ways to communicate this fact to the client? Drives me batty when they demand JS dropdown menus, Flash slide shows (and let's not forget the little "this page has been visited XXX times" counter), etc.

WBF

Chris_D

4:46 am on Aug 21, 2005 (gmt 0)

Have you found any effective ways to communicate this fact to the client?

Yep. Take them to Google, click on the 'cached' link, then click on the "Click here for the cached text only" link at the top of the page.

Then explain that, in basic terms, that's what Google sees - no flash, no javascript, no pretty pictures.

If they are still happy with their site after that - ask them how much budget they have for PPC......

tedster

5:30 am on Aug 21, 2005 (gmt 0)

Turning down the contract can work wonders sometimes. I will not work on a losing strategy - it's bad for my reputation!

To get back to JAB's original question, I'd like to talk about information structuring and the H1 tag. Although there's no absolute or technical prohibition on using it more than once, semantically I think that it's an extremely peculiar idea -- you're saying "this one document is about two different topics".

When I feel the need for two H1 tags, that's often a sign to me that maybe this content should be more than one page. I don't think search engines in general deal well with pages that are both long AND mixed topic. I don't think users deal with it very well either.

It might also mean I'm not seeing what the REAL h1 tag for the page should be -- and by wordsmithing my way to that I can make the whole page perform better.

collymellon

11:51 am on Aug 21, 2005 (gmt 0)

Very interesting words..

But using the H1 more than once looks like the road I may well take with my latest project: the home page has 3 columns, each column with a <H1> then content below.

This doesn't mean I have too much content or an overcrowded page; I simply want a heading on each of the 3 columns, e.g. Menu, Welcome, Latest news. What would you suggest is the best way to go about this?

Each one is the main title of the column, using H2 for column 2 wouldn't make sense.

surfin2u

12:22 pm on Aug 21, 2005 (gmt 0)

A description can be helpful to search engines and to other programs that try to organize web pages:

WW uses a doctype:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

G has a content-type:

jbinbpt

1:24 pm on Aug 21, 2005 (gmt 0)

On the <h1> question, this is an interesting thread [webmasterworld.com] on it. Msg#10 by trillianjedi answers the question. One <h1> only.
For my thinking.. <h2> is ideal for multiple column headings.
jb
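
For the three-column home page discussed above, that suggestion might be sketched like this. The site name, id names and text are hypothetical; the column headings are the ones collymellon listed.

```html
<!-- One h1 for the page as a whole -->
<h1>Acme Widgets</h1>

<!-- Each column gets an h2, since all three sit at the same level under the h1 -->
<div id="menu">
  <h2>Menu</h2>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/news/">News</a></li>
  </ul>
</div>

<div id="welcome">
  <h2>Welcome</h2>
  <p>Welcome text goes here.</p>
</div>

<div id="news">
  <h2>Latest news</h2>
  <p>Most recent news item goes here.</p>
</div>
```

The three divs can be floated or positioned into columns with CSS without touching the heading levels.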

claus

1:48 pm on Aug 21, 2005 (gmt 0)

The "M" in HTML is the big deal - we START with a document -- that is, with content -- and then we "mark it up" for the web (that means for any number of potential user agents.)

I used to design print ads, so I was very focused on grabbing the eyeballs. That seemed especially important when you're buying 1/4 of a page that is otherwise filled with distractions. The thing I wasn't getting about the web is that once someone has your site on their monitor, then at least for the moment, you have no one competing for those eyeballs. So giving your visitors the content they came for, rather than some eye candy, makes a big difference in business success.

Just thought that one should be repeated. Class post :) :) :)

H1 is the headline, or title of your page. Titles and headlines are not always the same, but they are mostly very close.

If your webpage was a book, the <title> would be outside on the cover and the <h1> would be the same thing repeated inside the book (perhaps including a subtitle). Then, for each major section of the book you would have a <h2>, and for each chapter a <h3>, and so on...

On the lowest level, each page would have a page number. That would be your <div> tags. But even within the page you could have different kinds of content. That's what the <p>, <span>, <table>, <img> tags and so on are for.
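
The book analogy can be sketched as a document skeleton. The book title, section names and id values here are invented purely for illustration.

```html
<html>
<head>
<!-- The cover of the book -->
<title>A Field Guide to Widgets</title>
</head>
<body>
<!-- The title repeated inside the book, here with a subtitle -->
<h1>A Field Guide to Widgets: Identification and Care</h1>

<!-- A major section of the book -->
<h2>Part One: Identification</h2>

<!-- A chapter within that section -->
<h3>Chapter 1: Common widgets</h3>

<!-- A "page" within the chapter, holding different kinds of content -->
<div id="page-1">
  <p>Most widgets are <span class="term">rectangular</span>.</p>
  <p>Round ones are less common.</p>
</div>
</body>
</html>
```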

--
It is often very helpful to think of each individual page, and the site as a whole, as being a book - for very large sites, think libraries instead (imho, fwiw, etc.)

[edited by: claus at 1:58 pm (utc) on Aug. 21, 2005]

JAB Creations

1:57 pm on Aug 21, 2005 (gmt 0)

Some very good points, Tedster. For a while I've been using header tags for the purpose of changing colors. To a limited extent they did at least have some relation to which words should carry more weight than others, but it was still a very poor use and one reason I think my site has sunk into the dark abyss. Using the header 1 tag only once sounds like a damn good idea that I'll be trying out on the next version of my site. Colly, I have enough pages where I think I could effectively use the header one tag once. I'll have to see how li tags could be used with menus, as I like to have them positioned in order to have various theme layouts like CSS Zen, though I'm not quite up to par I'd say.

Oh yes, I'm cutting off a client today btw Tedster, since you mentioned it. Lack of appreciation for the work and not paying me even half of what was due is a big no-no. I must admit their work was a complete turnaround from what I had set up about half a year ago on my own site. That is what's great about this place, half a year and everything in perspective can change (for the better of course). Just a pain in the ass to rework everything around it, but an improvement is always worth it I think.

I am now using meta tags and have been on a certain part of my site for a little while now and I've noticed Google has taken advantage of them.

jbinbpt may be right about using the header 2 tags for multiple column headings as menu items wouldn't really represent the main idea of a page.

I've gone over various tags and have found out the difference between ol and ul, as well as what the heck dl tags are. There are some other tags that I've noticed even before the need to really sit down and look over HTML in depth. All great advice by everyone...
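
For anyone else sorting out the three list types, here is a short sketch with made-up content: ol when the order matters, ul when it doesn't, and dl for term/description pairs.

```html
<!-- ol: ordered list, the sequence is part of the meaning -->
<ol>
  <li>Write the content.</li>
  <li>Mark it up.</li>
  <li>Style it with CSS.</li>
</ol>

<!-- ul: unordered list, items could appear in any order -->
<ul>
  <li>Headings</li>
  <li>Paragraphs</li>
  <li>Lists</li>
</ul>

<!-- dl: definition list, each dt (term) is followed by its dd (description) -->
<dl>
  <dt>ol</dt>
  <dd>An ordered list.</dd>
  <dt>ul</dt>
  <dd>An unordered list.</dd>
</dl>
```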

rjohara

2:02 am on Aug 22, 2005 (gmt 0)

It looks like HTML was created for academic, military type documents.
What web page resembles an h1, p p p p h2, p p p p list h3 p p p p p blockquote p p p p h3.?

Well, most of mine do. ;-)

Yes of course, HTML was invented as a way of marking up standard academic publications so they could be displayed on the Internet. What Berners-Lee did was specify a group of structural elements that you commonly find in academic papers, and he then created a stripped-down version of SGML (invented years before to handle the computerization of huge government documents, like airplane maintenance manuals) that most anyone could use. These basic elements are what have always been and still are used in most every academic journal. Over the years some of the inconsistencies (or constraints) in Berners-Lee's original element-set have come back to haunt us. There never should have been empty elements like <br> for example; the conceptually correct element would have been <line>, which may appear in some future version; but we all labor under the burden of history.

Anyone interested in getting a better handle on the foundations of HTML might want to spend some time studying the default stylesheet underlying most browsers [w3.org]. Every browser has its own stylesheet built in -- it's what you get when you don't apply any CSS at all to an HTML document. These default styles are largely inherited from the original Mosaic-generation browsers and give you a feeling for how HTML was originally conceived.
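
To give a feel for what such a default stylesheet contains, here is a short sketch written in the spirit of the sample stylesheet the W3C publishes with the CSS 2 specifications. The exact values are approximate and are an assumption for illustration; every browser's built-in rules differ in detail.

```html
<style type="text/css">
/* Headings: block-level, bold, sized by rank */
h1, h2, h3, p, blockquote, ul, ol { display: block; }
h1 { font-size: 2em;    margin: .67em 0; }
h2 { font-size: 1.5em;  margin: .75em 0; }
h3 { font-size: 1.17em; margin: .83em 0; }
h1, h2, h3, b, strong { font-weight: bolder; }

/* Emphasis and citations: italic by default */
i, cite, em, address { font-style: italic; }

/* Quotations: indented on both sides */
blockquote { margin-left: 40px; margin-right: 40px; }
</style>
```

Reading rules like these makes it clear that the familiar look of an h1 or a blockquote is only a default, separate from what the element means.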

httpwebwitch

4:03 am on Aug 22, 2005 (gmt 0)

As a designer-turned-web developer, I rebelled against the strictness of semantic markup at first. In my defense, for many years the only way to achieve interesting & complex layouts was to abuse the <table> tag.

I gradually began embracing CSS to get rid of all my <font> tags. Then I started using it to align backgrounds and list bullets. I found that the more I embraced CSS as a layout tool, the more my HTML started to look like the ideal semantic M in HTML.

I'd teach any new HTML developer to start with purely semantic XHTML and do everything in CSS from the start. Why learn bad habits that I personally had to unlearn because I fought in the browser wars?

To achieve certain visual effects, I will wrap my content in a few extra <div> tags, but that's a forgivable diversion, isn't it?

Shamefully, I still prefer <b> to <strong>

incrediBILL

5:06 am on Aug 22, 2005 (gmt 0)

A great move, JAB. HTML is all about the document and its meaning - whereas wysiwyg editors like FrontPage, Dreamweaver, GoLive and so on, are all about how it looks.

Total nonsense, as the wysiwyg editors are all about creating the content and not messing with HTML. With your analogy I'd craft a letter or a fax cover in Rich Text Format or Adobe Postscript directly instead of using MS Word or Word Perfect.

When people get past the naive idea that hand crafting HTML makes a real difference compared to using wysiwyg editors, there will be a lot more content and a lot less debate over how to create it. The only problem I've ever encountered that required a hand edit in HTML was tweaking incompatibilities in broken browsers. This is rarely the case these days unless you're getting too complex on the bleeding edge of HTML, and then again your visitors usually couldn't care less except in extreme cases where the technology either makes or breaks the site.

When I first got my hands on a computer in the 70s I coded programs directly in HEX, then Assembler, then C, C++, etc. and each step up the ladder I looked back in marvel at how I wasted my time doing it the hard way back in the day. True, the compilers added in a little more garbage here and there, but without that evolution software wouldn't be nearly as evolved as it is today. Now I sit and I marvel at why the people hand coding HTML are wasting time in the same way when what's really needed are bigger and better tools to crank out more content instead of spinning their wheels doing nerd work tweaking HTML in Notepad or some low level HTML tool.

For the most part nobody even cares, as your visitors couldn't tell how you built the page. They are only concerned about what's on the page: can I find it in the search engine, can I read it when I get to the site - everything beyond that is academic.

For anyone that claims it gets better SERPs I can show you many thousands of pages ranking exceptionally well using FrontPage and Dreamweaver and those people churn out content non-stop instead of worrying about an extra tag that nobody cares about.

It's like owning a bulldozer but insisting on using a shovel because of the 'technique' when at the end of the day the guy using a bulldozer virtually always wins against the shovel.

[edited by: incrediBILL at 5:10 am (utc) on Aug. 22, 2005]

JAB Creations

5:07 am on Aug 22, 2005 (gmt 0)

I am sure there are many comments that could be made between <b> and <strong> alone. Those are the types of inquiries I will most likely make in the near future as I go over all the XHTML tags, many for the first time.

The problem with learning (x)HTML for the first time is that it is the ultimate language for the internet, and therefore so many other technologies ultimately depend upon it to properly deliver the product. You have to consider CSS, JavaScript, SEO, and so many other aspects that all the tutorials I have seen just seem to ignore the complexity involved, which is a shame because I can say I have a true appreciation of good work (and I don't consider my work good work else I'd be answering a lot more threads ha!).

I first started with Frontpage 98, and though I abandoned it eventually I will admit it is how I started. By messing up my code it held me back, and eventually I started to learn HTML on my own.

tedster

5:40 am on Aug 22, 2005 (gmt 0)

<b> is a strictly presentational instruction - it means render the text in bold type. <strong> is semantic - it means give this word or phrase a strong meaning compared to "regular" text.

As I see it, the best semantic use of these tags would keep all the bold rendering instructions in CSS and out of the HTML - so all you would see in HTML would be <strong> tags. This practice cleanly separates rendering from meaning, and its widespread adoption would make the <b> tag a dinosaur. Of course, given all the legacy code on the web, the <b> tag must still be obeyed by browsers.
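
A minimal sketch of that separation, with hypothetical class names and text: <strong> carries the meaning in the HTML, and anything that should merely look bold gets its weight from CSS.

```html
<style type="text/css">
/* Rendering decisions live here, not in the markup */
strong        { font-weight: bold; }
.product-name { font-weight: bold; }  /* bold for looks only, no added emphasis */
</style>

<!-- Semantic: this phrase really is more important than the text around it -->
<p><strong>Back up your files</strong> before you run the installer.</p>

<!-- Presentational: bold styling without claiming strong emphasis -->
<p>The <span class="product-name">Acme Widget</span> ships next week.</p>
```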

Note that when you use the <strong> tag, it's the equivalent of having an aural browser raise its voice for the entire section -- although none of the current aural browsers can actually afford to do this today because tag use is so inconsistent.

A similar situation exists between <i> and <em>. I believe the intended execution in an aural browser is that <strong> is louder and <em> is raised in pitch. But as I mentioned earlier -- aural browsers cannot currently afford to render instructions this way.

In real-world practice, I also still like <b> tags for some situations. One simple letter for the tag - very efficient, even if it is a slightly non-purist usage.

encyclo

12:27 pm on Aug 22, 2005 (gmt 0)

One interesting little tool to test your sites with is the W3C's Semantic data extractor:

[w3.org...]

Put your URL into the system, and if it doesn't fail (the service tends to be a bit buggy at times) then you will be given an outline of the semantic structure of the document as seen by an XSLT Java Servlet and a copy of HTML Tidy.

The good thing about this service is that it is giving a purely machine-read interpretation of a web page rather than a visual, human one. Semantics is all about underlying structural meaning with the HTML presenting your content in the most appropriate and easy to understand way possible.

Have a look at the results for the w3.org home page [w3.org] - as you can see, it's not too bad, but not perfect either (the copyright section is outlined with "unknown titles").

On the <b> versus <strong> question, HTML has a lot of legacy baggage comprised of a large selection of non-semantic elements. Semantically speaking HTML is not very rich, so you usually have to make various compromises with your choice of markup. This excess baggage has at least the advantage of continuity, as backwards-compatibility ensures the longevity of published documents.

rjohara

6:12 pm on Aug 22, 2005 (gmt 0)

Encyclo is right - the semantic data extractor is a very valuable tool, both for learning and for proofreading, and should be more widely known. It's one of my favorites.

Here's a way to make the web a better place: the folks working on Firefox/Mozilla should create a simple button on the browser toolbar that says "Show me an outline of this page." When you click it, the page collapses to just titles, headers, and the first line of each paragraph. That would let you get a quick view of a page's contents. (You can actually set up your own user stylesheet that will do something close to this: just set all elements except <H> to display:none and see what happens when you apply the stylesheet.)
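
A rough sketch of such an outline stylesheet follows. It is an approximation rather than a literal "hide everything but headers" rule: display:none on a container also hides any headings inside it, so this version hides the common non-heading blocks instead and leaves divs alone. The element list is an assumption and would need adjusting per site.

```html
<style type="text/css">
/* Hide the usual content blocks so that mostly the headings remain.
   !important lets these rules win over the page's own styles
   when used as a user stylesheet. */
p, ul, ol, dl, table, blockquote, pre, form, address, hr, img {
  display: none !important;
}

/* Make sure the headings themselves stay visible */
h1, h2, h3, h4, h5, h6 {
  display: block !important;
}
</style>
```

The same rules, without the style tags, could be saved as a CSS file and loaded as a user stylesheet. Headings nested inside a hidden list or table will still disappear with this approach.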

Custodian

8:58 pm on Aug 22, 2005 (gmt 0)

This has been an interesting thread for me. I've not really paid too much attention to the semantic value of the HTML tags - didn't think search engines did either.

I remember from a couple years back that the thing to do was to ensure that you had lots of h1 and h2 tags for the search engines.

It was to the point that some webmasters suggested putting key phrases in h1 and h2 tags and then using CSS to format it however you want.

Just Curious - How much do you use CSS to alter the default appearance of header tags?

Custodian

collymellon

9:24 pm on Aug 22, 2005 (gmt 0)

How much do you use CSS to alter the default appearance of header tags?

As much as I need for that particular page.. the HTML will remain the same <H1>Title</H1> but the CSS will hold a lot of properties e.g H1{color:#000;font-size:16px;text-decoration:underline;} etc..
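
Spelled out, that approach might look like the sketch below. The property values are the ones quoted above; the extra margin rule is a hypothetical addition to show that spacing can be adjusted the same way.

```html
<style type="text/css">
/* All of the heading's appearance is set here */
h1 {
  color: #000;
  font-size: 16px;
  text-decoration: underline;
  margin: 0 0 10px 0;  /* hypothetical spacing adjustment */
}
</style>

<!-- The markup stays the same no matter how the heading is styled -->
<h1>Title</h1>
```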

Custodian

9:45 pm on Aug 22, 2005 (gmt 0)

After going back and reading a few of the older posts, I think I answered my own questions.

H1 tags can be formatted any way you want as CollyMellon has suggested.

The key seems to be - h1 tags are the main tags and should be used once per page and at the top, where it semantically makes the most sense.

This has been an eye-opening discussion, just not sure yet whether I'm gutsy enough to make content king and then mark it up.

Custodian

JayC

10:11 pm on Aug 22, 2005 (gmt 0)

>the W3C's Semantic data extractor:

> Put your URL into the system,

You can do that here:
[w3.org...]

JAB Creations

12:02 am on Aug 23, 2005 (gmt 0)

That Semantic data extractor ROCKS!

All pages I work on now have to pass the following validators...

1.) W3C Markup Validator
[validator.w3.org...]

2.) XACT Accessibilities Validator
[webxact.watchfire.com...]

3.) Semantic data extractor
[w3.org...]

The last of course is a manual validation done by the author and not by the site itself. Still I value these online tools the most.

The XACT validator really works better for HTML 4, as I REALLY do not like using adjective-like attributes in my markup and it complains about missing height and width attributes. It also makes a lot of obnoxious warnings EVEN IF you have a completely empty div as the only thing in your body... but I have yet to find any OTHER validator for accessibility.

rjohara

12:24 am on Aug 23, 2005 (gmt 0)

Don't forget:

[jigsaw.w3.org...]

Custodian

12:24 am on Aug 23, 2005 (gmt 0)

Cool!
Thanks JAB

This 63-message thread spans 3 pages.