Not sure what Google's story is in this regard, but here we go...
In February 2003 I completely refurbished my site using CSS and made it XHTML 1.0 compliant. The pages validate against the W3C validator and other tools. Pages look good in most current browsers. So far so good.
Did a search on Google. Site is listed but full of stuff that bears no relation to what my site is about.
Did a check using Sim Spider. Guess what? No meta description and no meta keywords. Fiddled around and removed the closing forward slash. Retested and hey, there's the description and all the keywords.
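For anyone following along, the fiddle comes down to the difference between the two forms of the same tag shown here. This is only a sketch: the content values are placeholders, not the ones from the actual site.

```html
<!-- XHTML 1.0 form: empty element closed with " />"
     (the form Sim Spider reported as missing) -->
<meta name="description" content="Placeholder description of the page" />
<meta name="keywords" content="placeholder, keywords" />

<!-- HTML 4.01 form: no trailing slash
     (the form Sim Spider picked up after the fiddle) -->
<meta name="description" content="Placeholder description of the page">
<meta name="keywords" content="placeholder, keywords">
```

The second form is not well-formed XML, so a page using it no longer validates as XHTML 1.0.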
So, we seem to be drawing the same conclusion as many others here have: Google cannot read XHTML-compliant web pages.
There's more. As mentioned, completely refurbished the site, moved stuff around, added more pages, etc. 4 months down the road, the previous site's description is still being used - and yes, we have the June 15 date stamp. No big deal there.
However, what's worse is that all the content in the search results now relates to items in various CSS-styled elements: section titles, navigation markers and copyright notices. Click on "similar pages" and what do I get there? My notice to users of non-compliant and early-generation browsers.
In fact this notice appears in the other search engine results as well - being the first bit of readable copy on each web page. Not visible in compliant browsers but is there for others.
To cap this all, ALLTHEWEB has something similar but at least they have the CORRECT DESCRIPTION of each of the pages listed - this obtained from the meta tag description. Go figure.
All this may sound like a whinge. However, I now seem to be stuck between backpedalling (as in going back to HTML 4.01), corrupting my XHTML compliance, or coming up with some sort of "exotic" fiddle to try and fix up this mess. The issue here isn't PR as such but rather an accurate representation of what I have out there on the Internet.
While there are, and have been, some inconsistencies regarding spiders parsing <meta name="description" content="some description" />, Google is showing improvement. ATW is spot on, MSN too...
Side note: MSN seems to be on the move. I'm noticing a lot more traffic AND quick indexing. Anyone else?
HTML4.0: meta: Start tag: required, End tag: forbidden.
XHTML2.0: <meta name="description">How the proposed XHTML2.0 will present meta data</meta>
XHTML Compatibility Guidelines:
C.2. Empty Elements
Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the minimized tag syntax for empty elements, e.g. <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing user agents.
Technically, isn't the <meta /> element an empty element? It looks like the proposed XHTML2 will change that and encourage proper closing tags for meta elements. Though, it looks like shorthand will still be allowed. But.. that's all a long way off! IE, ya know...
Couldn't you just move your NS4 degrade message further down the page?
graham - therein we have a problem. As previously mentioned, the site is visual intensive. There is no copy on the index and main pages except an image or images. Further down the site there are a couple of pages with copy but these aren't significant. Move the notice further down the page and all we'll get are the other bits of irrelevant text.
As for doing some "browser sniffing": from the logs, the overwhelming majority of browsers is IE 6. 8% of the rest are 5th generation and older. Could scrap these backbending efforts and take the gap, leaving behind the stragglers and the handicapped, I suppose.
I'm guessing here but I reckon he has his "degrading messaging" (LOL) in a named div...
You're right. This is what I have:
<p class="ahem">This message only appears if style sheets...snipped.</p>
On the CSS page I have: .ahem {display: none;}
All this managed through this tag:
<style type="text/css" media="screen">@import "xxx/text.css";</style>
On each page there is a basic NS4-compliant style sheet.
And, as you may have guessed, what does Google have in their cached link? The degraded page. Great.
media="screen"? Yes, I have one style sheet for "screen" and another for "print". With the print style sheet, background and all other extraneous stuff like navigation has been stripped out, giving the user a clean page to print.
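A rough sketch of how the head of such a page might be laid out. Only the screen @import line comes from the post above; the basic.css and print.css file names are hypothetical, as is the assumption that the basic NS4 style sheet is linked rather than embedded.

```html
<head>
  <!-- Basic style sheet that NS4 can read (file name is hypothetical) -->
  <link rel="stylesheet" type="text/css" href="xxx/basic.css" />

  <!-- Full screen layout, including .ahem {display: none;}.
       NS4 does not support @import, so it never loads this file. -->
  <style type="text/css" media="screen">@import "xxx/text.css";</style>

  <!-- Print version with background and navigation stripped out
       (file name is hypothetical) -->
  <style type="text/css" media="print">@import "xxx/print.css";</style>
</head>
```

Anything that never loads text.css, whether NS4 or a spider that ignores style sheets, gets the page with the "ahem" paragraph showing.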
In fact, couldn't have done any of this without CSS.
g1smd,
I will sort out the case aspect presently.
You could still hide it.. and, since your text message would be presented as an image, you would solve the meta description issue while still providing the intended message for "bad" browsers.
In fact, you might add an additional 'ahem' text block that provides an accurate page description.
- papabaer
In fact, you might add an additional 'ahem' text block that provides an accurate page description.
That's what we're figuring on. That might be a Plan 'B'. I've just "re-jigged" the meta tags. I'll give those a few weeks to see what happens.
If there's no improvement, I'll put in that additional text block.
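A minimal sketch of what that additional text block could look like, assuming it reuses the existing "ahem" class and goes ahead of the notice in source order. The description text is a placeholder.

```html
<body>
  <!-- Hypothetical extra block: an accurate page description placed first,
       so it is the first readable copy a spider meets -->
  <p class="ahem">Placeholder: one or two sentences describing this page.</p>

  <!-- Existing notice for older browsers, now second in source order -->
  <p class="ahem">This message only appears if style sheets...snipped.</p>

  <!-- rest of the page -->
</body>
```

Because both paragraphs share the same class, browsers that load the imported style sheet hide both, while older browsers show the description followed by the notice.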
As for the idea of putting in an imaged notice: not a good one. At the moment the vanilla CSS pages vary between 3,300 and 11,000 bytes. Add in the images and they go up to 60-70 KB apiece. We're trying to cut back on the weight/wait, hence the reason for taking the CSS route in the first place.
BTW, thanks for the info on XHTML 2.0. Compliance by another name ...