Not sure what google's story is in this regard, but here we go...
In February 2003 completely refurbished my site using CSS and making it XHTML 1.0 compliant. The pages validate against W3 and other tools. Pages look good using most current browsers. So far so good.
Did a search on google. Site is listed but full of stuff that bears no relation to what my site is about.
Did a check using Sim Spider. Guess what? No meta description and no meta keywords. Fiddled around and removed the closing forward slash. Retested and hey, there's the description and all the keywords.
So, we seem to be drawing the same conclusion as many others here have - google cannot read XHTML compliant web pages.
There's more. As mentioned, completely refurbished the site, moved stuff around, added more pages, etc. 4 months down the road, the previous site's description is still being used - and yes, we have the June 15 date stamp. No big deal there.
However, what's worse is that all the search results' content now relates to items in various CSS tags - section titles, navigation markers and copyright notices. Click on "similar pages" and what do I get there? My notice to users using non-compliant and early generation browsers.
In fact this notice appears in the other search engine results as well - being the first bit of readable copy on each web page. Not visible in compliant browsers but is there for others.
To cap this all, ALLTHEWEB has something similar but at least they have the CORRECT DESCRIPTION of each of the pages listed - this obtained from the meta tag description. Go figure.
All this may sound like a whinge. However, I now seem to be stuck between backpedalling (as in going back to HTML 4.01), corrupting my XHTML compliance, or coming up with some sort of "exotic" fiddle to try and fix up this mess. The issue here isn't PR as such but rather an accurate representation of what I have out there on the Internet.
Besides the usual stuff, this is what pops up:
... this week's picture 'heading' commissioned >> personal >> stories >> all content - ©2003 xxx xxx - all rights reserved.
commissioned >> personal >> stories >> are the navigational links to other parts of the site.
Perhaps the point I'm trying to make here is that the content listed above is inelegant to say the least.
I may need to perpetrate a fix of some sort. Something akin to adding in, as someone here has already suggested, a p.title tag with the appropriate and descriptive content tacked in there.
I am sure that this is not what was intended when those guys at W3 put together the XHTML spec ...
what's worse is that all the search results' content now relates to items in various CSS tags
commissioned >> personal >> stories >> are the navigational links to other parts of the site.
Do they appear in the first sections of your html document? Then try to reorganize your divisions. Some people put navigation and other second-level-content to the very end of the html page leaving the *real* content right after <body>...
I don't think this is XHTML-related, though.
Look here:
[webmasterworld.com...]
I think that will help you.
Thanks for your response. Put it this way, my site is image intensive. On the main page (index.html) there is nothing there but an image and then a header and by-line of sorts, the navigation links and copyright notice. With this layout, it would thus be difficult to move any of this stuff around short of adding in some "invisible" content - maybe the stuff in the description meta tag.
What blew my cool on this whole thing was using Sim Spider. Using the XHTML recommended meta closing tag in the style of "xxx. />", no description or keywords showed up in Sim Spider. Take out the forward slash and everything pops up.
Now, looking at that result and seeing all the other garbage that is now appearing in the search engine results - google as elsewhere - has brought forth the conclusion that is now the subject of this thread.
Lots of graphic stuff too, all valid XHTML---but! Always in conjunction with accessibility concerns.
There is where you need to focus.
Google.. and your GOOGLE description, will love you for it.
Great rankings, very 'user-friendly' SERP/Descriptions.
Thanks for the follow up. Took a look at that thread. Made the changes and ... back to square one.
Put in the </meta> tag and this to no avail. Did a Sim Spider validation and bye-bye meta tag description and keywords. Nothing. All I get are the words making up the page. As mentioned in response to waldemar above, my site is image intensive and other than navigational links and stuff like copyright notices, there is very little content on these pages. So, the meta tag page description is vital.
Other than the above observations I am at a loss in figuring out why ALLTHEWEB has got the page description right and all the SE's - like Sim Spider with the correct XHTML syntax - seem to give it a miss?
Hence the above conclusion, most search engines do not work effectively with XHTML compliant pages.
BTW - forgot to mention that the previous version of my web site which was HTML 4.01 compliant - none of these issues were apparent thus reinforcing my opinion. And again, page ranking is not an issue here, the technical details are.
Yes, I've tried to be a good boy here as well. Besides ALT's to all my images and the usual "accessibility" features I also have put in an NS 4 type style sheet as well as a notice for those browsers that cannot handle CSS 1.0 and XHTML compliant pages. Let's say this, these pages have been designed to "degrade gracefully", which is what they do as far as I have ascertained, and therein lies the rub.
Not funny when something like this pops up on the SE's results page:
"This message only appears if stylesheets have been switched off ...." etc.
Not what was intended. I'd prefer it that the SE's pick up the meta tag page description. This is not happening.
Looks like plan B is to add in p.description CSS tag. This would be invisible in current browsers but would appear in early generation browsers. Wouldn't go amiss amongst all the other stuff lying around there I suppose.
One of the first things I did was check out one of papabaer's sites to see what he was doing. He's using the preferred method /> of course.
So, it looks like the only way to appease SIM Spider is to close the </meta>. There are more than a couple right now who are chalking up their problems to this issue. When I see multiple complaints I usually raise an eyebrow and check things out.
[edited by: pageoneresults at 1:07 am (utc) on June 17, 2003]
I've never used ></meta>; I use /> for all my meta descriptions, and GOOGLE (as well as ATW, et al) picks up the descriptions just fine.
I do keep meta descriptions short... many times choice (well-positioned) page text gets appended to the meta description.
I don't worry about 'sim spider' -- gut instinct and experience are good guides for positioning text that you would like to see added to your serp/page description.
[edited by: papabaer at 12:56 am (utc) on June 17, 2003]
Er - as we say - made something of an oversight here. Didn't see the '>' in the ></meta>. Slapped the </meta> at the end of the line in question without the closing bracket. No wonder.
Fixed that up, tested and it now works in Sim Spider - (big red face).
Guess we'll need to wait a few weeks now to see what effect this has.
Still, I guess if it wasn't for this forum none of this would have ever been made apparent. Apologies and thanks guys.
XHTML and Spidering [webmasterworld.com]
Apparently ATW either has a DTD knowledgeable spider or has programmed it to accept meta tags ending with />.
That I have my "This message only appears if stylesheets ..." as "hidden text" and google and the other SE's are picking up on this seems to be neither here nor there.
As "hidden text" this may be a little contentious. That message doesn't appear in current browsers - IE 5 and 6, Opera 6 and 7 and hopefully NS 7. However it is set to appear in earlier browsers. So whether it's "hidden" or not is another matter.
pageoneresults and papabaer
Yes, I have just now discovered the error of my ways thanks to all the guys here.
It appears that the correct syntax to use is thus:
<META NAME="description" Content="some description"></meta>
Before I was using:
<META NAME="description" Content="some description" />
This as per the XHTML recommendations. From the foregoing it seems quite clear that using this particular method "impairs" SE functionality. This with regard to picking up meta tag descriptions, keywords and the rest of the meta tag items for that matter.
Hope this helps.
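For anyone wanting to compare the two meta forms locally, here is a minimal sketch. It uses Python's standard html.parser (my choice, not anything Sim Spider or Google is known to use), and the names MetaCollector and extract_meta are made up for illustration. It only shows that a tolerant parser reads both forms the same way; it says nothing about how any particular spider behaves.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, and a
        # self-closing <meta ... /> also ends up in this handler.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name:
            self.meta[name.lower()] = attr.get("content") or ""


def extract_meta(markup):
    """Return a dict of meta name -> content found in the markup."""
    parser = MetaCollector()
    parser.feed(markup)
    parser.close()
    return parser.meta


if __name__ == "__main__":
    xhtml_form = '<meta name="description" content="some description" />'
    paired_form = '<meta name="description" content="some description"></meta>'
    print(extract_meta(xhtml_form))
    print(extract_meta(paired_form))
```

If a tool reports nothing for the /> form while a parser like this one does, the difference is in that tool's parsing rather than in the markup.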
In fact, I am convinced, it is the clarity of many of my meta descriptions that helps garner traffic. Whether first, fifth or tenth, a good description on the SERPS can be the deciding factor.
<added>I just checked a number of recently added pages (last several days), and the descriptions show fine---using the preferred /> meta/closing XHTML format.</added>
<added title="second addendum">I'm looking at one page now.. Google indexed it yesterday. The entire meta description is listed on the results page.</added>
As an example, the text in one of my page descriptions looks like this:
"xxx - smithfields meat market, the story"
The "xxx" part is my name. I don't think that could be any more simple. As mentioned before, ATW has picked up this description. As for the other SE's - they have that NS4 degrade message.
Oaf357 and papabaer,
As mentioned before, the item that blew my cool was using Sim Spider. Nothing comes up when using ..." />. It works when using ..."></meta>. To me, that's been the deciding factor.
The jury is out. Have made the modifications. Let's give it a week or so and let's see what comes back. Either way, I'll report back - be it to this thread or in a new one.
It appears that the correct syntax to use is thus:
<META NAME="description" Content="some description"></meta>
Before I was using:
<META NAME="description" Content="some description" />
Maybe there's a problem... tags must be lowercase in xhtml.
(Since when are meta tags so much back in fashion again? I guess, putting that description in your regular content body would improve your search engine results...)
When I first put my site up I wasn't using them (description and keyword). Then I added description and got a nice improvement in rankings. Then I added keywords and not only the nice improvement in rankings but a nice jump in traffic too.
I'd say that meta tags are still in fashion when properly used. In my case, no more than 150 characters (including spaces and punctuation) in descriptions and no more than twenty words in keywords tag. That seems to be the most effective method I've come up with after asking around, researching, and testing.
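Those limits are easy to check before publishing a page. A small sketch, assuming the 150-character and twenty-word figures above as defaults; the function name check_meta_lengths and the sample keywords are hypothetical, and the limits are this poster's rule of thumb, not anything a search engine has published.

```python
def check_meta_lengths(description, keywords, max_chars=150, max_words=20):
    """Report whether a description and keywords string fit the given limits.

    Description length counts every character, spaces and punctuation
    included. Keywords are counted as words, with commas treated as spaces.
    """
    word_count = len(keywords.replace(",", " ").split())
    return {
        "description_chars": len(description),
        "description_ok": len(description) <= max_chars,
        "keyword_words": word_count,
        "keywords_ok": word_count <= max_words,
    }


if __name__ == "__main__":
    report = check_meta_lengths(
        "xxx - smithfields meat market, the story",
        "smithfields, meat market, london",  # sample keywords, made up
    )
    print(report)
```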
I'm also of the opinion that meta tags matter. ATW is currently using the meta description off each page indexed whereas most of the other SE's are not.
What started off this thread is that google, lycos and the rest all seemed to have bypassed the meta tags and, instead, are using text items that have little if anything to do with what each of those listed pages is about.
What I have omitted to mention thus far is that all the pages listed by the other SE's show up with this "extraneous text". The worst case is Lycos, which shows a sequence of at least a half dozen pages all showing "This message only appears if style sheets ..." or part thereof. This is my NS4 degrade message. Clearly, this is not what was intended. This message has nothing to do with what each of those pages is about.
Hence the suggestion that, if this is the first line of text these SE's are picking up off each page, then best add in the meta tag description as that first line instead and in the same manner as the NS4 degrade message. This text would be "invisible" in current browsers and would only be visible in 3rd and 4th generation browsers - which is no big deal. In fact, this may enhance each page's usability factor. Just a back handed idea.
"This message only appears if style sheets ..." are part thereof. This is my NS4 degrade message. Clearly, this is not what was intended. This message has nothing to do with what each of those pages are about.
Care to show us a snippet of that particular code? They should not be indexing that content unless something is wrong somewhere. I'm going to guess if that is showing as your description, that the rest of the page is not getting indexed.
<added>Hey there grahamstewart, we must be on the same time schedules. ;)</added>
NS4 and various other dodgy browsers can't handle @import so they get to see the message. Of course, spiders don't use CSS either - so they also see the message and index it as the first (and therefore most important) thing on his page.
(pageone: it's 9:45am here in Oz, must be later than that in California I guess).
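To see the point about spiders not applying CSS, here is a rough sketch of what a client that ignores stylesheets reads first in the body. It uses Python's standard html.parser. The names TextCollector, first_text and SAMPLE_PAGE, the "notice" class and the sample markup are all made up for illustration, and real spiders will certainly differ in detail.

```python
from html.parser import HTMLParser

# Hypothetical page: a degrade notice hidden by CSS sits ahead of the heading.
SAMPLE_PAGE = """<html><head><title>t</title>
<style type="text/css">@import "main.css";</style></head>
<body>
<p class="notice">This message only appears if stylesheets have been switched off</p>
<h1>smithfields meat market, the story</h1>
</body></html>"""


class TextCollector(HTMLParser):
    """Collect body text in source order, ignoring all styling."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0  # greater than zero while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth:
            text = " ".join(data.split())
            if text:
                self.chunks.append(text)


def first_text(markup, limit=150):
    """Return the first `limit` characters of body text in source order."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)[:limit]


if __name__ == "__main__":
    print(first_text(SAMPLE_PAGE))
```

Moving the notice below the real content in the source, as suggested earlier in the thread, changes what this kind of reader picks up first.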
You need lower case on all tags and attributes to have valid XHTML code:
<meta name="description" content="some description" />
Check using [validator.w3.org...] that there isn't some other reason for the observed phenomenon.
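Alongside the validator, a quick local check for uppercase element and attribute names can catch this particular slip. This is only a rough regex-based sketch, not a validator: the name find_uppercase_names is made up, and it assumes attribute values are quoted and contain no angle brackets.

```python
import re

# Matches an opening or closing tag; skips <!DOCTYPE ...> and comments.
TAG_RE = re.compile(r"<\s*/?\s*([A-Za-z][A-Za-z0-9]*)([^<>]*)>")
# Matches an attribute name followed by an equals sign.
ATTR_RE = re.compile(r"([A-Za-z_:][-A-Za-z0-9_:.]*)\s*=")
# Matches a quoted attribute value, so its contents can be blanked out.
QUOTED_RE = re.compile(r'"[^"]*"|\'[^\']*\'')


def find_uppercase_names(markup):
    """Return element and attribute names that are not all lowercase."""
    problems = []
    for match in TAG_RE.finditer(markup):
        tag, rest = match.group(1), match.group(2)
        if tag != tag.lower():
            problems.append(tag)
        # Blank out quoted values so text inside them is not mistaken
        # for an attribute name.
        rest = QUOTED_RE.sub('""', rest)
        for attr in ATTR_RE.findall(rest):
            if attr != attr.lower():
                problems.append(attr)
    return problems


if __name__ == "__main__":
    print(find_uppercase_names(
        '<META NAME="description" Content="some description" />'
    ))
```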