Forum Moderators: Robert Charlton & goodroi
What will happen to your site's standing in the SERPs the day Googlebot learns CSS?
The day Google implements CSS crawling and understanding, I expect massive changes in page rankings. Could that day be coming soon? Could Google be looking at embedded style information now and experimenting? Webmasters are trying to move to new technologies, and certainly some have had a bad experience.
Google could start to simply query a website just like a browser does and then parse the object model. How would your pages rank when Google actually "looks" at them?
Google's Web Accelerator gives Google this perspective today. The accelerator pre-fetches an entire web page's content (and more) through Google's servers! Using this technology Google could be learning and analyzing CSS file content today.
Fortunately, the last time I looked, the accelerator was actually slower, causing two fetches (GETs) of all content: the original by your browser and a duplicate request by the accelerator (slooowww). [webaccelerator.google.com...]
I'm just interested in other webmasters' perspectives.
Using this technology Google could be learning and analyzing CSS file content today.
I know this has been raised several times before particularly in relation to using display:none in a CSS file to hide text and whether or not search engines can detect this.
What happens though if you include the name of your CSS files in robots.txt to specifically prevent SEs crawling them?
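For reference, blocking stylesheets that way would look something like this in robots.txt (the file paths here are hypothetical, just to show the idea):

```
User-agent: *
Disallow: /stylesheet.css
Disallow: /css/
```

Whether a well-behaved bot that respects this would also give up on rendering-based analysis of the page is exactly the open question.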
Google could start to simply query a website just like a browser does and then parse the object model. How would your pages rank when Google actually "looks" at them?
Hopefully the same way they do now as nothing is different.
Google has a hard enough time now dealing with html/xhtml. Parsing CSS files and determining whether something is hidden or not is not a solution: the bot would then need to determine why that CSS exists. There are many valid uses of display:none or visibility:hidden.
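To illustrate with a hypothetical sketch (selector names made up), display:none is the standard way to build script- or hover-driven menus, and to keep navigation out of the printed page:

```css
/* Hide a dropdown submenu until the visitor hovers over its parent */
#nav li ul { display: none; }
#nav li:hover ul { display: block; }

/* Keep the navigation out of the printed version of the page */
@media print {
  #nav { display: none; }
}
```

A bot that treated every display:none as cloaking would flag both of these perfectly legitimate patterns.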
The bots are definitely smarter than they were a few years ago. But, I don't think they are ready for CSS.
For those who may be hiding things through CSS, or negatively positioning content off screen to manipulate page content, I surely wouldn't do that in any long-term project. ;)
The penalty for getting busted using this technique, I would imagine, is a permanent ban. No ifs, ands, or buts: you're history. You'll need a pardon from the Governor to be reconsidered for inclusion. ;)
What happens though if you include the name of your CSS files in robots.txt to specifically prevent SEs crawling them?
The bots are definitely smarter than they were a few years ago. But, I don't think they are ready for CSS.
I agree completely, but thinking about it, CSS has got to be easier to understand than English! Googlebot sure hasn't got English down yet. (Some bad English in these sentences.)
One reason I'm asking is the release of Expression Web. It's clear quite a few webmasters have been astonished by this release. Useful features have been gutted only because they were poorly implemented, or implemented in a non-standard way (because there was no standard). The features were useful and could have been re-engineered with new technologies while adhering to standards, perhaps even transparently to the website user. (But probably not to Google, thus this thread.)
One possibility in the near future is a massive wave of site conversions, involving things as simple as font tags versus styles. I'm not sure I've seen much discussion of the impact of this type of change on page rankings in Google SERPs.
Does Google even understand a simple style applying a font? If not, converting to style sheets in essence removes all the formatting information Googlebot used to see.
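For example, a conversion as simple as this changes what the bot sees in the markup itself (a hypothetical snippet; the class name is made up):

```html
<!-- Before: formatting sits inline where any bot can read it -->
<font size="5" color="#333333"><b>Widget Reviews</b></font>

<!-- After: the same formatting moved out to the stylesheet -->
<span class="heading">Widget Reviews</span>
```

with something like `.heading { font-size: 1.5em; color: #333; font-weight: bold; }` in the CSS file. A bot that never fetches the CSS has no idea the second version is emphasized at all.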
I have seen postings by quite a few webmasters who have "re-engineered" their sites and have found their rankings go down the tubes. I have also seen some who have redesigned and apparently had few or no problems.
In the past I've had ranking problems (and AdSense problems) because Google was parsing footer and navigation information. Putting this extraneous, non-page-specific information into IFrames improved rankings and ad targeting. Expression Web will basically drive many webmasters to reformat their entire site's navigation mechanism, and perhaps header and footer design. (I know some/many will say don't use EW!)
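The iframe approach I mean is along these lines (file name and dimensions are just for illustration):

```html
<!-- Shared footer loaded via iframe, so its text is a separate
     document rather than part of this page's own content -->
<iframe src="/footer.html" width="100%" height="80"
        frameborder="0" scrolling="no"></iframe>
```

The boilerplate text then lives in footer.html and stops diluting the keyword profile of every page that includes it.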
I've just fiddled a bit, a couple of times, with one or two sites' navigation structures, saw indications the changes were doing grave harm to page rankings, and pulled the changes, fortunately making for a fairly quick recovery.
Anyway thanks again for your thoughts.
I did this by using { display: none } in the print stylesheet for the navigation classes, and in the regular stylesheet for the class of the page's own URL. Every page has two <link rel> lines in the header of the form:
<link rel="stylesheet" href="http://www.nnn.com/stylesheet.css" type="text/css">
<link rel="stylesheet" href="http://www.nnn.com/printstylesheet.css" media="print" type="text/css">
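The corresponding rules look roughly like this (the class names are made up for illustration, since I haven't posted my real ones):

```css
/* printstylesheet.css - keep the navigation off the printed page */
.navigation { display: none; }

/* stylesheet.css - hide the menu link pointing at the page you're already on */
.current-page { display: none; }
```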
I'm pleased with the way it works but concerned that it might confuse Google. I had assumed that there would be no problem with Google or other search engines, but in light of the messages above, I wonder if Google might assume that the site is hiding text because the regular stylesheet hides the page's own URL.
At this point I'd say all the commentary is, at most, conjecture. I don't think Google is parsing styles yet, but I am concerned about the long term, and want some strategy for implementing more and more new technology when clearly Google(bot) is not yet keeping up. For example, even Google's home page is not fully standards compliant.
Many sites are fully standards compliant, using styles etc., and I'm sure they rank well; I'm more concerned about the transition. I do think there are other threads discussing this topic on a piecemeal basis.
And the payoff is just not that high. As the subject of the thread suggests, the relevance of search results could even get worse if Google starts scoring text based on how close it appears to the top left corner of the page, to use one example.
What happens though if you include the name of your CSS files in robots.txt to specifically prevent SEs crawling them?
This would be interesting - I expect it would be a signal that a site needs a manual review. Set off enough flags and someone will check you manually. If you are going to fly up in the sky like that (as opposed to staying down in the grass ;)) then your site had better be squeaky clean!
Vanessa Fox in her recent interview with Rand Fishkin was quite clear that the bots are only interested in the content.
So, all tags are stripped away - they don't care whether you wrap content in font tags or in spans.
I think there are some exceptions for a specific subset, such as h1 and b (bold) but not much else.
Vanessa Fox in her recent interview with Rand Fishkin was quite clear that the bots are only interested in the content. So, all tags are stripped away - they don't care if you wrap content in font tags or in spans.
I would take Vanessa's comment in context - she was responding to Rand's question about whether the content-to-code ratio is a ranking factor. I think I see Google doing a lot with many of the html elements, even if currently the script elements and style attributes don't play into very much except manual inspections.