Forum Moderators: Robert Charlton & goodroi
What will happen to your site's standing in the SERPs the day Googlebot learns CSS?
The day Google implements CSS crawling and understanding, I expect massive changes in page rankings. Could that day be coming soon? Could Google be looking at embedded style information now and experimenting? Webmasters are trying to move to new technologies, and certainly some have had a bad experience.
Google could start to simply query a website just like a browser does and then parse the object model. How would your pages rank when Google actually "looks" at them?
Google's Web Accelerator gives Google this perspective today. The accelerator pre-fetches an entire web page's content (and more) through Google's servers! Using this technology Google could be learning and analyzing CSS file content today.
Fortunately, the last time I looked, the accelerator was actually slower, causing two fetches (GETs) of all content: the original by your browser and a duplicate request by the accelerator (slooowww). [webaccelerator.google.com...]
I'm just interested in other webmasters' perspectives.
Using this technology Google could be learning and analyzing CSS file content today.
I know this has been raised several times before particularly in relation to using display:none in a CSS file to hide text and whether or not search engines can detect this.
What happens though if you include the name of your CSS files in robots.txt to specifically prevent SEs crawling them?
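For reference, blocking stylesheets that way would look something like this in robots.txt (the file paths here are hypothetical, just to show the idea):

```
User-agent: *
Disallow: /stylesheet.css
Disallow: /css/
```

Whether a well-behaved bot that respects this would also give up on rendering-based analysis of the page is exactly the open question.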
Google could start to simply query a website just like a browser does and then parse the object model. How would your pages rank when Google actually "looks" at them?
Hopefully the same way they do now as nothing is different.
Google has a hard enough time now dealing with html/xhtml. Parsing CSS files and determining whether something is hidden or not is not a solution: the bot would then need to determine why that CSS exists. There are many valid uses of display:none or visibility:hidden.
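To illustrate with a hypothetical sketch (selector names made up), display:none is the standard way to build script- or hover-driven menus, and to keep navigation out of the printed page:

```css
/* Hide a dropdown submenu until the visitor hovers over its parent */
#nav li ul { display: none; }
#nav li:hover ul { display: block; }

/* Keep the navigation out of the printed version of the page */
@media print {
  #nav { display: none; }
}
```

A bot that treated every display:none as cloaking would flag both of these perfectly legitimate patterns.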
The bots are definitely smarter than they were a few years ago. But, I don't think they are ready for CSS.
For those who may be hiding things through CSS, or negatively positioning content off screen to manipulate page content, I surely wouldn't do that in any long-term project. ;)
The penalty for getting busted using this technique, I would imagine, is a permanent ban. No ifs, ands, or buts: you're history. You'll need a pardon from the Governor to be reconsidered for inclusion. ;)
What happens though if you include the name of your CSS files in robots.txt to specifically prevent SEs crawling them?
The bots are definitely smarter than they were a few years ago. But, I don't think they are ready for CSS.
I agree completely, but thinking about it, CSS has got to be easier to understand than English! Googlebot sure hasn't got English down yet. (Some bad English in these sentences.)
One reason I'm asking is the release of Expression Web. It's clear quite a few webmasters have been astonished by this release. Useful features have been gutted only because they were poorly implemented, or implemented in a non-standard way (because there was no standard). The features were useful and could have been re-engineered with new technologies while adhering to standards, perhaps even transparently to the website user. (But probably not to Google, thus this thread.)
One possibility in the near future is a massive wave of site conversions, involving things as simple as font tags versus styles. I'm not sure I've seen much discussion of the impact of this type of change on page rankings in Google SERPs.
Does Google even understand a simple style applying a font? If not, converting to style sheets in essence removes all the formatting information Googlebot used to see.
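For example, a conversion as simple as this changes what the bot sees in the markup itself (a hypothetical snippet; the class name is made up):

```html
<!-- Before: formatting sits inline where any bot can read it -->
<font size="5" color="#333333"><b>Widget Reviews</b></font>

<!-- After: the same formatting moved out to the stylesheet -->
<span class="heading">Widget Reviews</span>
```

with something like `.heading { font-size: 1.5em; color: #333; font-weight: bold; }` in the CSS file. A bot that never fetches the CSS has no idea the second version is emphasized at all.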
I have seen postings by quite a few webmasters who have "re-engineered" their sites and have found their rankings go down the tubes. I have also seen some who have redesigned and apparently had few or no problems.
In the past I've had ranking problems (and AdSense problems) because Google was parsing footer and navigation information. Putting this extraneous, non-page-specific information into IFrames improved rankings and ad targeting. Expression Web will basically drive many webmasters to reformat their entire site's navigation mechanism, and perhaps header and footer design. (I know some/many will say don't use EW!)
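The iframe approach I mean is along these lines (file name and dimensions are just for illustration):

```html
<!-- Shared footer loaded via iframe, so its text is a separate
     document rather than part of this page's own content -->
<iframe src="/footer.html" width="100%" height="80"
        frameborder="0" scrolling="no"></iframe>
```

The boilerplate text then lives in footer.html and stops diluting the keyword profile of every page that includes it.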
I've just fiddled a bit, a couple of times, with one or two sites' navigation structures, saw indications the changes were doing grave harm to page rankings, and pulled the changes, fortunately making for a fairly quick recovery.
Anyway thanks again for your thoughts.
I did this by using { display: none } in the print stylesheet for the navigation classes, and in the regular stylesheet for the class of the page's own URL. Every page has two <link rel> lines in the header of the form:
<link rel="stylesheet" href="http://www.nnn.com/stylesheet.css" type="text/css">
<link rel="stylesheet" href="http://www.nnn.com/printstylesheet.css" media="print" type="text/css">
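The corresponding rules look roughly like this (the class names are made up for illustration, since I haven't posted my real ones):

```css
/* printstylesheet.css - keep the navigation off the printed page */
.navigation { display: none; }

/* stylesheet.css - hide the menu link pointing at the page you're already on */
.current-page { display: none; }
```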
I'm pleased with the way it works but concerned that it might confuse Google. I had assumed that there would be no problem with Google or other search engines, but in light of the messages above, I wonder if Google might assume that the site is hiding text because the regular stylesheet hides the page's own URL.
At this point I'd say all the commentary is, at most, conjecture. I don't think Google is parsing styles yet, but I am concerned about the long term, and want some strategy for implementing more and more new technology when clearly Google(bot) is not yet keeping up. For example, even Google's home page is not fully standards compliant.
Many sites are fully standards compliant, using styles etc., and I'm sure they rank well; I'm more concerned about the transition. I do think there are other threads discussing this topic on a piecemeal basis.
And the payoff is just not that high. As the subject of the thread suggests, the relevance of search results could even get worse if Google starts scoring text based on how close it appears to the top left corner of the page, to use one example.
What happens though if you include the name of your CSS files in robots.txt to specifically prevent SEs crawling them?
This would be interesting - I expect it would be a signal that a site needs a manual review. Set off enough flags and someone will check you manually. If you are going to fly up in the sky like that (as opposed to staying down in the grass ;)) then your site had better be squeaky clean!
Vanessa Fox in her recent interview with Rand Fishkin was quite clear that the bots are only interested in the content.
So, all tags are stripped away - they don't care whether you wrap content in font tags or in spans.
I think there are some exceptions for a specific subset, such as h1 and b (bold) but not much else.
Vanessa Fox in her recent interview with Rand Fishkin was quite clear that the bots are only interested in the content. So, all tags are stripped away - they don't care if you wrap content in font tags or in spans.
I would take Vanessa's comment in context - she was responding to Rand's question about whether the content-to-code ratio is a ranking factor. I think I see Google doing a lot with many of the html elements, even if currently the script elements and style attributes don't play into very much except manual inspections.