Forum Moderators: open


Does Google crawl CSS files?

It seems like they do...

edit_g

6:00 pm on Sep 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Host: 64.68.86.9
Url: /example.css
Http Code : 200
Date: Sep 12 23:13:23
Http Version: HTTP/1.0
Size in Bytes: 3323
Referer: - Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)

Is this normal?

iThink

3:07 am on Sep 14, 2003 (gmt 0)

10+ Year Member



Never seen this before in my logs. Interesting development.

edavid

10:44 pm on Sep 14, 2003 (gmt 0)

10+ Year Member



Oh, no! Better go validate all my style sheets.

Wonder if they're looking for H1 resizing....

DaveN

10:48 pm on Sep 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Saw this a few weeks back in the logs of a site which had been dropped from the SERPs; the CSS was using a dodgy method at the time.

DaveN

ronin

11:32 pm on Sep 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well... whether they're looking for <hx> resizing or not, it's very unlikely that Google would consider that a 'dodgy' method.

It's a little senseless to argue that IE, Mozilla, Opera, Safari etc. know better than the author of a page how a heading should look on that page and I'm sure that Google wouldn't defend this idea...

abbeyvet

11:43 pm on Sep 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Unless it is resized to 1px

iThink

11:45 pm on Sep 14, 2003 (gmt 0)

10+ Year Member



DaveN,

Were you using CSS for adding hidden text on the pages that were dropped from the SERPs or was it visible text with very small font size?

metagod

12:10 am on Sep 15, 2003 (gmt 0)

10+ Year Member



Wouldn't they be more interested to see whether they are using display: none; in their stylesheets?

I would say that would be a bigger issue than who is resizing text...

requiem

1:26 am on Sep 15, 2003 (gmt 0)

10+ Year Member



Yes, I have seen that too.
It makes perfect sense, since using CSS to trick the SEs is so common. I guess it is not a good idea to use fixed font sizes.

For some reason it seems like Google doesn't follow the @import directive.

E.g.

<link rel="stylesheet" href="style1.css" />
<style type="text/css">
@import url("style2.css");
</style>

Here Google will only crawl style1.css and not style2.css.

I hope they don't start to drop sites using hidden text; there are several accessibility issues that would be affected.

TheWhippinpost

2:04 am on Sep 15, 2003 (gmt 0)

10+ Year Member



>> there are several accessibility issues that would be affected. <<

Precisely!

It's difficult to argue any case of penalisation when there's an equal, if not greater, justification for using anything that CSS allows... Even negatively-positioned divs have a "legal" and innocent use to people.

Interesting, if there's any mileage in this anyway!

DaveN

7:16 am on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



iThink, it was a layer pushed to a negative position.

DaveN

percentages

7:19 am on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think it is like JS: they grab it but can't truly crawl it or understand what it means. That raises the question of why they grab it... I assume they are looking for links in any include ;)

kaled

9:23 am on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If Google is serious about penalizing hidden text then scanning CSS files is inevitable. The fact that it has taken so long to implement seems more interesting to me.

Kaled.

keeper

11:50 am on Sep 15, 2003 (gmt 0)

10+ Year Member



I agree. They will need to if they are going to find the two biggest spam techniques:
css: display:none
js: location.href (and derivatives)

I think the JS will be the hardest, as you can effect redirects in countless constructs.

Looking for "display:none" or negative positioning would be trivial for Google I would think.

The reason they haven't done it till now is probably the bandwidth involved or server resources it would drain from normal crawling.

Imagine what it would take to download every css or js file...and not index any of it. Just for the sake of battling spam....
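To put keeper's point in concrete terms: the string-matching side of spotting "display:none" or negative positioning really would be trivial. The sketch below is purely hypothetical, assuming nothing about Google's actual heuristics; the pattern list and function name are made up for illustration.

```python
import re

# Hypothetical patterns a crawler might flag in a fetched stylesheet --
# not Google's actual checks, just an illustration of how simple the
# pattern-matching part of the problem is.
SUSPECT_PATTERNS = [
    re.compile(r"display\s*:\s*none", re.I),                  # hidden blocks
    re.compile(r"visibility\s*:\s*hidden", re.I),             # hidden blocks
    re.compile(r"(?:top|left|text-indent)\s*:\s*-\d", re.I),  # off-screen offsets
]

def flag_suspect_rules(css_text):
    """Return every suspicious-looking fragment found in a CSS source string."""
    hits = []
    for pattern in SUSPECT_PATTERNS:
        hits.extend(pattern.findall(css_text))
    return hits

css = ".spam { display: none; } .menu li { left: -9999px; }"
print(flag_suspect_rules(css))  # flags both the hidden block and the offset
```

Of course, as later posts point out, deciding whether a flagged rule is spam or a legitimate menu is the genuinely hard part.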

TheWhippinpost

12:28 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



I agree. They will need to if they are going to find the two biggest spam techniques:
css: display:none
js: location.href (and derivatives)

If they do, there'll be an uproar for a multitude of entirely legitimate reasons... IMHO

rrdega

12:28 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



Imagine what it would take to download every css or js file...and not index any of it. Just for the sake of battling spam....

With the spamming that's taking place via CSS, and the negative effect it's having on SERPs, wouldn't it be worth it? Wouldn't it make Google's SERPs stand out as superior compared to its peers? I think so, and believe it to be a logical step, regardless of the costs...

Purpose and reasoning aside, couldn't you simply block GoogleBot from your CSS with robots.txt?

[edited by: rrdega at 12:36 pm (utc) on Sep. 15, 2003]
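For what it's worth, the robots.txt block being asked about would look something like this. The /example.css path is taken from the log excerpt at the top of the thread; the directory rule is a made-up illustration, and whether Googlebot would actually honour a Disallow on stylesheets is exactly what's in question.

```
User-agent: Googlebot
Disallow: /example.css
Disallow: /css/
```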

driesie

12:35 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



Google could only penalise sites using display:none etc. if they checked them manually.
You are talking about techniques that were invented for reasons other than spamming search engines. There are plenty of reasons why you would use them (navigation systems, for example).

The problem is not as easy as it seems, in my opinion.

keeper

12:47 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



Indeed, not an easy problem, though I doubt they would penalise whole sites or pages for using it. As you say, it has legitimate functions. But so have the <noframes> tag, the alt attribute, and the <noscript> tag, all of which have been wound down by Google (at least in my experience).

I'd imagine they would treat the text in any display:none layer similarly: use it only in "I can't rank this page" situations.

Navigation (unless it's extremely content-rich navigation) rarely makes a difference to on-page relevancy anyhow.

Rosalind

1:50 pm on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've just built a navigation bar using css popup text, as described on this page:

[meyerweb.com...]

I'm not getting rid of it; the people I'm building the site for really like it. But it uses display:none. Is Googlebot smart enough to distinguish text that is visible only when you hover over it from truly hidden text?
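For reference, the Meyer-style popup technique being described boils down to toggling display on hover, along these lines. This is a generic sketch with illustrative selectors, not the markup from the actual site.

```css
/* Popup text is hidden by default... */
a.nav span { display: none; }

/* ...and shown only while the link is hovered. */
a.nav:hover span { display: block; }
```

The ambiguity the question raises is plain from the sketch: the first rule looks identical to hidden-text spam unless the crawler also understands the :hover rule that reveals it.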

rogerd

2:01 pm on Sep 15, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Rosalind, right now Googlebot isn't that smart. Trying to accurately sort out spam "display:none" instances vs. legit ones is too difficult, I'd say.

How could Google use this information?
- a subtle downgrade on pages with significant hidden text, whatever the reason.
- as one indicator of possible dodginess, to be correlated with other indicators or spam reports.
- as something to watch for should a manual review be necessary for some reason.

I really like the idea of decorative image heading swapping via CSS, but the spam implications make me nervous.

rrdega

2:14 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



Since this seems to be a hot topic, and one I am personally interested in, I'd like to raise again the question that I posted as an added thought earlier...

I may be missing something really simple here, as to date I've only been using CSS for basic formatting, but couldn't we simply block GoogleBot from the CSS with robots.txt?

requiem

2:55 pm on Sep 15, 2003 (gmt 0)

10+ Year Member




I may be missing something really simple here, as to date I've only been using CSS for basic formatting, but couldn't we simply block GoogleBot from the CSS with robots.txt?

If Googlebot wants the layout and formatting information, it would be just a little bit fishy to block it. A better solution is to just write super-safe CSS, and to write appropriate alternative CSS to take care of accessibility issues. It is possible to take care of all the hidden navigation information with the :before and :after pseudo-elements.
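The :before and :after suggestion refers to CSS generated content: presentational text can be injected from the stylesheet instead of being hidden in the markup. A minimal sketch, with an illustrative selector (note that browser support for generated content was patchy in 2003, notably in IE):

```css
/* The decorative separator lives in the stylesheet, not the HTML,
   so nothing in the markup needs to be hidden from visual browsers. */
ul.nav li:before {
    content: "» ";
}
```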

edavid

2:57 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



Rosalind, I am also building menus like that. Meyer is such a CSS guru that surely Google can tell brilliant techniques like his from spamming with hidden text, and won't hurt the good to defend us from the bad. Hopefully.

Everyone's looking for things that Google might be penalizing for. I think Google might also be giving sites a slight boost if their css validates. Web standards are becoming increasingly important as the Internet becomes a true mass medium.

davemarks

3:07 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



First off, apologies for kinda hijacking this thread, but it's relevant.

A site I'm in the process of updating uses pics for headings. I hate this but have no choice about removing it. What I wanted to do was put a title in an <h1> tag as well and then hide it using CSS, either off the page or by placing the title image over it with CSS positioning.

I don't believe this to be spamming, as I'm just letting Google read what the viewer can see... although, agreed, it's kinda showing one thing to Google and another to viewers, but only kinda ;)

Do you consider this spamming? Would Google disapprove? Can you suggest a better alternative?

From the sounds of it, putting this into a CSS file should stop Google noticing what I'm doing, but I don't want to do anything wrong as I'm not into cheating!

Your advice would be much appreciated

Thanks
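The overlay approach being described is what's usually called image replacement. One common variant of the era pushes the heading text off-screen while a background image carries the visual title. The class name and image path below are made up for illustration, and, per this thread, it is exactly the kind of rule a CSS-reading crawler might notice:

```css
/* Hypothetical image-replacement rule: the real <h1> text is pushed
   off-screen and the styled title is shown as a background image. */
h1.site-title {
    width: 300px;            /* match the title image's dimensions */
    height: 50px;
    background: url(title-heading.png) no-repeat;
    text-indent: -9999px;    /* moves the actual text out of view */
    overflow: hidden;
}
```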

edit_g

3:13 pm on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think we all need to hold our horses a little bit.

Google does not usually attack problems with a sledgehammer, they're usually quite subtle with things like this.

I don't think they will penalise for using display:none or any other legitimate CSS techniques. Nor do I think they will give sites a boost if the CSS validates. Not that what I think matters, but I think we may need to wait a bit to see how it pans out.

The first thing which occurred to me was that they may be taking the CSS file and trying to use it to clean up the way the cache looks for heavily CSS'd sites (because they can look awful in the cache if they use absolute positioning).

kaled

3:45 pm on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Purpose and reasoning aside, couldn't you simply block GoogleBot from your CSS with robots.txt?

There is nothing to stop you adding instructions to the robots.txt file to block CSS files; however, Google may either simply disregard such instructions or assume (rightly) that the block is there to hide dubious code.

If the decision were mine, I would ignore a block on CSS files taking the view that they are an intrinsic part of the HTML pages being spidered. However, only CSS files called by pages being spidered are likely to be scanned.

Add the issue of the cache and things start to get really interesting.

Kaled.

moltar

4:35 pm on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can always cloak CSS for Google. IMO that is what spammers would do anyway. You can preload it with complex JavaScript. There are tons of other ways to hide CSS from Google.

This would just remove lots of legit sites from Google, but the spammers would stay anyway!

I think it's a very bad idea!

edit_g

5:14 pm on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think it's a very bad idea!

What is a very bad idea? Google is crawling CSS files. This is the only concrete information anyone here has. The rest is just the usual paranoia which gets thrown up whenever Google scratches its backside.

stuntdubl

5:57 pm on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's difficult to argue any case of penalisation when there's an equal, if not greater justification for using anything that CSS allows... Even negative-valued div's have a "legal" and innocent use to people.

There are legitimate uses for many techniques that are used for spam. I use negative divs to centre-align fixed elements. This is the same as having keywords stuffed in alt text or any other technique that has the possibility for SE manipulation. There are viable techniques, and Google will handle the problem while protecting AS MUCH of the innocent as possible.

Google does not usually attack problems with a sledgehammer, they're usually quite subtle with things like this.

Exactly. I don't think Google is going to nail EVERYONE to the wall, but if you REALLY want to be careful, protect yourself from being an innocent casualty and use the techniques that can be used for spamming sparingly.

netnerd

6:10 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



I know of 3 sites that are heavily interlinked using negative layers to make the links non-visible.

They are dominating rankings using this technique.

What's the best way to report them?
