With robots.txt you let Googlebot know that you don't want it to index your pages (it might still do it!).
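For reference, a minimal robots.txt along those lines might look like this (a sketch; the rules are illustrative):

```text
# Ask Googlebot to stay out of the whole site
User-agent: Googlebot
Disallow: /

# All other crawlers may fetch everything
User-agent: *
Disallow:
```

Note this is purely advisory: the crawler has to fetch /robots.txt and choose to obey it.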
With .htaccess you can deny access based on the user-agent (say, anything that has "google" in the UA string) -- Googlebot will not even get to the robots.txt.
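A sketch of that .htaccess approach on Apache, assuming mod_rewrite is enabled (the pattern "google" is just the example from above):

```apache
# Return 403 Forbidden to any client whose User-Agent
# contains "google" (case-insensitive match via [NC])
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} google [NC]
RewriteRule .* - [F,L]
```

Unlike robots.txt, this is enforced server-side, so the bot never sees the page at all.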
How could they possibly stop cloaking?
True, but Bek isn't talking about cloaking.
The objective is simply to exclude Googlebot altogether - something which robots.txt is designed to do and which Googlebot will no doubt honour. There is no intent to deceive.
Apart from mundane infelicities (you sacrifice links to one of these pages), I don't think it will work to keep Google out. I remember reading somewhere that Google doesn't promise not to visit pages it's excluded from; it only promises not to put them in the SERPs. That's how they keep people honest. After all, you could have a single website with three versions of every page at different URLs.
I'm sure this has been tried, and Google's caught it.
I hope you didn't read that here, suidas!
The opposite is the case. If Google sees a link to a /robots.txt excluded URL, then Google will not fetch it. It can still list the URL in the results without having to fetch the page.
If a URL is not /robots.txt excluded, but has a META robots tag with 'noindex', then if Google fetches the URL it will not be listed.
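The META tag in question goes in the page's head; a sketch of a page that Google may fetch but should not list:

```html
<head>
  <!-- Google may crawl this page, but must not list it in results -->
  <meta name="robots" content="noindex">
</head>
```

As noted above, the two mechanisms work against each other here: if the URL is robots.txt-excluded, Google never fetches the page, so it never sees the noindex tag.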
> sure this has been tried, and Google's caught it
That someone has a Web site, and that there's some part of the Web that shouldn't be crawled? I don't see how Google would see a quality issue there, unless they didn't like the site they were crawling.
The 'dual site' approach using the Robots Exclusion Protocol would sacrifice links, though.