
Keeping googlebot out

   
11:40 pm on Feb 6, 2005 (gmt 0)

10+ Year Member



Hi,

I do very well in Google but terribly with MSN Search. How do I keep Googlebot out of the website I am optimising for MSN Search, please?

Bek.

11:59 pm on Feb 6, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Huh? Why would you want to keep Googlebot out if you're doing well with it? You don't want to bite the hand that feeds you.
12:04 am on Feb 7, 2005 (gmt 0)

10+ Year Member



Sorry, Diamondgirl.
I probably didn't explain the question correctly.

I meant to say I am building a whole new page for MSN Search and do not want Google to index that page, as it will be optimised for MSN.
How can I do this?

Bek.

2:26 am on Feb 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can either create a robots.txt or .htaccess based denial.

With robots.txt you rely on Googlebot to figure out that you don't want the page indexed (it might still do it!).
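
For example (assuming the MSN page will live at /msn-page.html -- swap in your real path), a robots.txt in the site root along these lines asks Googlebot to stay away:

# Keep Google's crawler out of the MSN-only page
User-agent: Googlebot
Disallow: /msn-page.html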

With .htaccess you can deny access based on the user agent (say, anything that has "google" in the UA string) -- Googlebot will never even get to the robots.txt.
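
Just a sketch of the .htaccess route (this assumes Apache with mod_rewrite available -- check with your host):

RewriteEngine On
# Block any client whose user agent contains "google" (case-insensitive)
RewriteCond %{HTTP_USER_AGENT} google [NC]
# Answer those requests with 403 Forbidden
RewriteRule .* - [F]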

6:54 am on Feb 7, 2005 (gmt 0)

10+ Year Member



Google will find out. Do you really think they always identify themselves? How could they possibly stop cloaking?
7:03 am on Feb 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> How could they possibly stop cloaking?

True, but Bek isn't talking about cloaking.

The objective is simply to exclude Googlebot altogether - something robots.txt is designed to do and Googlebot will no doubt honour. There is no intent to deceive.

7:57 am on Feb 7, 2005 (gmt 0)

10+ Year Member



Well, first of all there *is* an intent to deceive. He is making two versions of a page, one for MSN, one for Google. This is cloaking without sharing a URL. If he really means a "page," it may well be doorway-making too.

Apart from mundane infelicities (you sacrifice links to one of these pages), I don't think it will work to keep Google out. I remember reading somewhere that Google doesn't promise not to visit pages it's excluded from; it only promises not to put them in the SERPs. That's how they keep people honest. After all, you could have a single website with three versions of every page at different URLs.

I'm sure this has been tried, and Google's caught it.

8:07 am on Feb 7, 2005 (gmt 0)

ciml - WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



> I remember reading somewhere that Google doesn't promise not to visit pages it's excluded from; it only promises not to put them in the SERPs.

I hope you didn't read that here, suidas!

The opposite is the case. If Google sees a link to a /robots.txt excluded URL, then Google will not fetch it. It can still list the URL in the results without having to fetch the page.

If a URL is not /robots.txt excluded but carries a META robots tag with 'noindex', then when Google fetches the URL it will not be listed.
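
For reference, that META robots tag is just one line inside the page's <head>:

<meta name="robots" content="noindex">

(Google also recognises name="googlebot" there, if I remember right, which would keep only Google out while still letting MSNbot index the page.)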

> sure this has been tried, and Google's caught it

That someone has a Web site, and that there's some part of the Web that shouldn't be crawled? I don't see how Google would see a quality issue there, unless they didn't like the site they were crawling.

The 'dual site' approach using Robots Exclusion Protocol would sacrifice links though.

8:35 am on Feb 7, 2005 (gmt 0)

powdork - WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



> The 'dual site' approach using Robots Exclusion Protocol would sacrifice links though.

But how important are links for a page to do well on MSN?