Did anybody try Allow in robots.txt?

Allow only Googlebot but no others

         

HitProf

8:03 pm on Jul 2, 2003 (gmt 0)

After reading this [webmasterworld.com] thread, I wondered whether anybody has actually tried:

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

as described here [google.com]?

Does it work? Do other spiders still obey the disallow?

I hope the link to the Google site works, as I had to compile it from the cache.

wilderness

10:43 pm on Jul 2, 2003 (gmt 0)

Allow is the default behavior in robots.txt, so the directive is not necessary.

In fact, I recall it being discussed as a "to be implemented" option that was never implemented.

Jim touched on it in his robots diet mail.
[webmasterworld.com...]

jdMorgan

1:20 am on Jul 3, 2003 (gmt 0)

HitProf,

The correct robots.txt syntax, which does not depend on Google-proprietary directives, would be:


User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /


A Disallow directive with a blank path (as above) means "Disallow nothing." A robot should obey the first record whose User-agent string matches its own name or "*", whichever comes first. That is why the Googlebot record is placed ahead of the catch-all record.
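One way to sanity-check the record above before deploying it is to run it through a robots.txt parser. The following is only a rough sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the bot name `SomeOtherBot`, and the example.com URL are made up for illustration. It shows how this one parser reads the file, which is no guarantee that every real spider behaves the same way.

```python
from urllib.robotparser import RobotFileParser

# The two records from the post above: Googlebot first, catch-all second.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: parse robots.txt text and ask whether
    the given user agent may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL; any path on the site is treated the same by these records.
    url = "http://www.example.com/page.html"
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

With this parser, the Googlebot record (blank Disallow) permits the fetch, while any agent that only matches "*" is refused.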

HTH,
Jim

HitProf

8:26 pm on Jul 3, 2003 (gmt 0)

Thanks, jdMorgan. What an excellent summary; I'll use it.
And thanks for the link, wilderness. I missed that one.