
Sitemaps, Meta Data, and robots.txt Forum

    
google allow - disallow all others
jimmy19




msg:1529026
 4:22 pm on Jul 19, 2004 (gmt 0)

Does anyone have a sample robots.txt file that allows Googlebot but disallows all other spiders from the whole web site?

Thanks for the help!

:)

 

wkitty42




msg:1529027
 2:07 pm on Jul 29, 2004 (gmt 0)

surely you've found the answer to this by now? ;)

sem4u




msg:1529028
 2:11 pm on Jul 29, 2004 (gmt 0)

You can only disallow individual spiders if you know their names, and each one has to be entered into the robots.txt file.

There is a good list here though:

[webmasterworld.com...]

;)

encyclo




msg:1529029
 2:41 pm on Jul 29, 2004 (gmt 0)

Welcome to WebmasterWorld, jimmy19.

The safest way would be to cloak robots.txt to deliver a disallow all to anything other than Googlebot.

The basic process is to use mod_rewrite to redirect requests for robots.txt to, say, robots.php. In that script, check the IP address and user agent string to identify Googlebot, then print the appropriate robots.txt declarations. You could even place every IP other than Googlebot's that requests robots.txt on a banned list to ensure that it can't spider the site.

Rather complicated, but the only sure way I know of!
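
As a rough illustration of the idea, here is a minimal sketch in Python rather than PHP. It assumes the web server already routes requests for robots.txt to a handler that calls robots_body() with the request's user agent and IP. The function names, the constants, and the reverse/forward DNS check are assumptions made for this example, not something specified in the post.

```python
import socket

# Bodies served to the two classes of visitor (names are hypothetical).
ALLOW_ALL = "User-agent: *\nDisallow:\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"


def is_googlebot(user_agent, ip,
                 reverse_lookup=socket.gethostbyaddr,
                 forward_lookup=socket.gethostbyname_ex):
    """Return True only if the UA claims Googlebot and DNS backs it up."""
    if "googlebot" not in user_agent.lower():
        return False
    try:
        # Reverse lookup: the IP should resolve to a Google crawler host.
        host = reverse_lookup(ip)[0]
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward lookup: that host should resolve back to the same IP.
        return ip in forward_lookup(host)[2]
    except OSError:
        # Any DNS failure is treated as "not Googlebot".
        return False


def robots_body(user_agent, ip, **lookups):
    """Pick the robots.txt text to print for this request."""
    return ALLOW_ALL if is_googlebot(user_agent, ip, **lookups) else DENY_ALL
```

The lookups are passed in as parameters only so the check can be exercised without live DNS. A banned list, as suggested above, would be a separate step layered on top of this.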

jimbeetle




msg:1529030
 3:01 pm on Jul 29, 2004 (gmt 0)

For "well-behaved" robots, those that obey robots.txt, this is the syntax recommended by robotstxt.org:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
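
One way to sanity-check these rules before deploying them is Python's standard urllib.robotparser module, which evaluates a robots.txt the way a well-behaved crawler would. This is only a sketch: the "SomeOtherBot" name and the test path are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, one line per list item.
RULES = [
    "User-agent: Googlebot",
    "Disallow:",
    "",
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(RULES)

# Googlebot matches its own record, where an empty Disallow permits everything.
googlebot_ok = parser.can_fetch("Googlebot", "/any/page.html")

# Any other agent falls through to the wildcard record and is blocked.
other_ok = parser.can_fetch("SomeOtherBot", "/any/page.html")

print(googlebot_ok, other_ok)
```

This only shows how a compliant parser reads the file. Robots that ignore robots.txt are not affected by it.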

jimmy19




msg:1529031
 7:20 am on Jul 30, 2004 (gmt 0)

I will take a look at these. Thank you for the replies...

Jim :)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved