Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
how to block all robots except for 2?
possible?
ionchannels

5+ Year Member



 
Msg#: 939 posted 3:32 pm on Jun 28, 2006 (gmt 0)

I am just getting hammered by so many useless bots. I want to block all bots except for Googlebot and Slurp. Is there any way to do this with robots.txt?
Thanks,
Christian

 

cheesehead2

5+ Year Member



 
Msg#: 939 posted 8:15 pm on Jun 28, 2006 (gmt 0)

This should work:

#block all
User-agent: *
Disallow: /
#except
User-agent: Slurp
Disallow:
User-agent: msnbot
Disallow:
User-agent: Googlebot
Disallow:

jimbeetle

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 939 posted 9:00 pm on Jun 28, 2006 (gmt 0)

You have that backwards. All bots, including Slurp, msnbot and Googlebot will read and obey the first Disallow.

Try this:

User-agent: Slurp
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

Slurp and Googlebot will go on their merry ways when they see the first directive, happily gobbling down pages. Others will see the second and be off to some other bloke's site.
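
For anyone who wants to check a file like the one above before uploading it, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `allowed`, the bot name `SomeOtherBot`, and the `www.example.com` URL are made up for illustration. It only shows how this one parser reads the rules; real crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt suggested in the post above, inlined so nothing is fetched.
ROBOTS_TXT = """\
User-agent: Slurp
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def allowed(agent, url="http://www.example.com/page.html"):
    """Return True if `agent` may fetch `url` under ROBOTS_TXT.

    `allowed` is a hypothetical helper, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "SomeOtherBot" stands in for any crawler not named in the file.
    for agent in ("Googlebot", "Slurp", "SomeOtherBot"):
        print(agent, "allowed" if allowed(agent) else "blocked")
```

With this parser, Googlebot and Slurp come back as allowed and the placeholder bot as blocked. Keep in mind that robots.txt is only a request, so bots that ignore it have to be blocked some other way.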

larryhatch

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 939 posted 9:11 pm on Jun 28, 2006 (gmt 0)

I never understood this. All I see are 'disallows'.
First 2 robots specifically, then a wild-card for the rest.
Why doesn't this disallow all of them? -Larry

kevinpate

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 939 posted 9:16 pm on Jun 28, 2006 (gmt 0)

What you're missing is the effect of what comes after the colon:

Disallow:
(the above line says disallow nothing on site.)

Disallow: /
(the above line says disallow everything on site.)
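
The difference between the two lines can also be checked in a few lines of Python, again as a rough sketch with the standard-library `urllib.robotparser`. The helper `fetch_permitted`, the agent name `AnyBot`, and the example URL are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser


def fetch_permitted(disallow_line, url="http://www.example.com/page.html"):
    """Parse a two-line robots.txt and ask whether a generic bot may fetch `url`.

    `fetch_permitted` and the agent name "AnyBot" are hypothetical.
    """
    parser = RobotFileParser()
    parser.parse(["User-agent: *", disallow_line])
    return parser.can_fetch("AnyBot", url)


if __name__ == "__main__":
    # Empty value: nothing is disallowed, so the page may be fetched.
    print(fetch_permitted("Disallow:"))
    # A lone slash: every path starts with "/", so everything is disallowed.
    print(fetch_permitted("Disallow: /"))
```

The first call reports the page as fetchable and the second reports it as blocked, which matches the explanation above.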

larryhatch

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 939 posted 9:17 am on Jun 29, 2006 (gmt 0)

Thanks KevinP: That makes perfect sense and explains everything. -Larry

ionchannels

5+ Year Member



 
Msg#: 939 posted 12:30 pm on Jun 29, 2006 (gmt 0)

Thanks guys, I will try that. It seems so simple, but I am always afraid of doing something that will discourage Googlebot. I didn't realize that the SE bots read robots.txt that way (sequentially, I mean). Very cool, thanks again.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved