
Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt files
Who to Disallow
Uniterra




msg:1529625
 9:04 pm on Feb 26, 2004 (gmt 0)

This might be a silly question, but does anyone disallow robots from visiting a site, and if so, which spiders do you disallow?

Any help appreciated...
Thanks,
Pete Prestipino

 

BarkerJr




msg:1529626
 2:38 am on Feb 27, 2004 (gmt 0)

For the "how", check out [searchengineworld.com...]

For the list of spiders you don't like, you have to decide that yourself. I block:
vscooter (Altavista Images)
fast
Googlebot-Image
These are bots that attempt to cache my images, which I don't like. Other people block other bots.
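
A minimal sketch of what blocking those three agents outright could look like, checked with Python's standard urllib.robotparser. The agent tokens are copied from the list above; whether each crawler honours exactly these tokens is an assumption, and "SomeOtherBot" and the image path are made-up names for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Agent tokens come from the post; the exact token each crawler
# actually matches on is an assumption and should be verified.
ROBOTS_TXT = """\
User-agent: vscooter
Disallow: /

User-agent: fast
Disallow: /

User-agent: Googlebot-Image
Disallow: /
"""


def allowed(agent, path, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# "SomeOtherBot" is a hypothetical crawler that no rule group names,
# so nothing in the file restricts it.
for agent in ("vscooter", "fast", "Googlebot-Image", "SomeOtherBot"):
    print(agent, allowed(agent, "/photos/cat.jpg"))
```

Keep in mind robots.txt is only a request: a bot that ignores the file is not stopped by it.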

keeper




msg:1529627
 3:01 am on Feb 27, 2004 (gmt 0)

I find the WebmasterWorld robots.txt file useful as a guide.

jdMorgan




msg:1529628
 3:51 am on Feb 27, 2004 (gmt 0)

As an alternative to blocking a useful robot such as fast, you can also put all of your "proprietary" images into a subdirectory, and then disallow that subdirectory. Example:

User-agent: fast
Disallow: /images/
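
One way to sanity-check a rule like this before uploading it is to feed it to Python's standard urllib.robotparser. This is only a sketch: the two file paths are invented for the example, and Python's matching is an approximation of how a real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The two rule lines are the ones from the example above.
parser = RobotFileParser()
parser.parse([
    "User-agent: fast",
    "Disallow: /images/",
])

# Hypothetical paths, used only to exercise the rule.
image_ok = parser.can_fetch("fast", "/images/logo.gif")
page_ok = parser.can_fetch("fast", "/index.html")

print("image allowed:", image_ok)
print("page allowed:", page_ok)
```

The Disallow value is a path prefix, so everything under /images/ is covered while pages outside it stay crawlable.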

Jim

BarkerJr




msg:1529629
 4:21 am on Feb 27, 2004 (gmt 0)
I refuse to redesign my website for a bot I've never heard of, coded by people who don't care about users who don't want their images indexed.

I emailed Fast and asked how to stop the bot from collecting images while still letting it collect HTML. They replied with the robots.txt code to disallow /.

I emailed Google and asked the same thing. One month later they introduced the Googlebot-Image agent. Obviously Google cares about webmasters and Fast doesn't.
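
To illustrate why a separate image agent helps, here is a rough sketch using Python's standard urllib.robotparser. It assumes the image crawler identifies itself as "Googlebot-Image" and the page crawler as "Googlebot"; the paths are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Block only the image agent; no group names the page crawler.
rules = [
    "User-agent: Googlebot-Image",
    "Disallow: /",
]

checker = RobotFileParser()
checker.parse(rules)

# Assumed agent names and hypothetical paths.
image_bot_blocked = not checker.can_fetch("Googlebot-Image", "/pics/me.jpg")
page_bot_allowed = checker.can_fetch("Googlebot", "/index.html")

print("image bot blocked:", image_bot_blocked)
print("page bot allowed:", page_bot_allowed)
```

With a single agent name, as in the reply from Fast, "Disallow: /" would shut out the HTML crawl along with the images.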

Pinetree




msg:1529630
 5:08 pm on Mar 1, 2004 (gmt 0)

I have 5 domain names all pointing to the same site.

Can I include all 5 URLs in the same robots.txt file?


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved