
Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt
How effective is it anyway
circuitjump




msg:1529156
 6:31 pm on May 10, 2001 (gmt 0)

Hi all,
Just want to find out what others think of robots.txt. The question is: "Is robots.txt helpful or not?" I know we all talk about it and use it, but I read posts here and elsewhere where some people say yes, it works, and others say no, it does not. So which is it?
If anyone has any input, please let me know.
Thanks

 

mivox




msg:1529157
 6:44 pm on May 10, 2001 (gmt 0)

robots.txt is essential to my site... I've got areas of the site I don't want spidered, and robots.txt is the only way to prevent 'good' spiders (like googlebot) from indexing those areas, while still allowing them to index the rest of the material.

robots.txt DOES work, IF:

1. You have all of your statements formatted correctly. Yesterday I had a spider plow through an area I *thought* was blocked, but since I had the line blocking that area written incorrectly, it didn't work.

After emailing the spider's owner (antarcti.ca), determining the problem and fixing it, the terrifically nice folks at antarcti.ca's tech dept. sent their spider through again, and my robots.txt worked like a charm.

2. The robot in question follows robots.txt conventions. All of the major search engines and important/good spiders DO follow robots.txt instructions...

Any robot I find that doesn't request a robots.txt file, or ignores *properly formatted* directions therein, is banned from my site via htaccess, and loud complaints are sent to its owner.
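
As a rough illustration of spotting robots that never request robots.txt, here is a minimal Python sketch that scans access-log lines. It assumes the Apache "combined" log format; the function name, the sample log lines, and the bot names are all hypothetical and only there to make the example runnable.

```python
import re
from collections import defaultdict

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def agents_without_robots_request(log_lines):
    """Return user agents that made requests but never fetched /robots.txt."""
    paths_by_agent = defaultdict(set)
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match:
            paths_by_agent[match.group("agent")].add(match.group("path"))
    return sorted(
        agent for agent, paths in paths_by_agent.items()
        if "/robots.txt" not in paths
    )


# Hypothetical sample data, not taken from any real log.
sample_log = [
    '192.0.2.1 - - [10/May/2001:18:44:00 +0000] "GET /robots.txt HTTP/1.0" 200 120 "-" "PoliteBot/1.0"',
    '192.0.2.1 - - [10/May/2001:18:44:05 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "PoliteBot/1.0"',
    '192.0.2.7 - - [10/May/2001:18:45:10 +0000] "GET /private/notes.html HTTP/1.0" 200 2048 "-" "RudeBot/0.1"',
]

print(agents_without_robots_request(sample_log))
```

Ordinary browsers never request robots.txt either, so a list like this is only a starting point for a manual look at each user agent, not a ban list by itself.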

circuitjump




msg:1529158
 7:55 pm on May 10, 2001 (gmt 0)

Thanks mivox for the input.
Now, how exactly do I go about writing a robots.txt that I know will work? Also, if any of y'all have a web resource that you think is very descriptive, please post it so that I can take a look at it.

Thanks,
Circuitjump
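
One way to check a robots.txt before relying on it is to run the rules through a parser locally. The sketch below uses Python's standard `urllib.robotparser`. The `/private/` directory, the example.com URLs, and the missing-colon mistake are assumptions made up for illustration; they are not the actual rules or the actual error discussed in this thread, and a real spider may parse a malformed file differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Well-formed: block one (hypothetical) directory for every robot.
GOOD = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical mistake: the colon after "Disallow" is missing.
BAD = """\
User-agent: *
Disallow /private/
"""

blocked_url = "http://www.example.com/private/page.html"
open_url = "http://www.example.com/index.html"

print("good rules, private page:", allowed(GOOD, "Googlebot", blocked_url))
print("good rules, public page: ", allowed(GOOD, "Googlebot", open_url))
print("bad rules, private page: ", allowed(BAD, "Googlebot", blocked_url))
```

With the well-formed rules the private page is reported as off limits and the rest of the site stays open. Python's parser skips the line with the missing colon, so the last check reports the private page as fetchable, which is the kind of silent failure a badly written line can cause.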

physics




msg:1529159
 8:49 pm on May 10, 2001 (gmt 0)

Welcome to WebmasterWorld, circuitjump. Why not try SearchEngineWorld's own Robots.txt Tutorial [searchengineworld.com]? It's a great resource. Also, try:

Robots.txt Validator [searchengineworld.com]

Robots Exclusion Meta Tag [searchengineworld.com] Using robots metatags.

Robots.txt : The Big Crawl [searchengineworld.com] We recently spidered 2 million robots.txt files and found a surprising number of problems.

Robots Exclusion Standard rfc4 [info.webcrawler.com].

Root of Robots Exclusion Standard [info.webcrawler.com] directory with some interesting files.

Search Indexing Robots and Robots.txt [searchtools.com] article at searchtools.com.
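
For the robots metatags mentioned in the list above, here is a small Python sketch that pulls the directives out of a page's `<meta name="robots">` tag using the standard `html.parser` module. The class name, the helper function, and the sample page are hypothetical; this only shows what the tag looks like and how a script might read it, not how any particular search engine handles it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for illustration.
page = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "</head><body></body></html>"
)

print(sorted(robots_directives(page)))
```

Unlike robots.txt, the meta tag sits inside each page, so it can be used on pages where you cannot edit the site-wide robots.txt file.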

Cheers :)

mivox




msg:1529160
 12:03 am on May 11, 2001 (gmt 0)

You could also look at mine here [absak.com], since it's just gone through the wringer and gotten all fixed up...

circuitjump




msg:1529161
 2:35 pm on May 11, 2001 (gmt 0)

Thank you all


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved