How effective is it anyway?
| 6:31 pm on May 10, 2001 (gmt 0)|
Just want to find out what others think of robots.txt. The question is: is robots.txt helpful or not? I know we all talk about it and use it, but I read posts here and elsewhere where some people say it works and others say it does not. So which is it?
If anyone has any input, please let me know.
| 6:44 pm on May 10, 2001 (gmt 0)|
robots.txt is essential to my site... I've got areas of the site I don't want spidered, and robots.txt is the only way to prevent 'good' spiders (like googlebot) from indexing those areas, while still allowing them to index the rest of the material.
robots.txt DOES work, IF:
1. You have all of your statements formatted correctly. Yesterday I had a spider plow through an area I *thought* was blocked, but since I had the line blocking that area written incorrectly, it didn't work.
After emailing the spider's owner (antarcti.ca), determining the problem and fixing it, the terrifically nice folks at antarcti.ca's tech dept. sent their spider through again, and my robots.txt worked like a charm.
2. The robot in question follows robots.txt conventions. All of the major search engines and important/good spiders DO follow robots.txt instructions...
Any robot I find that doesn't request a robots.txt file, or ignores *properly formatted* directions therein, is banned from my site via htaccess, and loud complaints are sent to its owner.
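To see how much the formatting matters, here is a minimal sketch using Python's standard urllib.robotparser module. The file contents, the /private/ path, and the www.example.com address are made up for illustration, and the missing colon is only one hypothetical way a line can be written incorrectly (not necessarily the mistake described above). Python's parser stands in for a well-behaved spider here; real spiders may treat malformed lines differently.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Hypothetical, correctly formatted file: every robot is kept out of /private/.
GOOD = """\
User-agent: *
Disallow: /private/
"""

# Same file with a hypothetical mistake: the colon after "Disallow" is missing,
# so this parser skips the line and no rule is recorded.
BAD = """\
User-agent: *
Disallow /private/
"""

SITE = "http://www.example.com"  # placeholder address

print(allowed(GOOD, "googlebot", SITE + "/private/notes.html"))
print(allowed(GOOD, "googlebot", SITE + "/index.html"))
print(allowed(BAD, "googlebot", SITE + "/private/notes.html"))
```

With the good file the blocked area is refused and the rest of the site stays open; with the bad file the "blocked" area is reported as fetchable, without any error message.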
| 7:55 pm on May 10, 2001 (gmt 0)|
Thanks mivox for the input.
Now, how exactly do I go about writing a robots.txt that I know will work? Also, if any of y'all have a web resource that you think is very descriptive, please post it so that I can take a look at it.
| 8:49 pm on May 10, 2001 (gmt 0)|
Welcome to WebmasterWorld, circuitjump. Why not try SearchEngineWorld's own Robots.txt Tutorial [searchengineworld.com]? It's a great resource. Also, try:
Robots.txt Validator [searchengineworld.com]
Robots Exclusion Meta Tag [searchengineworld.com] Using robots metatags.
Robots.txt : The Big Crawl [searchengineworld.com] We recently spidered 2 million robots.txt files and found a surprising number of problems.
Robots Exclusion Standard rfc4 [info.webcrawler.com].
Root of Robots Exclusion Standard [info.webcrawler.com] directory with some interesting files.
Search Indexing Robots and Robots.txt [searchtools.com] article at searchtools.com.
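For a quick local check before running a file through a full validator, something like the rough sketch below may help. It is not how the validator linked above works; it only flags a few obvious slips (a missing colon, a misspelled field name, a Disallow line with no User-agent before it), and the function name lint_robots and the sample lines are invented for illustration.

```python
# Fields from the original exclusion standard; extensions are deliberately left out.
KNOWN_FIELDS = {"user-agent", "disallow"}

def lint_robots(text: str) -> list[str]:
    """Return a list of human-readable problems found in robots.txt text."""
    problems = []
    seen_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, sep, _value = line.partition(":")
        field = field.strip().lower()
        if not sep:
            problems.append(f"line {number}: missing ':' after field name")
        elif field not in KNOWN_FIELDS:
            problems.append(f"line {number}: unknown field '{field}'")
        elif field == "user-agent":
            seen_agent = True
        elif not seen_agent:
            problems.append(f"line {number}: Disallow before any User-agent")
    return problems

# Hypothetical file with two slips: a misspelled field and a missing colon.
sample = "User-agent: *\nDissalow: /cgi-bin/\nDisallow /private/\n"
for problem in lint_robots(sample):
    print(problem)
```

An empty result only means none of these particular slips were found, not that every spider will read the file the way you intend.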
| 12:03 am on May 11, 2001 (gmt 0)|
You could also look at mine here [absak.com], since it's just gone through the wringer and gotten all fixed up...
| 2:35 pm on May 11, 2001 (gmt 0)|
Thank you all