

Robots.txt

How effective is it anyway

     

circuitjump

6:31 pm on May 10, 2001 (gmt 0)

10+ Year Member



Hi all,
Just want to find out what others think of robots.txt. The question is: is robots.txt helpful or not? I know we all talk about it and use it, but I read posts here and elsewhere where some people say it works and others say it doesn't. So which is it?
If anyone has any input, please let me know.
Thanks

mivox

6:44 pm on May 10, 2001 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



robots.txt is essential to my site... I've got areas of the site I don't want spidered, and robots.txt is the only way to prevent 'good' spiders (like googlebot) from indexing those areas, while still allowing them to index the rest of the material.
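To make the "block some areas, allow the rest" idea concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, paths, and example.com URLs are invented for illustration and are not taken from any poster's actual file. It only shows what a rule-following robot would conclude from the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every robot out of two areas, allow everything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if a robot that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    for url in (
        "http://www.example.com/private/notes.html",
        "http://www.example.com/products/index.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, "googlebot", url))
```

Nothing here enforces anything on the server side. A robot that ignores the file is not stopped by it.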

robots.txt DOES work, IF:

1. You have all of your statements formatted correctly. Yesterday I had a spider plow through an area I *thought* was blocked, but since I had the line blocking that area written incorrectly, it didn't work.

After emailing the spider's owner (antarcti.ca), determining the problem and fixing it, the terrifically nice folks at antarcti.ca's tech dept. sent their spider through again, and my robots.txt worked like a charm.

2. The robot in question follows robots.txt conventions. All of the major search engines and important/good spiders DO follow robots.txt instructions...

Any robot I find that doesn't request a robots.txt file, or ignores *properly formatted* directions therein, is banned from my site via htaccess, and loud complaints are sent to its owner.
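Point 1 above is easy to reproduce. The sketch below feeds a well-formed file and a broken one to Python's standard urllib.robotparser. The specific mistake (a missing colon after Disallow) and the paths are assumptions picked for illustration, not the actual error described in this post. Other robots may treat a malformed line differently, so treat this as one parser's behavior only.

```python
from urllib.robotparser import RobotFileParser

# Well-formed: the Disallow field has its colon.
GOOD = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical typo: the colon after Disallow is missing.
BROKEN = """\
User-agent: *
Disallow /private/
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if Python's parser says the robot may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    url = "http://www.example.com/private/notes.html"
    print("good file allows it:  ", is_allowed(GOOD, "anybot", url))
    print("broken file allows it:", is_allowed(BROKEN, "anybot", url))
```

With this parser the malformed line is skipped without any warning, so the area you meant to block is treated as open. That is why running a file through a validator before relying on it is worth the minute it takes.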

circuitjump

7:55 pm on May 10, 2001 (gmt 0)

10+ Year Member



Thanks mivox for the input.
Now, how exactly do I go about writing a robots.txt that I know will work? Also, if any of y'all have a web resource that you think is very descriptive, please post it so that I can take a look at it.

Thanks,
Circuitjump

physics

8:49 pm on May 10, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld, circuitjump. Why not try SearchEngineWorld's own Robots.txt Tutorial [searchengineworld.com]? It's a great resource. Also, try:

Robots.txt Validator [searchengineworld.com]

Robots Exclusion Meta Tag [searchengineworld.com] Using robots metatags.

Robots.txt : The Big Crawl [searchengineworld.com] We recently spidered 2 million robots.txt files and found a surprising number of problems.

Robots Exclusion Standard rfc4 [info.webcrawler.com].

Root of Robots Exclusion Standard [info.webcrawler.com] directory with some interesting files.

Search Indexing Robots and Robots.txt [searchtools.com] article at searchtools.com.

Cheers :)

mivox

12:03 am on May 11, 2001 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



You could also look at mine here [absak.com], since it's just gone through the wringer and gotten all fixed up...

circuitjump

2:35 pm on May 11, 2001 (gmt 0)

10+ Year Member



Thank you all
 
