How does robots.txt work?

Forum Moderators: goodroi

Acternaweb

4:01 pm on Feb 23, 2001 (gmt 0)

In our head content we have, among other content, the following for robots:
<meta name="Robots" content="index">
<meta name="Robots" content="follow">

Looking at the files in Net Tracker, the search engines' robots come to robots.txt, but only for a short time.
My question is: how do they index the site? What exactly are the spiders looking for in the robots.txt file?

Let me know if you need more clarification.

Thanks,

Paul

Acternaweb

4:07 pm on Feb 23, 2001 (gmt 0)

Sorry, forgot to add: this is what the robots.txt file currently has:

User-Agent: *
Disallow: *_private*
Disallow: *_vti*

Thanks,
PG

Hope

4:26 pm on Feb 23, 2001 (gmt 0)

I think you need to take a good hard look at this site.

[info.webcrawler.com...]

This is the best information you are going to find on robots.txt.

WebGuerrilla

2:59 am on Feb 28, 2001 (gmt 0)

"In our head content we have, among other content, the following for robots:
<meta name="Robots" content="index">
<meta name="Robots" content="follow">"

These meta tags probably don't cause any harm, but they have absolutely no effect on how often your site gets spidered or how many pages get crawled. Some engines do honor the noindex meta tag, but others will ignore it. All credible engines honor robots.txt. That is why you will see so many requests for it in your logs. Before a spider begins to crawl, it checks that file to see if there are any pages it shouldn't crawl.
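
To make the "check the file before crawling" step concrete, here is a minimal sketch of what a well-behaved spider might do, using Python's standard urllib.robotparser module. The bot name "ExampleBot", the example.com URLs, and the two Disallow paths are assumptions made up for illustration; they are not the rules posted earlier in this thread. Note also that this parser matches rules as plain path prefixes, so it would not treat patterns like *_private* as wildcards.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, written with plain path prefixes.
# These paths are illustrative only and are not taken from the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /_private/
Disallow: /_vti_bin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that the spider has already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules let this user agent request the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A spider would run this check for every URL before requesting it.
for url in (
    "http://www.example.com/index.html",
    "http://www.example.com/_private/notes.html",
):
    verdict = "crawl" if allowed(parser, "ExampleBot", url) else "skip"
    print(verdict, url)
```

This would also explain the short visits in the logs: the spider only needs to download and parse one small file, then it moves on to the pages that the rules allow.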