Forum Moderators: goodroi
In the meantime you can take a look at WebmasterWorld's own
Robots Checker [searchengineworld.com]
It validates a robots.txt file and has some nice info on robots-txt.
The Disallow value can be a full path or a partial path; any URL whose path starts with that value will not be retrieved. For example, Disallow: /help disallows both /help.html and /help/index.html, whereas Disallow: /help/ would disallow /help/index.html but allow /help.html.
User-agent: *
Disallow: /this.htm
That would block not only:
/this.htm
but also anything starting with that prefix, such as:
/this.htm/rocks
The problem is that I don't believe all spiders follow the spec that way. Some seem to apply looser, sliding regex-style matching rather than a strict leading-prefix match.
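To make the prefix rule concrete, here is a minimal sketch of spec-style Disallow matching in Python. The function name and test paths are just illustrative; real robots.txt handling (grouping rules by User-agent, etc.) is more involved.

```python
# Spec-style robots.txt matching: a Disallow value is a plain
# path prefix, not a regex or substring pattern.

def is_disallowed(path: str, disallow_value: str) -> bool:
    """A URL path is blocked if it starts with the Disallow value."""
    return path.startswith(disallow_value)

# Disallow: /help blocks both of these...
print(is_disallowed("/help.html", "/help"))        # True
print(is_disallowed("/help/index.html", "/help"))  # True
# ...but Disallow: /help/ leaves /help.html allowed.
print(is_disallowed("/help.html", "/help/"))       # False
```

Spiders that do looser pattern matching would diverge from this behavior, which is why testing against a validator is worthwhile.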