What does Y! not understand in:
User-agent: *
Disallow: /somedirectory/
Anyone else seen this before?
Also, make sure the syntax of your file is 100% correct: all comments on separate lines starting with "#", and one and only one blank line after each record (including the last one).
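To illustrate, a file following those rules could be as simple as the sketch below (the directory name is just the placeholder from the original post), with the comment on its own line and a single blank line after the record:

# keep all robots out of /somedirectory/
User-agent: *
Disallow: /somedirectory/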
Also, it's possible that Slurp has not yet processed your new robots.txt file. I prefer to post a new robots.txt file at least 24 hours before adding any content that I don't want spidered.
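If you want to check what a compliant crawler should see once it does re-read the file, here is a quick sketch using Python's standard urllib.robotparser to test a user-agent against a URL -- the host and path are placeholders, not a real site:

from urllib.robotparser import RobotFileParser

# Fetch and parse the live robots.txt (placeholder host).
rp = RobotFileParser("http://www.example.com/robots.txt")
rp.read()

# Should print False if /somedirectory/ is disallowed for Slurp.
print(rp.can_fetch("Slurp", "http://www.example.com/somedirectory/page.html"))

This only shows how the file parses; it can't tell you whether Slurp has refreshed its own cached copy yet.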
None of this may be applicable -- just taking some guesses based on what you posted.
I have noticed that Slurp tries to fetch indexes for directories in which it has found content. That is, if it finds a link to /pages/foo.html, it may try to fetch /pages/. On Apache servers with "Options -Indexes" set, this results in a 403 Forbidden response; similarly configured IIS servers probably do the same. However, Slurp does seem to honor robots.txt even when doing this -- I've only seen it happen when fetching pages in that directory is allowed, and the only strange thing about it is that it's requesting a directory index that isn't linked to anywhere on the Web.
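For reference, the Apache setting mentioned above is normally a one-line directive; a sketch, assuming a typical <Directory> block in httpd.conf and a made-up path:

<Directory "/var/www/html/pages">
    # No autoindex: a request for /pages/ with no index file returns 403 Forbidden
    Options -Indexes
</Directory>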
Jim
Make sure, too, that you don't have any other records specific to Slurp, because if you do, it will honor only the first one.
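As an illustration (the directives and directory are placeholders): if the file contained both of the records below, Slurp would apply only the record addressed to it by name and would never see the Disallow line under User-agent: *.

# Slurp matches this record and ignores everything else
User-agent: Slurp
Crawl-delay: 10

# Slurp never reaches this record
User-agent: *
Disallow: /somedirectory/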