Forum Moderators: goodroi
I admit that this question is probably a little paranoid, but then I am paranoid about screwing with my robots.txt file and stopping those nice robots from spidering my site.
Q. Is it permissible to Disallow one sub-directory within a directory that one wants to have spidered?
For example: I recently added an 'admin' directory within a directory called 'reference'.
I want robots to continue spidering 'reference' but to exclude 'admin', by using the following:
User-agent: *
Disallow: /reference/admin/
However, I'm afraid that if 'reference' even appears in the path of any disallowed directory then it, too, will be excluded. Will it ... or will the above exclude 'admin' only?
Can anyone help with this? I can't find any examples in tutorials that list more than one directory level.
Thanks
Finlay
This example "/robots.txt" file specifies that no robots should visit any URL starting with "/cyberworld/map/", except the robot called "cybermapper":

# robots.txt for [example.com...]
User-agent: *
Disallow: /cyberworld/map/ # This is an infinite virtual URL space

# Cybermapper knows where to go.
User-agent: cybermapper
Disallow:
Disallow: /reference/admin/
This should disallow only URLs that start with /reference/admin/; /reference/ itself will still be crawled.
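You can check this behavior yourself with Python's standard-library robots.txt parser. A minimal sketch, assuming the rules from the question above (the example.com URLs are placeholders, not from the original post):

```python
from urllib import robotparser

# The rules proposed in the question: block only /reference/admin/.
rules = """\
User-agent: *
Disallow: /reference/admin/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# /reference/ itself remains crawlable...
print(rp.can_fetch("*", "http://example.com/reference/index.html"))       # True
# ...but anything whose path starts with /reference/admin/ is blocked.
print(rp.can_fetch("*", "http://example.com/reference/admin/users.php"))  # False
```

Disallow rules are matched as path prefixes, so only paths beginning with /reference/admin/ are excluded; the rest of /reference/ is unaffected.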
Here's part of mine:
User-agent: *
Disallow: /cart
Disallow: /contact
Disallow: /photos/detail.php
Disallow: /browse/result.php
/browse/result.php is disallowed, but /browse/view.php is indexed.
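One thing worth noting about rules like "Disallow: /cart" (no trailing slash): they match by prefix, so they also block any path that merely starts with that string. A quick sketch using part of the file above (the example.com URLs and the /cartoons path are hypothetical, just to illustrate the prefix effect):

```python
from urllib import robotparser

# Part of the robots.txt quoted above.
rules = """\
User-agent: *
Disallow: /cart
Disallow: /browse/result.php
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# "Disallow: /cart" is a prefix match: /cart, /cart/..., and even /cartoons match.
print(rp.can_fetch("*", "http://example.com/cart/checkout"))    # False
print(rp.can_fetch("*", "http://example.com/cartoons"))         # False (prefix!)
# Under /browse/, only the one listed script is blocked.
print(rp.can_fetch("*", "http://example.com/browse/view.php"))  # True
```

If you want to block only the /cart directory and nothing else, writing "Disallow: /cart/" with the trailing slash avoids catching unrelated paths like /cartoons.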