
directory


sylvieg

3:40 pm on Jan 4, 2006 (gmt 0)

10+ Year Member



I want to exclude a directory named dir that can be at the root level or deeper, e.g. /dir/, /dir2/dir/, /dir3/dir/.
Do I have to put
Disallow: /dir/
or
Disallow: dir/
or something else?
Thanks for the help.

Dijkgraaf

10:11 am on Jan 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The second one is wrong; every path in a Disallow rule has to start with /.
It always has to be from the root level, because a rule disallows "URLs starting with" that path.

So the following are what is needed for your examples
Disallow: /dir
Disallow: /dir2/dir
Disallow: /dir3/dir

Note that I didn't put a trailing slash.
Some web servers will actually serve up the default page in that directory for a request of
GET /dir2/dir
and some of the less smart bots would not match that URL to the rule
Disallow: /dir2/dir/
So it is safer not to have the trailing slash.
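
If you want to sanity-check the trailing-slash behaviour, here is a minimal sketch using Python's urllib.robotparser, which implements the classic prefix-matching rules (the host name and paths are just placeholders):

from urllib import robotparser

# Hypothetical rules, written without trailing slashes as suggested above.
rules = """\
User-agent: *
Disallow: /dir
Disallow: /dir2/dir
Disallow: /dir3/dir
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

base = "http://www.example.com"  # placeholder host
# Matching is "URL path starts with the rule text", so the rule without
# a trailing slash blocks the bare directory URL and everything in it:
for path in ("/dir", "/dir/", "/dir/page.html", "/dir2/dir", "/dir2/dir/"):
    print(path, rp.can_fetch("*", base + path))   # all False (disallowed)

# By contrast, a rule WITH the trailing slash misses the slashless URL:
rp2 = robotparser.RobotFileParser()
rp2.parse(["User-agent: *", "Disallow: /dir2/dir/"])
print(rp2.can_fetch("*", base + "/dir2/dir"))     # True: slips through

# One caveat of dropping the slash: prefix matching also catches
# /directory.html, because it starts with /dir too.
print(rp.can_fetch("*", base + "/directory.html"))  # also False

So the slashless rule catches both forms of the URL, at the cost of also matching any path that merely begins with the same characters.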

sylvieg

12:02 pm on Jan 6, 2006 (gmt 0)

10+ Year Member



Thanks for the reply - and for the trailing-slash trick.
Since I am working on software that can be installed in any directory, do you think I can use
Disallow: /*/dir

And BTW, I am wondering if the same rule applies to files. Would
Disallow: myfile.html
disallow every file matching /*/myfile.html?

Dijkgraaf

4:38 am on Jan 11, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No, you can't use wildcards like that.
They aren't part of the robots.txt standard.
In fact, the standard allows no wildcards at all. Some search engines do support wildcards for things like file extensions, but none that I know of would correctly interpret the wildcards the way you have them.
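
For what it's worth, you can watch a strict old-standard parser do exactly this with Python's urllib.robotparser, which follows the original prefix-matching rules: the * is taken literally, and a rule that doesn't start with / never matches anything (host and paths are made up):

from urllib import robotparser

# The two hypothetical rules asked about above.
rules = """\
User-agent: *
Disallow: /*/dir
Disallow: myfile.html
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

base = "http://www.example.com"  # placeholder host
# A prefix-matching parser treats "*" as a literal character, so this
# URL is NOT blocked by "Disallow: /*/dir":
print(rp.can_fetch("*", base + "/somedir/dir/"))         # True (allowed)

# And a rule without a leading "/" can never match a URL path,
# so "Disallow: myfile.html" blocks nothing:
print(rp.can_fetch("*", base + "/somedir/myfile.html"))  # True (allowed)
print(rp.can_fetch("*", base + "/myfile.html"))          # True (allowed)

So with a parser like this you would need one explicit Disallow line per actual directory, as in the earlier reply.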