Robots.txt and the wildcard character

         

grnidone

7:46 pm on Oct 3, 2001 (gmt 0)



Can you use the * to specify which files to disallow, rather than only to disallow everything?

User-agent: PITA_bot
Disallow: /whatever/bla*/

User-agent: PITA_bot
Disallow: /whatever/bla/*

User-agent: PITA_bot
Disallow: /whatever/bla*

The top one says to PITA_bot, "Don't index any directory inside the /whatever/ directory whose name starts with 'bla'."

It seems to me the second one says, "Don't index dirs underneath /whatever/bla/", which would still mean the page that sits at:
[yada.com...]
would be spidered.

The third one ... ?

Brett_Tabke

6:56 am on Oct 4, 2001 (gmt 0)




Wildcards are not supported, but wildcard-like behavior is:
[searchengineworld.com...]

Disallow: The value of this field specifies a partial URL that is not to be visited. This can be a full path, or a partial path; any URL that starts with this value will not be retrieved. For example, Disallow: /help disallows both /help.html and /help/index.html, whereas Disallow: /help/ would disallow /help/index.html but allow /help.html.
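
To make the "starts with" rule quoted above concrete, here is a minimal Python sketch of how a robot that follows only that rule might test a path. The function name is_disallowed is hypothetical and used only for illustration; it is not taken from any real crawler, and actual bots may behave differently (some support wildcards as an extension).

```python
def is_disallowed(path, disallow_rules):
    """Return True if `path` starts with any non-empty Disallow value.

    Assumption: plain prefix matching only, as in the quoted text.
    A '*' inside a rule is treated as a literal character, not a wildcard.
    """
    return any(rule and path.startswith(rule) for rule in disallow_rules)


# The examples from the quoted text:
print(is_disallowed("/help.html", ["/help"]))         # True
print(is_disallowed("/help/index.html", ["/help"]))   # True
print(is_disallowed("/help/index.html", ["/help/"]))  # True
print(is_disallowed("/help.html", ["/help/"]))        # False

# Under this rule a literal '*' matches nothing useful,
# while the bare prefix already covers everything starting with "bla":
print(is_disallowed("/whatever/blah/page.html", ["/whatever/bla*"]))  # False
print(is_disallowed("/whatever/blah/page.html", ["/whatever/bla"]))   # True
```

So under plain prefix matching, Disallow: /whatever/bla (no asterisk) would already give the "anything starting with bla" effect asked about in the first post.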