Sanenet

msg:1527695 | 2:01 pm on Jun 16, 2005 (gmt 0) |
* doesn't work as a wildcard in a Disallow line in robots.txt. The syntax to disallow a directory is simply:

User-agent: *
Disallow: /store/scripts/

which would block every compliant robot from crawling anything in scripts. These two lines, on the other hand:

Disallow: /store/scripts/emailFriend.asp*
Disallow: /store/scripts/contactUs.asp?emailSubject*

are either going to be ignored, or will lead to just those two pages being ignored. See www.robotstxt.org
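For anyone who wants to see the plain prefix rule in action, here is a minimal Python sketch, assuming a robot that follows the original standard and treats the Disallow value as a literal prefix. The function name `is_disallowed` is made up for illustration and is not part of any library.

```python
def is_disallowed(path, disallow_rules):
    """Literal prefix matching: a rule blocks any path that starts with it."""
    # An empty Disallow value means "nothing is blocked", so empty rules are skipped.
    return any(bool(rule) and path.startswith(rule) for rule in disallow_rules)


# The directory rule covers everything beneath /store/scripts/.
directory_rule = ["/store/scripts/"]
print(is_disallowed("/store/scripts/emailFriend.asp", directory_rule))
print(is_disallowed("/store/index.asp", directory_rule))

# Assumption: the robot reads "*" as an ordinary character, so this rule
# only matches a path that really contains an asterisk at that position.
literal_star_rule = ["/store/scripts/emailFriend.asp*"]
print(is_disallowed("/store/scripts/emailFriend.asp?id=1", literal_star_rule))
```

Under that assumption the starred rule matches no real URL, which is one way it can end up being "ignored". Individual robots may handle it differently.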
|
jimsthoughts

msg:1527696 | 8:28 pm on Jun 16, 2005 (gmt 0) |
| Is either going to be ignored, or will lead to just those two pages being ignored. |
| Or could it wipe out the whole directory under that page?
|
Sanenet

msg:1527697 | 10:20 pm on Jun 16, 2005 (gmt 0) |
It COULD... but it shouldn't. According to the specs, the * should be ignored.
|
Reid

msg:1527698 | 3:54 pm on Jun 21, 2005 (gmt 0) |
A wildcard at the end of a line is pointless. Disallow: /store/scripts/emailFriend.asp* is no different from Disallow: /store/scripts/emailFriend.asp, since any URL beginning with "/store/scripts/emailFriend.asp" will be disallowed anyway.

Only Googlebot (and a select few others) allow a wildcard in the Disallow line, so wildcard rules should be directed only at those specific robots. You should never use a wildcard in the Disallow field under User-agent: *

This robots.txt would cause an error for all bots except Googlebot, and for Googlebot it would be pointless, since any characters after the end of the pattern are included in a match anyway.
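To illustrate why the trailing wildcard changes nothing, here is a rough Python sketch of Googlebot-style matching, assuming "*" matches any run of characters and a trailing "$" anchors the end of the URL. The function `wildcard_match` is a hypothetical helper written for this post, not Google's actual implementation.

```python
import re


def wildcard_match(pattern, path):
    """Return True if path is matched by a wildcard-style Disallow pattern."""
    # Assumption: a trailing "$" means the pattern must match the whole path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each "*" stand for any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only anchors at the start, so an unanchored pattern acts as a prefix.
    return re.match(regex, path) is not None


with_star = "/store/scripts/emailFriend.asp*"
without_star = "/store/scripts/emailFriend.asp"

for path in ("/store/scripts/emailFriend.asp",
             "/store/scripts/emailFriend.asp?id=1",
             "/store/index.asp"):
    # Both patterns give the same answer for every path tried here.
    print(path, wildcard_match(with_star, path), wildcard_match(without_star, path))
```

Because an unanchored pattern already behaves as a prefix, appending "*" adds nothing. The wildcard only earns its keep in the middle of a pattern.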
|
|