Can I disallow an entire directory but one file?
too many files to list, plus I'd be advertising them
walkman msg:1526243 5:06 am on Feb 24, 2004 (gmt 0)
and some are best if left unlisted. However, I'd like to have search engines follow the links in one file from that directory.
Is there a way, other than listing them one by one?
closed msg:1526244 8:14 pm on Feb 25, 2004 (gmt 0)
Yeah. You could have a Disallow line for the directory, then an Allow line right after that for the file. I'm fairly sure Googlebot does that. You'd have to do your research to see which robots support Allow statements that way, though.
tschild msg:1526245 8:59 pm on Feb 25, 2004 (gmt 0)
Another way (that does not use the nonstandard Allow directive) would be to use the property of Disallow directives to match any file whose path begins with the specified term.
Example: your files are
/directory/b2345yy.html
/directory/b2433zz.html
/directory/c8768aa.html
You want to block spidering of all those files except for /directory/b2433zz.html.
User-agent: *
Disallow: /directory/a
Disallow: /directory/b23
Disallow: /directory/c
This disallows all other files without specifying their full path.
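The prefix-matching behavior described above can be checked with a few lines of Python. This is only a minimal sketch that models Disallow as a plain string-prefix test; `is_blocked` and `DISALLOW_PREFIXES` are hypothetical names used for illustration, and real crawlers apply further rules (per-agent groups, URL normalization) that are ignored here.

```python
# Simplified model: a path is blocked if it starts with any Disallow value.
# (Hypothetical helper for illustration, not a full robots.txt parser.)
DISALLOW_PREFIXES = ["/directory/a", "/directory/b23", "/directory/c"]


def is_blocked(path, prefixes=DISALLOW_PREFIXES):
    """Return True if `path` begins with one of the Disallow prefixes."""
    return any(path.startswith(prefix) for prefix in prefixes)


for path in ("/directory/b2345yy.html",
             "/directory/b2433zz.html",
             "/directory/c8768aa.html"):
    print(path, "blocked" if is_blocked(path) else "allowed")
```

Under this model only /directory/b2433zz.html comes out as allowed, because none of the listed prefixes matches a path beginning with /directory/b24.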
walkman msg:1526246 9:21 pm on Feb 25, 2004 (gmt 0)
No need for a * at the end? Just the first letter or two, and every matching file in /directory is excluded? I'll do a through z and just leave the o out, since that file starts with it.
This seems like a nice workaround. The Allow directive did not validate.
BarkerJr msg:1526247 3:43 am on Feb 26, 2004 (gmt 0)
No, don't use an asterisk anywhere unless you have the asterisk character in the filename. Most spiders do not support wildcards anywhere in robots.txt (too expensive?). The asterisk in the User-agent line is not a wildcard; it's just a character that represents all spiders.
walkman msg:1526248 4:13 am on Feb 26, 2004 (gmt 0)
Thank you. Done. I have blocked /dir/a through /dir/z, except for the one I need. All in the correct format, of course; I validated it and everything. Thanks again for your help.
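For anyone who would rather not type the a-to-z list by hand, the approach settled on in this thread could be generated with a short script. This is a rough sketch under stated assumptions: `disallow_lines` is a hypothetical helper, /dir is the placeholder directory name used above, and "o" is the first letter of the file that should stay crawlable.

```python
import string


def disallow_lines(directory, keep_letter):
    """Build one Disallow line per lowercase letter, skipping `keep_letter`.

    Hypothetical helper: `directory` is the path to protect (no trailing
    slash) and `keep_letter` is the first letter of the file to leave open.
    """
    return [
        f"Disallow: {directory}/{letter}"
        for letter in string.ascii_lowercase
        if letter != keep_letter
    ]


# Print a complete record that applies to all spiders.
print("User-agent: *")
print("\n".join(disallow_lines("/dir", "o")))
```

Note that any other file in the directory whose name starts with the skipped letter stays crawlable as well, and names beginning with a digit or an uppercase letter are not covered by this list; a longer prefix (as in the b23 example earlier in the thread) narrows the gap.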