Busynut - 1:06 am on Nov 4, 2002 (gmt 0)
I need to study regular expressions because every time I think I'm getting it, I look at all the characters/symbols and get confused all over again.

[begin childish rant] I do allow Google and everyone else to visit/spider/cache any of my pages in the "top" section of my site. This section basically just contains explanatory info for the rest of the site. However, there are certain directories I don't want anyone to cache, save, etc. I just want them to be viewed by legitimate visitors.

See, certain sections of my site have become moderately popular due to a good deal of positive attention from an About.com guide - it's a humorous graphic section (all family friendly, I assure you!). Although I'm enjoying the popularity, I've begun to see my images crop up in all kinds of places... and they didn't get there through legitimate visitors to MY site. I don't want any of the image search engines caching these pages... they're welcome to cache the page that 'explains' what my site contains... but if people are going to use my graphics I'd rather they get them from my site legitimately, if you know what I mean. It seems a losing battle at times, and clearly I'm way behind in implementing effective techniques. [end of rant]

Back to the htaccess. I implemented your suggestion above (msg 9), and it seems to work fine EXCEPT that while pretending to be bad bots I was still able to view pages in the lower directory (which contained the hotlinking rewrite rule). So... I seem to have fixed that for the time being by just putting the same list of bad bots in both htaccess files; the only difference between the files is that the lower one contains the hotlinking rule. Also, I got rid of the extra 403 page (the explanatory one) and am just using a 403.html page (I included the explanation there).
But I'm still getting the error:

Additionally, a 403 Forbidden error was encountered while trying to use an ErrorDocument to handle the request.

[sigh]

So I'm still doing something wrong and I'm not going to give up on this because I'm very stubborn :) even if it means reading htaccess, mod-rewrite, and regex rules until I'm blurry eyed. I'll study your last suggestion (msg 11), but it's definitely okay with me that no search engine have access to certain directories on my site. I'm very grateful for your help! (can I please adopt you?)

ErrorDocument 403 /403.htm
RewriteCond %{HTTP_USER_AGENT} <list of bad bots>
RewriteRule !^(403.*\.htm|robots\.txt)$ - [F,L]
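
One thing that may be worth checking in the rules above (a guess, not a confirmed diagnosis): the exemption pattern ends in \.htm$, so it lets a file named 403.htm through but not one named 403.html. If the error page really is saved as 403.html, as the description suggests, the bad-bot rule would also forbid the error page itself, which could produce the "Additionally, a 403 Forbidden" message. Below is a minimal sketch under that assumption. The file name /403.html and the RewriteEngine line are assumptions, and <list of bad bots> is still just a placeholder for the real pattern.

```apache
# Sketch only: assumes the error page is saved as /403.html in the site root.
RewriteEngine On
ErrorDocument 403 /403.html

# <list of bad bots> is a placeholder, not a working pattern.
RewriteCond %{HTTP_USER_AGENT} <list of bad bots>
# Exempt robots.txt and any file that starts with "403" and ends in .htm or .html;
# everything else requested by a listed user agent gets a 403.
RewriteRule !^(403.*\.html?|robots\.txt)$ - [F,L]
```

In an .htaccess file the pattern is matched against the path relative to the directory that holds the file, so the ^403 anchor only exempts an error page sitting in that same directory. The ErrorDocument path and the actual file name also need to agree.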