How to ban (compatible ; type requests


wilderness - 9:45 pm on Jun 23, 2006 (gmt 0)


I like the simplicity of:
SetEnvIf User-Agent "compatible ;" keep_out

Where EXACTLY, and how do I put that line? -Larry

Larry,
Including the UA phrase in quotes means it is matched EXACTLY as written, space included.

As for the line itself, what I provided is somewhat incomplete.

I use both SetEnvIf (I can never recall the module name, even though I used it before I ever began with Rewrite conditions) and Rewrite conditions in my htaccess.

Even the examples that I provided three years ago:
[webmasterworld.com...]
do NOT provide the complete and necessary lines for operation.

You may use the condition provided today in your regular Rewrites.
EX: add the line
RewriteCond %{HTTP_USER_AGENT} "compatible ;" [NC,OR]
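
For anyone who does not already have a Rewrite block, here is a minimal sketch of where that line could sit. The second condition (ExampleBadBot) is a made-up placeholder, included only so the [OR] has something to chain to, and the rule itself is an assumption about what your existing Rewrites look like, not a copy of mine.

```apache
RewriteEngine On
# match any User-Agent containing "compatible ;" (note the space before the semicolon)
RewriteCond %{HTTP_USER_AGENT} "compatible ;" [NC,OR]
# hypothetical second pattern; replace it with whatever you already block
RewriteCond %{HTTP_USER_AGENT} ^ExampleBadBot [NC]
# no substitution (-), just answer 403 Forbidden
RewriteRule .* - [F]
```

The last RewriteCond before the RewriteRule must not carry [OR], so if "compatible ;" is your only condition, drop the OR flag and keep [NC].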

or you may add the following lines to your htaccess:
(When using SetEnvIf or deny from, the denied visitors cannot access robots.txt either, although I seem to recall somebody writing an exception that allows them to read it.)

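Options -Indexes
<Limit GET>
# flag any request whose User-Agent contains "compatible ;"
SetEnvIf User-Agent "compatible ;" keep_out
order allow,deny
# placeholder IP range and address; substitute your own
deny from xxx.xx.xxx.
deny from xx.xxx.xx.xxx
allow from all
# deny every request flagged by SetEnvIf above
deny from env=keep_out
</Limit>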
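
On the robots.txt point, one possible exception (a sketch only, and not necessarily the one I remember seeing) is to clear the flag again when the request is for robots.txt. It relies on SetEnvIf accepting !name to unset a variable, and on the SetEnvIf lines being processed in the order they appear. The path assumes robots.txt sits in the site root.

```apache
# flag the unwanted agents as before
SetEnvIf User-Agent "compatible ;" keep_out
# unset the flag for robots.txt so flagged agents can still read it
SetEnvIf Request_URI "^/robots\.txt$" !keep_out
```

This only lifts the env=keep_out denial. Visitors caught by the plain deny from IP lines are still refused everything, robots.txt included.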

Some additional notes!
The opening and closing lines of

Options -Indexes (and other requirements)
<Limit GET> (with other options possible beyond GET)
</Limit>

vary from host to host.
It may require some tinkering.
I suggest testing during non-peak hours until you see that it works properly.

Also, you may use any name you desire in place of "keep_out".
The KEY is that you MUST use the identical name in every SetEnvIf statement as WELL as in your closing deny from env= line.
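
As a sketch of that naming rule, here is the same setup with a different name, bad_agent, chosen purely for illustration. The second User-Agent pattern (ExampleBadBot) is likewise a made-up placeholder.

```apache
# both patterns set the same variable name
SetEnvIf User-Agent "compatible ;" bad_agent
SetEnvIf User-Agent "^ExampleBadBot" bad_agent
order allow,deny
allow from all
# the name here must match the SetEnvIf lines exactly
deny from env=bad_agent
```

If the names differ by even one character, the deny line matches nothing and the flagged agents get through.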

Don


Thread source: http://www.webmasterworld.com/search_engine_spiders/3309.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com