---- How to ban (compatible ; type requests


wilderness - 3:07 pm on Jun 24, 2006 (gmt 0)


* Are you sure about the full quotes around the "compatible ;" user agent?
* Will this ban ANY request that includes that?
* How about the space? Isn't an escape character needed, as in compatible\ ;?
* A very similar exclusion has the caret ^ symbol first, as in
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ URL\ Control [NC,OR]

Does the caret indicate 'INCLUDES', and wouldn't I need that as well?

Larry, the line works exactly as provided. I've been using it for some years. There are multiple examples of this in the Perfect Htaccess threads.

The caret (^) means BEGINS WITH [type it without the parentheses.]
When using begins with, ONLY the unique leading characters are necessary. The full UA is a waste.

The dollar ($) means ENDS WITH [type it without the parentheses.]
When using ends with, ONLY the unique trailing characters are necessary. The full UA is a waste.

NO leading caret or trailing dollar means CONTAINS [any location within the UA]
When using contains, ONLY the unique keyword characters are necessary. The full UA is a waste.
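
A minimal sketch of the three cases with SetEnvIf. The UA fragments (Wget, Spider, larbin) and the variable name keep_out are only illustrative assumptions, and the Order/Allow/Deny lines assume the Apache 1.3/2.0/2.2-style access directives (Apache 2.4 needs mod_access_compat for them, or Require instead):

```apache
# BEGINS WITH: the UA must start with "Wget"
SetEnvIf User-Agent ^Wget keep_out

# ENDS WITH: the UA must end with "Spider"
SetEnvIf User-Agent Spider$ keep_out

# CONTAINS: "larbin" anywhere within the UA
SetEnvIf User-Agent larbin keep_out

# Refuse any request for which one of the lines above set keep_out
Order Allow,Deny
Allow from all
Deny from env=keep_out
```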

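Surrounding a string in quotes is somewhat like html <pre></pre> in that the spaces in the string are kept EXACTLY as typed, so no escape character is needed for them. (Regex special characters inside the quotes still keep their special meaning.)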
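
As a sketch, and assuming the line being asked about looked roughly like this (keep_out is a made-up variable name), the quotes are what let the space sit in the pattern without a backslash:

```apache
# No ^ or $, so this is a CONTAINS match on: compatible<space>;
SetEnvIf User-Agent "compatible ;" keep_out
```

Without the quotes, SetEnvIf would read compatible and ; as two separate arguments.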

These four options make the entire procedure quite simple.
One may handle most UAs with these options using SetEnvIf or RewriteCond (whichever you prefer, or any combination of the two.)
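
For comparison, a rough RewriteCond version of the same kind of ban, reusing the Microsoft URL Control line quoted above. This assumes mod_rewrite is available and that a plain 403 Forbidden via the [F] flag is the wanted result:

```apache
RewriteEngine On
# CONTAINS, quoted so the space needs no escape
RewriteCond %{HTTP_USER_AGENT} "compatible ;" [OR]
# BEGINS WITH, unquoted so each space is escaped; NC ignores case
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ URL\ Control [NC]
# Any request matching either condition is refused
RewriteRule .* - [F]
```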

Jim and many others are able to provide complicated strings and expressions to convert unknown phrases and/or characters in a string.
I do NOT use any of these types of strings, nor do I understand them. And yet I have no difficulty in creating unique lines for the UAs that keep appearing.


As far as the escape (\) character?
In UAs (and SetEnvIf) I have likely used it in fewer than a handful of lines (out of over 400 lines of UAs). (I suppose eventually I'll need to condense these 400+ lines of UAs and move them to Rewrite using the OR pipe character; however, I've grown accustomed to looking at them alphabetically.)
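
A sketch of what that condensing might look like, with three illustrative UA fragments standing in for the real list:

```apache
RewriteEngine On
# One condition replaces three one-UA-per-line entries; | means OR,
# and the leading ^ applies to every alternative inside the parentheses
RewriteCond %{HTTP_USER_AGENT} ^(larbin|Wget|Microsoft\ URL\ Control) [NC]
RewriteRule .* - [F]
```

The trade-off is the one mentioned above: fewer lines, but a long alternation is harder to scan alphabetically than one UA per line.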

Using the escape character for RewriteCond IP ranges is an entirely different issue: it must be used before the period that separates every CLASS of an IP range.
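
For example, a sketch using reserved documentation addresses (192.0.2.x and 198.51.100.x) as placeholders rather than real offenders:

```apache
RewriteEngine On
# An unescaped period matches ANY character; \. matches only a literal period
# Everything in 192.0.2.*
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\. [OR]
# Only 198.51.100.10 through 198.51.100.25
RewriteCond %{REMOTE_ADDR} ^198\.51\.100\.(1[0-9]|2[0-5])$
RewriteRule .* - [F]
```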


Thread source: http://www.webmasterworld.com/search_engine_spiders/3309.htm