Using starting anchors with banned UAs?

keyplyr

11:06 am on Feb 4, 2004 (gmt 0)

I keep finding that UAs I've banned are getting through because the UA name was somewhere other than the far left of the string.

RewriteCond %{HTTP_USER_AGENT} ^Downloader [NC,OR]

I remove the starting anchor (^) and they are blocked, at least in my tests.

RewriteCond %{HTTP_USER_AGENT} Downloader [NC,OR]

So I'm wondering why I see all the "how to ban" examples with these anchors? What's the advantage of using the anchor? What would happen if I remove them all? Thanks.

jdMorgan

4:30 pm on Feb 4, 2004 (gmt 0)

keyplyr,

The problem is that many people see those start anchors on some UAs and assume that they mean something other than what they really mean. So they copy them down the whole list for the sake of "neatness" or something, and never read the nice regular-expressions tutorial [etext.lib.virginia.edu] cited in many of our threads.

Start anchors (^) should only be used when you are sure that the UA you wish to match starts with the pattern you provide. Otherwise, they should be omitted. Similarly, the end anchor ($) should only be used when the string to be matched ends with the specified pattern. If both anchors are used, an exact match is required.
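
To make the anchor rules concrete, here is a rough sketch of the four cases side by side. "ExampleBot", "Harvest" and "Fetcher/1.0" are made-up names used only for illustration, not real user-agents from any ban list, and the closing RewriteRule is just one common way to return a 403; adapt it to your own setup.

```apache
RewriteEngine On
# NOTE: ExampleBot, Harvest and Fetcher/1.0 are hypothetical UA names.
# Start anchor: matches only if the UA string begins with "ExampleBot".
RewriteCond %{HTTP_USER_AGENT} ^ExampleBot [NC,OR]
# No anchor: matches "Downloader" anywhere in the UA string,
# e.g. after a leading "Mozilla/4.0 (compatible; ".
RewriteCond %{HTTP_USER_AGENT} Downloader [NC,OR]
# End anchor: matches only if the UA string ends with "Harvest".
RewriteCond %{HTTP_USER_AGENT} Harvest$ [NC,OR]
# Both anchors: the whole UA string must be exactly "Fetcher/1.0"
# (the dot is escaped so that it matches a literal dot).
RewriteCond %{HTTP_USER_AGENT} ^Fetcher/1\.0$ [NC]
# The last condition carries no OR flag; any match above is refused.
RewriteRule .* - [F]
```

With these conditions, a UA of "Mozilla/4.0 (compatible; ExampleBot)" would not be caught by the first line, because "ExampleBot" is not at the start of the string. That is the same situation as the ^Downloader pattern in the question.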

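The advantage of anchoring patterns is speed; an anchored pattern is *much* faster to process than an unanchored pattern, because a start-anchored pattern only has to be tried at the beginning of the string, while an unanchored pattern may have to be tried at every character position before it fails. Using many unanchored patterns can cause a rather huge performance hit.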

A good way to screen user-agent patterns for anchoring is to compare them against actual user-agent strings in your raw access logs, or even against the raw access logs of other sites which publish them (accidentally or otherwise). "Stats" reports are useless for this activity, because the stats programs commonly strip off non-unique UA headers such as "Mozilla/n.n (compatible;" -- thus making them worthless for research purposes.

In the Close to perfect .htaccess ban list - Part 3 [webmasterworld.com] thread, claus posted a warning about cutting and pasting .htaccess code posted here and assuming that it is correct *and* that it fits your situation. Unfortunately, this anchoring problem is a good example of how it can be wrong.

Jim