NuSearch Spider (compatible; MSIE 6.0)
It's either NuSearch or MSIE - not both!
So I want to deny user-agents that include both words, as a penalty. I'm 100% sure I want to do this, but I only have a rough idea of how to go about it...
This is just an uneducated guess...
RewriteCond %{HTTP_USER_AGENT} NuSearch & MSIE
RewriteRule .* - [F,L]
Is the ampersand the AND operator?
John
NuSearch Spider (compatible; MSIE 6.0)
It's either NuSearch or MSIE - not both!
It doesn't claim to be both. It states (in an approximation of the formal language of User-agent strings, as originally defined by Netscape [mozilla.org]) that its name is NuSearch, and that it is compatible with MSIE 6.0, meaning it can handle any markup that MSIE 6.0 can handle.
I think you'll find that the new Googlebot is Mozilla-compatible, so consider this before banning these 'split personality' User-agents based on "compatible."
This particular format is questionable, since the User-agent's own name is supposed to follow the name of the thing it's claiming to be compatible with, but there are plenty of legitimate spiders that mix up this syntax (their authors should click on the link above).
I'm not saying you shouldn't block this UA, but be very careful of generalizing to block anything using this format.
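To make the risk concrete, here is a sketch of the kind of over-general rule I mean, left commented out so nobody pastes it in by accident. The two example User-agent strings in the comments are typical of what MSIE 6 and Googlebot send; treat them as illustrations and check your own logs before relying on them.

```apache
# TOO BROAD - do not use. Matching on "compatible" alone would also refuse
# ordinary visitors and well-behaved spiders, for example:
#   Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
#   Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
# RewriteCond %{HTTP_USER_AGENT} compatible
# RewriteRule .* - [F]
```

Anchoring the pattern on the specific name you object to keeps the block from spreading to everything that merely uses the same format.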
To answer your coding question, "&" is not interpreted as an AND function in mod_rewrite, since it's often used as a delimiter between query string name/value pairs; in a pattern it is just a literal character. To perform the AND function, use two RewriteConds, which mod_rewrite ANDs together by default:
RewriteCond %{HTTP_USER_AGENT} ^NuSearch\ Spider
RewriteCond %{HTTP_USER_AGENT} MSIE
RewriteRule .* - [F]
If the sub-strings are always in the same order, then you can just code that into the pattern itself:
RewriteCond %{HTTP_USER_AGENT} ^NuSearch\ Spider.+MSIE
RewriteRule .* - [F]
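As a side note, and only as a sketch: since consecutive RewriteCond lines are ANDed by default, the [OR] flag is what you add when either condition alone should trigger the rule. The [NC] (no-case) flag below is my own addition, to tolerate variations in capitalization, and "SomeOtherBot" is a made-up name used purely for illustration. This assumes RewriteEngine on is already set earlier in the same file.

```apache
# AND (the default): both conditions must match before the rule fires
RewriteCond %{HTTP_USER_AGENT} NuSearch [NC]
RewriteCond %{HTTP_USER_AGENT} MSIE [NC]
RewriteRule .* - [F]

# OR: either condition is enough ("SomeOtherBot" is hypothetical)
RewriteCond %{HTTP_USER_AGENT} ^NuSearch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SomeOtherBot [NC]
RewriteRule .* - [F]
```

The [F] flag returns 403 Forbidden and stops rule processing on its own, so adding [L] alongside it is harmless but not required.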
RewriteCond %{HTTP_USER_AGENT} ^NuSearch
RewriteRule .* - [F]
The spider does not annoy me, it's the useragent that annoys me!
I don't care what is compatible with what when I'm looking at my statistics and have to wonder which ...misguided user-agents are tainting my browser and spider figures. Sure, I can create filters, but I'll have to keep doing that by hand until some sort of standard is established for user-agents.
John