It appears to disallow all the Missigua hits, but Larbin sails right through.
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.mysite\.net [NC]
RewriteRule ^(.*)$ [mysite.net...] [R=301,L]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^Missigua [NC]
RewriteRule .* - [F]
- - -
Am I missing a wildcard symbol or something like that?
Is 'larbin' case-sensitive? How do I get around that if so?
Are both Larbin and Missigua considered USER AGENTS, or am I specifying the wrong field?
Can somebody do a quick and dirty fix?
I'm weak at this, and afraid to change things without advice.
If you copy the entire short contents back with fixes it will be far easier for me.
Then I can go ahead and add some other 'jewels' to my sh** list. -Larry
I don't know a thing about .htaccess files. I can tell you about user agents.
Larbin does not always identify itself as larbin. In the last three months I've seen such variations as:
larbin_2.6.3
Mozilla/5.0 (larbin@unspecified.mail)
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Larbin/2.6.3
larbin_extended
Hopefully whoever winds up providing you with a fix will know about the variations on larbin and take them into account.
Making the change should block all variations of larbin in a UA.
When you take the start anchor out of "^larbin", the pattern matches anywhere in the UA string, but only a lowercase "larbin" gets a 403. To match "larbin" or "Larbin" anywhere in a UA string, add the NC (No Case) flag at the end of the RewriteCond:
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
OK, let's say I use the two lines
RewriteCond %{HTTP_USER_AGENT} "larbin" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Missigua [NC]
Note that I changed ^larbin to "larbin" in quotes,
does that ban ALL larbins?
regardless of position in long complex string, and even AFTER Mozilla X.X yadda yadda?
Even the larbin@nobodyhome phony email part?
I want every kind of larbin/Larbin out of here.
I also added the NC in [NC,OR] to cover upper and lower case.
Much appreciated! -Larry
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
I'd recommend 403ing them and then only redirect the 'good' user-agents that pass your access control restrictions.
Jim
RewriteEngine On
# disallow weenies
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Missigua [NC]
RewriteRule .* - [F]
# redirect non-www to www.
RewriteCond %{HTTP_HOST} !^www\.mysite\.net [NC]
RewriteRule ^(.*)$ [mysite.net...] [R=301,L]
Note: Removed quote marks around "larbin" (thanks Span)
Re-ordered rewrite rules so larbin and Missigua, case-insensitive, are disallowed BEFORE redirecting to www.
Added # comments for clarity. Is this OK as written?
Are blank lines for easy reading OK also? Or should those have the # character too? -Larry
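On the blank-line question: as far as Apache's configuration syntax goes, blank lines are ignored and any line beginning with # is treated as a comment, so blank lines need no # of their own. The one catch is that a comment has to sit on its own line and cannot trail a directive. A minimal sketch of that layout, reusing the rules above (mysite.net is the placeholder from this thread):

```apache
RewriteEngine On

# A comment on its own line is fine; blank lines are simply skipped.
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Missigua [NC]
RewriteRule .* - [F]

# Not valid: a comment trailing a directive on the same line, e.g.
#   RewriteRule .* - [F]   # disallow weenies
```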
...changing the order of your two rulesets. After all, why waste bandwidth redirecting bad user-agents? - jdMorgan
Anyhow, I went ahead and implemented the new .htaccess.
Soon I will see if I canned the various 'larbin' critters.
Next comes Java/1.4.1_04, which sucked down half my site in seconds.
That's on the boards as an email spammer fishing for addresses or the like.
I'm not ready to ban Jakarta-Commons yet, that supposedly has legitimate uses.
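One possible way to add the Java agent to the existing block, assuming its UA string always starts with "Java/1." as in the Java/1.4.1_04 hit above. This is an untested sketch: the dot is escaped so it matches a literal dot, and the pattern deliberately stops after the major version, which is an assumption that would also catch other 1.x builds. Narrow it if only the exact version should be refused.

```apache
# Assumes the UA begins with "Java/1." (for example Java/1.4.1_04)
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Missigua [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Java/1\. [NC]
RewriteRule .* - [F]
```

Every condition except the last carries OR, so any one match is enough to trigger the 403.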
New question! One highly suspicious agent comes in as 'Microsoft URL Control' or some such.
Just spaces between the words, no dashes or anything like that.
I'm told NOT to put UAs in quotes. Does that mean a ban would look like this?
RewriteCond %{HTTP_USER_AGENT} Microsoft URL Control [NC,OR]
-- or should I go ahead with the quotes like this?
RewriteCond %{HTTP_USER_AGENT} "Microsoft URL Control" [NC,OR]
One guy out there noticed lots of phony hits with legitimate looking UAs except for one thing:
Instead of Mozilla 4.0 compatible; , he was seeing Mozilla 4.0 compatible ;
[ NOTE the added space before the semicolon ; !]
Would THAT be worth banning, or is it too risky?
I don't want to ban legitimate traffic, I want visitors to see my stuff. -Larry
Sorry for all the questions. -Larry
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ URL\ Control [NC,OR]
There is a start anchor ^ because this string always looks like this and you have to escape the spaces with the backwards slashes.
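On the quotes-or-no-quotes question: a space normally separates a directive's arguments, so it has to be either backslash-escaped or enclosed in double quotes around the whole pattern. The two spellings below should be equivalent, and only one of them is needed. The unquoted, unescaped form would most likely be rejected, because Apache would read the extra words as extra arguments. This is a sketch and has not been tested against a live server.

```apache
# Spaces escaped with backslashes
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ URL\ Control [NC,OR]

# Same pattern, quoted instead of escaped
RewriteCond %{HTTP_USER_AGENT} "^Microsoft URL Control" [NC,OR]
```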
You can learn a lot by looking up UA strings in Google. Or browse through Andreas Staeding's database of UAs: [psychedelix.com]
And yes, there are a lot of strange Mozilla UAs out there. The example from your post is definitely not a normal user. But you really should only ban UAs that you've actually seen on your site. Don't just copy a list, or you end up banning UAs that will never visit your site or that no one uses anymore.
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3\.0\(compatible\)$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3\.0\ \(compatible\)$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla\(IE\ Compatible\)$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^mozilla\ 4\.0$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible;\ MSIE\ 4\.00;\)$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible\)$ [NC,OR]
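For the space-before-semicolon case raised earlier, a pattern along these lines might work. It assumes the UA really reads "Mozilla/4.0 (compatible ; ..." with a slash after Mozilla and an opening parenthesis, which the post does not spell out exactly, so check it against your own logs first. Note also that the last RewriteCond before the RewriteRule should not carry [OR]; a dangling OR can make the rule apply to every request.

```apache
# Assumed UA shape: Mozilla/4.0 (compatible ; MSIE ...
# The space before the semicolon is escaped like any other space.
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible\ ; [NC]
RewriteRule .* - [F]
```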
Earlier you wrote:
"Microsoft URL Control is a spambot. Ban it like this:
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ URL\ Control [NC,OR]
There is a start anchor ^ because this string always looks like this and you have to escape the spaces with the backwards slashes."
- - -
Now Span, I have another ugly critter.
This one comes into my logs as 'forum.XYZ.nl'
XYZ (exemplified) hotlinks several images. Their particular forum page
is a million lines long, mostly scraped stuff from my and other sites.
I'm sure G and Y have sent them to 'supplemental' purgatory long ago,
but they have so many visitors it's driving me nuts.
Here's the issue: I tried the following line to ban the whole site:
RewriteCond %{HTTP_USER_AGENT} forum.XYZ.nl [NC,OR] # 13DEC05
.. and it didn't work. The Dutchmen come bombing right through.
So now, I am wondering about your mention of the \ (escape) character.
Is it the dots (.) which are screwing this up?
The XYZ part is sufficiently unlikely, something like 'FOK'
but I prefer to ban very selectively.
In short, do I need to 'escape' DOTS just like you said I should do with spaces?
Drunken and Dizzy in Redwood City aka -Larry
# 13DEC05
RewriteCond %{HTTP_REFERER} forum\.XYZ\.nl [NC,OR]
Jim
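Put together with a rule, the referer block might look like the sketch below. Escaping the dots makes them match literal dots, but the bigger change is the variable: a site name showing up in the logs is normally the referrer, not the user agent. The image extensions are an assumption based on the hotlinked images mentioned above, and XYZ is still the placeholder used in this thread.

```apache
# 13DEC05
# Refuse image requests referred by the forum (XYZ is a placeholder).
# This is the last condition before the rule, so it carries no OR flag.
RewriteCond %{HTTP_REFERER} forum\.XYZ\.nl [NC]
RewriteRule \.(gif|jpe?g|png)$ - [NC,F]
```

The date comment sits on its own line because a comment cannot trail a directive.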