Forum Moderators: phranque

Blocking Email Harvesters

         

keyplyr

6:13 am on Jul 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




I was successfully using this to block some of the email harvesters:

RewriteCond %{HTTP_USER_AGENT} E-?mail [NC,OR]

However, today I saw that it also blocks this user, whose machine apparently has some kind of anti-spam software installed:

***.**.***.*** - - [16/Jul/2004:05:58:35 -0700] "GET / HTTP/1.1" 403 198 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; EmailProtect; .NET CLR 1.1.4322)"

Is there any way to allow that UA while still maintaining a generic email block? Thanks.

jdMorgan

1:35 pm on Jul 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



... works for me:

# Block e-mail harvesters but not emailprotect
RewriteCond %{HTTP_USER_AGENT} E-?mail [NC]
RewriteCond %{HTTP_USER_AGENT} !EmailProtect
RewriteRule .* - [F]
#
# BLOCK various exploit attempts
RewriteRule (form|mail|contact|feedback|sender|tell).*\.(cgi|pl|php)$ - [NC,F]
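
If the E-?mail test is one entry in a longer [OR] chain, as the [NC,OR] flags in the original post suggest, the exception can go last, without an [OR] flag. A minimal sketch, relying on mod_rewrite evaluating an [OR] group before the implicit AND that follows it; ExampleHarvester and SampleCollector are made-up names standing in for the rest of an existing list:

```apache
RewriteEngine On
# Hypothetical entries standing in for the rest of an existing block list
RewriteCond %{HTTP_USER_AGENT} ExampleHarvester [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SampleCollector [NC,OR]
# Last entry of the OR group, so no OR flag here
RewriteCond %{HTTP_USER_AGENT} E-?mail [NC]
# Implicit AND: this exception applies to the whole OR group above
RewriteCond %{HTTP_USER_AGENT} !EmailProtect
RewriteRule .* - [F]
```

One side effect of this layout: any user-agent containing "EmailProtect" is let through even if it also matches one of the other entries. If that is too loose, keep the E-?mail test and its exception in a separate rule, as in the two-condition version above.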

Jim

pmkpmk

1:55 pm on Jul 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In my humble opinion, it's a race you cannot win. I gave up on it some time ago...

See first post on this [webmasterworld.com...] page.

jdMorgan

2:23 pm on Jul 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



On the other hand, some Webmasters want to try, and we try to answer their questions here...

Interestingly, I've found that blocking the worst of the bad user-agents and harvester requests results in a greatly reduced number of attempts over time.

As one recent example, I was working with a new domain early this year. Apparently, the IP address assigned to this new domain had been used before, and might have belonged to a "wide-open" unsecured server. As soon as I had access to the account, I noticed thousands of exploit attempts per day: harvesting attempts, proxy-forwarding requests, form-mail attempts, the works. So I put up my "standard" .htaccess file for this kind of account, and over the next three months, as those exploit attempts failed due to 403 responses, the number of attempts dropped to zero. By that time, I had some content ready to put up, so the site started off with nice, clean log files.

This was the most dramatic case I've ever seen, but it helps to confirm my suspicion that there are lists of unprotected sites that are shared by troublemakers, and you don't want to be on those lists.

I don't think it's worthwhile to try to ban every single strange user-agent you find in your logs -- you won't have time to do anything else if you follow that path. But I do believe it is worthwhile to block the worst of the worst -- the ones that actually affect your site by frequently using up bandwidth and polluting your log files. I also consider it my responsibility to make sure that any server I control cannot be used as an open proxy. Some others do more and some do less, and many have very good reasons for their choices. So, put me right in the middle of the road on this issue... :)
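
For the proxy-forwarding requests mentioned above, a rough sketch of one common .htaccess approach, not necessarily the rule set described in this post. It assumes that legitimate visitors never put a full URL in the request line of requests to this site:

```apache
RewriteEngine On
# THE_REQUEST holds the raw request line, e.g. "GET http://example.com/ HTTP/1.1"
# Refuse request lines that carry a full URL instead of a local path
RewriteCond %{THE_REQUEST} ^[A-Z]+\ https?:// [NC]
RewriteRule .* - [F]
```

Where the main server configuration is accessible, leaving mod_proxy's ProxyRequests directive at Off (its default) is what actually keeps the server from forwarding such requests; the rule above only turns them into 403 responses.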

Jim

pmkpmk

2:39 pm on Jul 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don't get me wrong, jdMorgan, in no way am I trying to "sell" my way as the only one possible. My point is that trying to sort ALL of the good guys from the bad guys takes too much time, which would be better spent on other things like getting backlinks or working on the content.

"I don't think it's worthwhile to try to ban every single strange user-agent you find in your logs -- You won't have time to do anything else if you follow that path. But I do believe it is worthwhile to block the worst of the worst."

I absolutely agree! I didn't say my htaccess file is empty! Only yesterday I blocked a notorious log-spammer from the Czech Republic via htaccess. But the thread I'm referring to claims to have a "close to perfect" solution - and getting that close is very time-consuming.

And when it comes to blocking email harvesters, there are IMHO better ways to do so than htaccess. At least for me.