Forum Moderators: phranque


A Close to perfect .htaccess ban list - Part 2

         

adriaant

11:46 pm on May 14, 2003 (gmt 0)

10+ Year Member



<modnote>
continued from [webmasterworld.com...]



UGH, bad typo in my original post. Here's the better version (I wasn't able to re-edit the older post?):

I'm trying to ban sites by domain name, since there are recently lots of reference spammers.

I have, for example, the rule:

RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*stuff.*\.com/.*$ [NC]
RewriteRule ^.*$ - [F,L]

which should ban any sites containing the word "stuff"
www.stuff.com
www.whatkindofstuff.com
www.some-other-stuff.com

and so on.

However, it is not working, so I am sure I did not set up a proper pattern-match rule. Anyone care to advise?

[edited by: jatar_k at 5:06 am (utc) on May 20, 2003]

Wizcrafts

2:44 am on May 20, 2003 (gmt 0)

10+ Year Member



I have been following this forum from page one through 17, and am using the tips in it to block many bad user-agents in my .htaccess file. However, I just ran into a user-agent that shows the following name: "-" (a hyphen) in the raw web logs, but "<undefined>" in a hitcounter log for my trap page. I inserted <undefined> into my .htaccess filter, but a harvester slipped by it. What rule should I use in my RewriteCond rules to block the undefined ("-") agent? Here is what I am thinking of adding, but I would like someone to confirm its correctness or otherwise:
RewriteCond %{HTTP_USER_AGENT} ^\-$ [OR]
TIA
Wiz

Tamsy

4:32 am on May 20, 2003 (gmt 0)

10+ Year Member



Hi WIZ

I am using

RewriteCond %{HTTP_REFERER} ^-?$ [NC]
RewriteCond %{HTTP_USER_AGENT} ^-?$ [NC]
RewriteRule .* - [F,L]

in combination to block only cases where the UA AND Referer are empty "-" to avoid blocking innocent visitors (i.e. who are using an old version of Norton Internet Security which hides the UA).

Works well for me :-)

Wizcrafts

4:56 am on May 20, 2003 (gmt 0)

10+ Year Member



Tamsy;

Thanx for the reply. I first used Sam Spade to do a lookup on the IP in question, to be sure it resided somewhere that would not usually have business with me. It is based in Hong Kong, at this IP: 203.194.146.175, which is listed in several blacklists. The agent only indexed my entry page and my guestbook. I have seen a couple of similar entries in the past, with an <undefined> user-agent ID, so I will be adding your regexp for a referrer and user-agent of ^-$.

jdMorgan

5:20 am on May 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



adriaant,

What you have should work. Cleaning it up and adding the "support" stuff to the beginning:


Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://(www\.)?.*stuff.*\.com [NC]
RewriteRule .* - [F]

If you already have the Options and RewriteEngine on directives in your file, then how is it not working? Not blocking the visits? Server errors? etc.

Jim

adriaant

5:32 am on May 20, 2003 (gmt 0)

10+ Year Member



jdMorgan, yes, I have both Options and RewriteEngine at the top of the .htaccess file.

The reason I thought it was not working is that I had a hard time blocking one nasty spammer. The spammer always produces the following kind of access_log entries:

216.169.111.198 - - [18/May/2003:07:51:20 -0400] "GET / HTTP/1.1" 403 210 "http://www.some-bad-word-and-more.com\r" "http://www.www.some-bad-word-and-more.com\r"

I blocked first by ip:

RewriteCond %{REMOTE_ADDR} ^216\.169\.111\.
RewriteRule ^.*$ - [F,L]

but it still let them in (?), so I tried blocking by word as in my previous post.

Yesterday, I traced the IP address and found its hosting company. I complained to them and they said they would do something about it. I have yet to see.

Btw, would this also work to block specific URLs:

RewriteCond %{HTTP_REFERER} stuff [NC]
RewriteRule .* - [F,L]

jdMorgan

5:51 am on May 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



adriaant,

Yes, that would work, but [L] is redundant when used with [F].


RewriteCond %{HTTP_REFERER} stuff [NC]
RewriteRule .* - [F]

In this case, there is not much difference, but do try where possible to use anchored patterns; they can be tested much faster than unanchored patterns.

Notice that I took the "/" off the end of your "stuff.*\.com" above. The reason for this is that there might be a port number appended to the domain, in which case the character following ".com" won't be a "/", and this may be why your rule did not stop him.
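For example, a pattern that anchors the match and also tolerates an optional port number might look like this (a sketch only; the [^/] classes and the (:[0-9]+)? group are illustrative, not something adriaant posted):

```apache
# match referrers like http://www.whatkindofstuff.com/page
# or http://stuff.com:8080/ - "(/|$)" allows a bare domain too
RewriteCond %{HTTP_REFERER} ^http://(www\.)?[^/]*stuff[^/]*\.com(:[0-9]+)?(/|$) [NC]
RewriteRule .* - [F]
```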

Jim

adriaant

6:09 am on May 20, 2003 (gmt 0)

10+ Year Member



Jim, thanks for the quick replies, very much appreciated!

I modified the .htaccess file according to your directions. I was surprised, though, that the [F,L] part should be written as [F], since I see the former in so many .htaccess file samples (such as Mark Pilgrim's one here [diveintomark.org])

Adriaan

[edited by: Woz at 11:36 am (utc) on May 20, 2003]
[edit reason] shortened URL [/edit]

Wizcrafts

6:13 am on May 20, 2003 (gmt 0)

10+ Year Member



JD;
Why is the L (last) flag redundant here? Please clarify for us, as we have seen it used so many times.

Tamsy

11:16 am on May 20, 2003 (gmt 0)

10+ Year Member



Hi WIZ

I generally disallow search-engine queries that point directly at the guestbook on our server (who searches for guestbooks via a search engine?). To accomplish this I use the following rule:

Snip
RewriteCond %{HTTP_REFERER} q=guestbook [NC,OR]
Snip

Maybe this helps further ;-)
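To show how such a snipped fragment might sit in a complete ruleset: the [OR] flag lets several referrer tests share one blocking rule. A sketch (the second condition is a hypothetical addition, included only to show why the [OR] is there):

```apache
RewriteEngine on
# block visits whose referrer is a search-result page that queried for "guestbook"
RewriteCond %{HTTP_REFERER} q=guestbook [NC,OR]
RewriteCond %{HTTP_REFERER} query=guestbook [NC]
RewriteRule .* - [F]
```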

This 122-message thread spans 13 pages.