Blocking traffic from a specific site

Need help

         

icedowl

2:52 am on Oct 13, 2009 (gmt 0)

I have been getting some unwanted traffic. I have come up with the following, where "example.com" is the source of this unwanted traffic. Instead of issuing a 403 error code, the code below issues a 500 error. I've been all through the Apache documentation for version 2.2 and still don't see where I'm going wrong.

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^http://www.example.com.*$ [NC]
RewriteRule .* - [F,L]

wilderness

3:31 am on Oct 13, 2009 (gmt 0)

This should work.
Please keep in mind that referrer-based denials are not foolproof.

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^http://www\.example\.com/ [NC]
RewriteRule .* - [F,L]

jdMorgan

3:47 am on Oct 13, 2009 (gmt 0)

You'll likely need to exclude the URL-path to your custom 403 error page, using either a RewriteCond or a negative-match pattern in the RewriteRule itself. Otherwise, you'll get an 'infinite loop' as the server tries to serve your 403 error page, and discovers that the referrer is still the unwelcome 'example.com'... repeatedly, with the result being a 500-Server Error when the server reaches its maximum internal redirection limit.

RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://(www\.)?example\.com/ [NC]
RewriteRule !^path-to-custom-403-page\.html - [F]

Note that this simple fix will only work completely if your custom 403 page contains/includes no external objects -- no images, no CSS, no external JS. If your 403 page does include those things, then you'll either need to exclude their URLs from the access-control rule as well, remove those objects from the page to eliminate the dependency, or replicate them into a directory that you're willing to allow unconditional access to during 403 error-handling.
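If the 403 page does pull in external objects, the RewriteCond form of the exclusion might look something like the sketch below. The "/403-assets/" directory is a hypothetical name for wherever the page's images, CSS, and JS live; it does not come from this thread, so substitute your own paths.

```apache
RewriteEngine on
# Block requests referred by the unwanted site...
RewriteCond %{HTTP_REFERER} ^http://(www\.)?example\.com/ [NC]
# ...except for the custom 403 page itself...
RewriteCond %{REQUEST_URI} !^/path-to-custom-403-page\.html$
# ...and except for the objects that page includes
# (hypothetical directory name, adjust to your site)
RewriteCond %{REQUEST_URI} !^/403-assets/
RewriteRule .* - [F]
```

Consecutive RewriteCond lines are ANDed by default, so the rule only fires when the referrer matches and the request is for neither the error page nor its assets.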

Also note that [L] used with [F] is redundant, so I removed the [L].

[added] Another alternative is to *not* use a custom error page for 403 errors, but instead use the server's default error message by removing any/all "ErrorDocument 403" declaration directives from your server config and .htaccess files. [/added]
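As a rough illustration of that alternative: if an inherited "ErrorDocument 403" declaration cannot easily be removed (for example, it is set in a config file you do not control), it should be possible to override it from .htaccess. This assumes your host permits ErrorDocument in .htaccess (AllowOverride FileInfo).

```apache
# Revert to the server's built-in 403 message, so serving the error
# does not trigger an internal redirect to a custom page
ErrorDocument 403 default

# Alternatively, a short inline text message also avoids the redirect:
# ErrorDocument 403 "Access forbidden."
```

Either form keeps the error response from being re-examined by the referrer rule.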

Jim

icedowl

3:55 am on Oct 13, 2009 (gmt 0)

Thank you both. I currently do not have a custom 403 page and I would rather use the server's default page for this. I'll make the changes and see how that goes.

icedowl

5:23 am on Oct 13, 2009 (gmt 0)

Well, I'm at a loss. It's still giving the 500 error code. Maybe I should just give up on this, at least for tonight. I've commented out the code.

jdMorgan

12:55 pm on Oct 13, 2009 (gmt 0)

When you get a 500-Server Error, check the server error log file. It often gives very specific information, or at least a very strong hint, as to what the problem is.

It may be as simple as needing to add:


Options +FollowSymLinks

ahead of the code above.

However, if that's the case, then that means you likely don't have any other rules already in place, and that in turn implies that you likely have several domain- and URL-canonicalization 'holes' in your site, making it prone to duplicate-content problems...

Jim

g1smd

12:56 pm on Oct 13, 2009 (gmt 0)

What was your final code?

We can't debug what we can't see.

icedowl

3:58 pm on Oct 13, 2009 (gmt 0)

Here is the final code as I left it last night.

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^http://(www\.)?example\.com/ [NC]
RewriteRule .* - [F]

There are other rules in place but I can try adding Options +FollowSymLinks to this section. The site doesn't have any duplicate content problems although I've "been there ~ done that" a couple of years ago and know full well what a pain that is.

I may not get back to this for a few hours. I just got home from work and have this unrelenting need to sleep. Thanks again.

jdMorgan

4:06 pm on Oct 13, 2009 (gmt 0)

If you already have other working rules in the file, then you won't need either the "Options +FollowSymLinks" or a duplicate/redundant "RewriteEngine on" directive in this "section."

Do take a look at your server error log file. Doing so may turn a three-day thread into a 30-minute thread.

Jim

g1smd

4:35 pm on Oct 13, 2009 (gmt 0)

> The site doesn't have any duplicate content problems.

No?
- www and non-www
- index name vs. /
- appended port numbers
- etc

All fixed?

icedowl

9:00 pm on Oct 13, 2009 (gmt 0)

> Do take a look at your server error log file.

I'd love to but there seems to be a problem with the log files. I've just submitted a ticket to my hosting company and they are already working on it. I've got a feeling that the files from yesterday are lost forever.

> - www and non-www
> - index name vs. /
> - appended port numbers

All fixed. 301 from non-www to www is in place and is working. 301 from index.html to / is in place and is working. No appended port numbers exist that I'm aware of.

icedowl

10:42 pm on Oct 13, 2009 (gmt 0)

I've decided to just drop this project for now. It may have been just a one- or two-day flood of visitors from the useless site, since I haven't seen any of them today.

By the way, the log files were fine. I just need to find a better product to open and read them with. The '500' errors weren't showing in the error logs at all. I thought for sure that they would show there at the same time I could see them in the latest visitors report. I need another 500 error to occur so I know for sure.

jdMorgan

1:51 pm on Oct 14, 2009 (gmt 0)

> No appended port numbers exist that I'm aware of.

Until I, as a malicious competitor, go out and get lots of links for you that include them... Try typing http://www.google.com:443/ into your address bar, for example.
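For what it's worth, one common way to close the appended-port hole is a hostname canonicalization rule along these lines. This is only a sketch: "www.yoursite.example" is a placeholder for your own canonical hostname, and it assumes the code sits in the .htaccess file in your document root.

```apache
# Placeholder hostname: replace www.yoursite.example with your own.
# Redirect any request whose Host header is not exactly the canonical
# name (non-www, appended :80 or other port, trailing dot, etc.)
RewriteCond %{HTTP_HOST} !^www\.yoursite\.example$ [NC]
RewriteRule (.*) http://www.yoursite.example/$1 [R=301,L]
```

Because %{HTTP_HOST} carries whatever port the client put in its Host header, a request for "www.yoursite.example:80" fails the exact-match test and gets a 301 to the port-free URL.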

> I've decided to just drop this project for now.

:( :( :(

> It may have been just a one or two day flood of visitors from the useless site since I haven't seen any of them today.

The same code technique can be used to deal with log spammers, and the fact that you're having any trouble at all with it is in itself a good reason to continue debugging. It would be good to find out why something so trivial doesn't work, as it may point to a deeper (i.e. more important) problem.

> I've got a feeling that the files from yesterday are lost forever.
> I need to have another 500 error to occur so I know for sure.

Neither of these is much of a problem... Simply put the 'bad code' back on the server and test again.

If the idea of intentionally putting known-bad code on your server tends to give you the heebie-jeebies, you can always prefix any 'suspect' rule you want to test with a RewriteCond that only allows the rule to run if the request comes from you (i.e. from your own workstation's IP address or a subnet that includes, say, 256 addresses):


RewriteCond %{REMOTE_ADDR} =123.45.67.89
-or-
RewriteCond %{REMOTE_ADDR} ^123\.45\.67\.
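Put together with the rule under test in this thread, the guarded version might look like the following sketch. The 123.45.67.x range is a placeholder, not a real address; use your own workstation's IP or subnet.

```apache
RewriteEngine on
# Only run the suspect rule for requests from the tester's subnet
# (placeholder range, substitute your own)
RewriteCond %{REMOTE_ADDR} ^123\.45\.67\.
# The rule being debugged: forbid requests referred by example.com
RewriteCond %{HTTP_REFERER} ^http://(www\.)?example\.com/ [NC]
RewriteRule .* - [F]
```

To trigger it, the test request has to carry the unwanted referrer; curl's -e option (e.g. curl -I -e http://www.example.com/ followed by your own URL) is one way to send it from your own machine.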

If you have several rules in a contiguous block that are suspect, but cannot test them one-at-a-time using the above method because you know the problem is caused by their interaction, then you can always add a 'skip rule' ahead of them, and skip over all of them unless the request comes from your own workstation. See mod_rewrite [S=nnn] flag.
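A minimal sketch of such a skip rule, assuming a block of two suspect rules. The second rule and its page names are invented purely for illustration, and the IP address is again a placeholder.

```apache
# If the request is NOT from the tester's address, skip the next 2 rules
RewriteCond %{REMOTE_ADDR} !=123.45.67.89
RewriteRule .* - [S=2]

# Suspect rule 1: the referrer block from this thread
RewriteCond %{HTTP_REFERER} ^http://(www\.)?example\.com/ [NC]
RewriteRule .* - [F]

# Suspect rule 2: hypothetical redirect, here only to show the count
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
```

The number in [S=2] counts RewriteRule directives, not lines, so any RewriteCond lines attached to the skipped rules are skipped along with them. Remember to update the count if you add or remove a rule inside the block.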

Note that adding these safeguards protects only against rule execution errors. If there is a syntax error in the code, then it will still affect all users.

Jim

icedowl

6:54 pm on Oct 14, 2009 (gmt 0)

Thanks Jim. I'll not drop the project although it will no longer be high on my list of things to do. I can take it on at a more leisurely pace. I will say that I was rather freaked out by the sudden flood of visits from that site, but they have stopped coming for now.

The idea of using my own IP for testing is great! However, I plan to do further testing on another one of my sites that doesn't get much traffic so I won't have to wade through so much to see the results. I've bookmarked this thread.