Forum Moderators: phranque
This is what I have..
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} MS.*FrontPage [NC]
RewriteRule !^403.*\.html$ - [F]
RewriteCond %{REMOTE_HOST} \.example\.com$
RewriteRule .* [mydomain.com...] [R=301,L]
<Limit GET>
order allow,deny
deny from #*$!.xxxx.xxx.xxx
deny from yyy.yyy.yyy.yyy
allow from all
</Limit>
thanks again group
I wouldn't bother redirecting \.example\.com to "banned.shtml". Bad bots don't follow redirects, as a rule. Just 403 them, as your first code block does.
Also, be aware that testing REMOTE_HOST with a URL-path pattern of .* can have a potentially serious performance impact on your server. This is because you are asking your server to do a reverse-DNS lookup on every single resource requested from it. If possible, look up the IP address range associated with \.example\.com, and block/redirect by IP address using %{REMOTE_ADDR} instead of by hostname.
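For example, supposing (purely as an illustration) that the lookup showed their traffic coming from the 192.0.2.* range, a block by address might look like this; the range here is a placeholder, so substitute the real addresses you find:
# Placeholder range -- substitute the real addresses found for .example.com
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
RewriteRule .* - [F]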
Are you saying that you are still getting slammed by one of the IP addresses in your "deny from" code at the bottom? If so, are you sure they are doing "GET" requests? You have set up your code so that only GET requests are restricted.
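If you want the restriction to cover every request method rather than just GET, drop the <Limit GET> wrapper entirely, for example (with yyy.yyy.yyy.yyy standing in for the real address, as in your post):
# Applies to all methods (GET, POST, HEAD, ...), not just GET
order allow,deny
deny from yyy.yyy.yyy.yyy
allow from all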
Jim
Just to show that it is in my htaccess file.
Maybe I could ban by IP address... but I think it would be exhausting, like trying to ban, say, AOL users from Dallas, Texas.
The situation is occurring through my mod_rewrite "banning."
The host/persons I am banning look something like this:
\.the\.group\.i\.ban\.example\.com thus it's less global than my \.example\.com
My concern was that maybe I didn't include a closing argument in my mod_rewrite and the process was running in some sort of endless loop. Since there are no suggestions to that effect based on the example I posted... I am guessing that is not the problem (hopefully).
I had a similar problem
[webmasterworld.com...] a year ago
ending up totaling 1 million hits.
In that case the individual was using the search engine MSN, and their host was a big-name cable company that uses Google to power its search (i.e., you go to the big-name company's website and there's a search box at the top of the page that returns Google results). So I don't believe it was a bad bot from Google, but maybe somebody trying to obfuscate the search engine, or a script kiddie trying to break into my hosting company's servers and my website.
This time it is a similar situation, but they are using Google... that is, in my log the referer looks like this:
yahdaha yahdah yahdah "http://www.google.com/search?hl=en&q=bozotheclown+big-shoes" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; AT&T WNS5.0; YComp 5.0.0.0; .NET CLR 1.1.4322)"
and that makes up a big chunk of my log.
So obviously that's not a bot.
It "feels" like a script, or like somebody taping a fishing weight down on the F5 refresh key.
So... if my mod_rewrite is good, then it is not something I have generated by mistake in the htaccess file.
Thanks so much again for your comments and time on this issue, as I expect I will not get any quantitative explanations from the offending company, g, or my host. I didn't last year with the similar incident involving MSN and the big-name cable company.
Thanks again.
The following code will loop, because there is no provision to allow 'banned.shtml' to be served, so after you do the redirect and the client returns asking for 'banned.shtml', it will get redirected again.
There are several ways to fix this, and also several ways to improve it. I'll show them cumulatively, but you can mix and match the techniques if desired:
# Prevent 'infinite' redirect loop
RewriteCond %{REMOTE_HOST} \.example\.com$
RewriteRule !^banned\.shtml$ http://www.mydomain.com/banned.shtml [R=301,L]
# Hide the banning function from the client by using internal rewrite instead of redirect
RewriteCond %{REMOTE_HOST} \.example\.com$
RewriteRule !^banned\.shtml$ /banned.shtml [L]
# Reduce horrible DNS lookup load on server by limiting check to only shtml pages
RewriteCond %{REQUEST_URI} !^/banned\.shtml$
RewriteCond %{REMOTE_HOST} \.example\.com$
RewriteRule \.shtml$ /banned.shtml [L]
# 403 unwelcome hosts instead of serving banned.shtml
RewriteCond %{REMOTE_HOST} \.example\.com$
RewriteRule \.shtml$ - [F]
In addition to increasing the server load by a factor of at least two, checking reverse DNS also incurs an additional dependency: if the DNS server you're querying is slow or broken, then your site will be slow or broken. As it is now, every incoming request to your server for every page, image, stylesheet, etc. results in an outgoing RDNS lookup request, and the incoming request cannot be served until the RDNS response comes back. You may indeed have to live with this for a while, but at least be aware of the severity of the load increase, so you can balance it against your access control needs.
Also, be aware that if you use a custom 403 error page, then it will also need to be excluded from access control in a similar way to that shown for excluding banned.shtml, in order to avoid a looping situation.
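As a sketch, if your custom error page were /403.html (the name used in your first code block), the exclusion might look like this:
ErrorDocument 403 /403.html
# Let the error page itself be served, or the 403 will loop
RewriteCond %{REQUEST_URI} !^/403\.html$
RewriteCond %{REMOTE_HOST} \.example\.com$
RewriteRule \.shtml$ - [F]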
HTH,
Jim
I would often check the logs to see if my redirects were taking place... so I don't understand how I would have missed an endless loop.
And yes... HTH.
thanks again.