Is it possible to configure htaccess to allow controlled page views?

     
7:49 pm on Dec 11, 2011 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 18, 2007
posts:133
votes: 0


Is it possible to configure htaccess to allow a visitor from a 'unique IP' to visit a set number of pages before he is blocked?

I have a site with around 100 pages, but in the past few days I have been bombarded by rogue IPs, each with more than 1000 page views. So they are probably requesting the same pages over and over again, increasing server load and bandwidth use.

I have blocked a range of rogue user-agents in htaccess to no avail. I have even blocked IPs originating from spam-producing countries like China. That did not help, because all these IPs seem to originate in the US and all have valid, verified DNS.

So I was wondering whether there is a way of letting a visitor view a set number of pages, say around 50 pages per visit per 24 hours, and automatically blocking him once he reaches that limit. Is there any way of doing that using htaccess?
9:41 pm on Dec 11, 2011 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3274
votes: 19


Can't help with htaccess but are you certain the pages are all yours? See my note re: PHP scans in your other thread.

You should be able to block anything that has bad header fields (e.g. User-Agent, but there are several others). Also, if your site does not use a certain file extension (e.g. PHP), then block requests for it. Etc.
10:15 pm on Dec 11, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5507
votes: 5


I have blocked a range of rogue user-agents in htaccess to no avail.


(FunWebProducts|Trident)

Have you tried denying access to this one, which is rampant in your listing of UA's?
These are compromised machines, using a compromised tool bar.
9:05 am on Dec 12, 2011 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 18, 2007
posts:133
votes: 0


I did block FunWebProducts, but did not have much success. This is what I did. I hope it is correct, because even after I blocked FunWebProducts, I was still getting traffic from them:

RewriteCond %{HTTP_USER_AGENT} ^FunWebProducts [NC,OR]

According to the Microsoft website, Trident is a legitimate token, with Trident/5.0 identifying Internet Explorer 9.

But oddly enough, a large share of the 'blank referrer' hits I am getting come from this user-agent string:

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)

And according to the Microsoft website, this is a legitimate string. Could it be that the user agent is spoofed?

[edited by: spiritualseo at 9:42 am (utc) on Dec 12, 2011]

9:36 am on Dec 12, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15869
votes: 869


Whoops!

^FunWebProducts means "FunWebProducts" has to be the very first thing in the UA. The word [OR] had better mean that you are going to name a second condition-- unless your real motive is to check out your custom 500 page. (Been there. Done that.) If the second Condition is UA Trident, you can achieve the same thing by

RewriteCond %{HTTP_USER_AGENT} (FunWebProducts|Trident)

with no anchors. That's what wilderness meant.
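
A RewriteCond does nothing by itself; it only qualifies the RewriteRule that follows it. As a minimal sketch of how the unanchored condition could sit in a complete rule, assuming mod_rewrite is available and the lines go in the site's root .htaccess (the [F] action is an assumption, since the thread never shows the rule the original poster actually used):

```apache
# Sketch only: the 403 response via [F] is an assumption, not the poster's actual rule.
RewriteEngine On

# No ^ or $ anchors, so the pattern may match anywhere in the User-Agent header.
# No [OR] flag either, because no further condition follows this one.
RewriteCond %{HTTP_USER_AGENT} (FunWebProducts|Trident)
# "-" leaves the URL unchanged; [F] refuses the request with 403 Forbidden.
RewriteRule ^ - [F]
```

Because Trident also appears in genuine Internet Explorer 8 and 9 user agents, a rule like this would refuse those visitors as well.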

Could it be that the user agent is spoofed?

Easily :( Just yesterday my site-- possibly even the whole server conglomeration-- got hit by a botnet all claiming to be

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0

which is about as legitimate as they come.

[edited by: lucy24 at 9:38 am (utc) on Dec 12, 2011]

9:38 am on Dec 12, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5507
votes: 5


#UA begins with FunWebProducts
RewriteCond %{HTTP_USER_AGENT} ^FunWebProducts [NC,OR]

which none of your UAs do.

Change to

#UA contains FunWebProducts
RewriteCond %{HTTP_USER_AGENT} FunWebProducts [NC,OR]

However, IMO you'll not lose enough visitors to matter by including Trident:

RewriteCond %{HTTP_USER_AGENT} (FunWebProducts|Trident) [NC,OR]

On reflection, all the UAs you provided also contained

Windows NT 6.1

You could also use this as an additional condition to spare more innocents (and drop the NC flag, since a change of case will never happen in either of the UAs):

#UA contains both 6.1 and FunWeb or Trident
RewriteCond %{HTTP_USER_AGENT} Windows\ NT\ 6\.1
RewriteCond %{HTTP_USER_AGENT} (FunWebProducts|Trident)

The possibilities are endless.
Another:

#UA ends with Trident 5.0
RewriteCond %{HTTP_USER_AGENT} Trident/5\.0\)$

another option:
"contains 6.1" and ends with "Trident 5.0"
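
That last option is only described in words, so here is one way it might be written out. It is a sketch under the same assumptions as the rest of the thread (mod_rewrite enabled, rules in the root .htaccess), and the [F] action is again an assumption. Consecutive RewriteCond lines without [OR] are ANDed, so both must match before the rule fires:

```apache
# Sketch only: blocks UAs that contain "Windows NT 6.1" AND end with "Trident/5.0)".
RewriteEngine On

# Spaces, dots and the closing parenthesis are escaped so they match literally.
RewriteCond %{HTTP_USER_AGENT} Windows\ NT\ 6\.1
# $ anchors the match to the very end of the User-Agent header.
RewriteCond %{HTTP_USER_AGENT} Trident/5\.0\)$
RewriteRule ^ - [F]
```

Bear in mind that a genuine Internet Explorer 9 on Windows 7 sends a user agent of exactly this shape, so real visitors using that browser would be refused too.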
10:08 am on Dec 12, 2011 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 18, 2007
posts:133
votes: 0


Thanks for that! I added FunWebProducts to htaccess as follows and removed the 'empty referrer' block, but got flooded again in a matter of seconds:

RewriteCond %{HTTP_USER_AGENT} FunWebProducts [NC,OR]

And the user agents this time are all over the place:

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; GTB7.2; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; InfoPath.2; .NET CLR 3.5.30729; .NET CLR 3.0.30618; WinNT-PAI 26.08.2009; AskTbLMW2/5.13.2.19379)

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.24) Gecko/20111103 Firefox/3.6.24 GTB7.1

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; BO1IE8_v1;ENUS; InfoPath.1)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E; MS-RTC EA 2)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SearchToolbar 1.2; GTB7.2; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C; AskTbPF/5.13.1.18107)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; WinNT-PAI 13.07.2009; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E; BRI/2)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; SearchToolbar 1.2; .NET CLR 1.1.4322)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SIMBAR={EAB88C4A-74E2-48D9-B98A-F879BAD74A04}; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; InfoPath.2; .NET CLR 3.5.30729; .NET CLR 3.0.30618)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; AskTbWCL2/5.13.1.18107)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Media Center PC 3.0; .NET CLR 1.0.3705; .NET CLR 1.1.4322)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; patch:05689)

I blocked 'empty referrer' again and then recorded some legit user agents that had the strings Trident and Windows NT in them. Here are a few that belonged to legit visitors referred by search engines:

Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

Also, the IPs are all over the place and I am not able to figure out an IP range to block, because they are all very different. In almost every case all 4 parts of the address are unique.

So for the time being, I don't think I have any choice but to keep blocking the empty referrer!
10:48 am on Dec 12, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15869
votes: 869


Well, you can definitely lock out MSIE 6. I keep a small loophole that goes like this:

RewriteCond %{HTTP_USER_AGENT} MSIE\ [56]\.\d
RewriteCond %{HTTP_REFERER} !\?
RewriteRule (\.html|/)$ goaway.html [L]

See if you can figure out how it works.

/goaway.html is a special page that includes a link to an obfuscated e-mail address for the use of humans with very, very old browsers. (They exist in my very, very narrow niche ;) so I can't just exclude them) It's a lightweight page, even smaller than your ordinary 404 or 403. Notice that the rule only applies to requests for pages; only a human would ask for the associated images and style sheets, and then only if they're already on the page.
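
For anyone who would rather not puzzle it out, here is one reading of that rule with comments added. The directives are unchanged from the post; the comments are an interpretation, and they assume an Apache 2.x server, where the PCRE-style \d is understood:

```apache
# Condition 1: the User-Agent claims to be MSIE 5.x or 6.x.
RewriteCond %{HTTP_USER_AGENT} MSIE\ [56]\.\d
# Condition 2 (ANDed with the first): the Referer does NOT contain a literal "?".
# A visitor arriving from a URL with a query string, such as a search results
# page, fails this test and is let through. An empty Referer passes it.
RewriteCond %{HTTP_REFERER} !\?
# Only page requests (ending in .html or /) are rewritten to the small page.
# [L] stops processing further rules for this request.
RewriteRule (\.html|/)$ goaway.html [L]
```

Read this way, the "loophole" is the second condition: an old browser that arrives from a search results page still gets the real page, while one with no referer, or a referer without a query string, gets goaway.html instead.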