
User agent switching

         

btherl

10:15 pm on Jun 4, 2012 (gmt 0)

10+ Year Member



216.8.179.xx - the user agents look OK, but this IP hits too many domains. Is it a bot? Let's count how many hits there are from each user agent:

 100 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; FDM; .NET CLR 2.0.50727; InfoPath.2; .NET CLR 1.1.4322)
80 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506; .NET CLR 1.1.4322; InfoPath.2; .NET CLR 3.5.21022)
88 Mozilla/5.0 (Macintosh; PPC Mac OS X; U; en; rv:1.8.1) Gecko/20061208 Firefox/2.0.0 Opera 10.00
87 Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_1; zh-CN) AppleWebKit/530.19.2 (KHTML, like Gecko) Version/4.0.2 Safari/530.19
88 Mozilla/5.0 (Macintosh; U; PPC Mac OS X; de-de) AppleWebKit/412.6 (KHTML, like Gecko) Safari/412.2
88 Mozilla/5.0 (Windows; U; Windows NT 5.1; es-AR; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11
79 Mozilla/5.0 (Windows; U; Windows NT 6.1; da) AppleWebKit/522.15.5 (KHTML, like Gecko) Version/3.0.3 Safari/522.15.5
84 Mozilla/5.0 (Windows; U; Windows NT 6.1; ja; rv:1.9.2a1pre) Gecko/20090403 Firefox/3.6a1pre
83 Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2
84 Opera/9.80 (Windows NT 6.0; U; fi) Presto/2.2.0 Version/10.00


Each UA was used between 79 and 100 times, so it looks like a list of 10 hard-coded user agents, chosen at random with roughly equal weighting.
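A tally like the one above can be produced from a standard combined-format access log with a short pipeline. This is only a sketch: the log lines here are made-up placeholder data, and it assumes the user agent is the sixth field when a combined-format line is split on double quotes.

```shell
# Sample log lines in combined format (placeholder data).
cat > access.log <<'EOF'
216.8.179.5 - - [04/Jun/2012:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "AgentA"
216.8.179.6 - - [04/Jun/2012:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "AgentA"
216.8.179.7 - - [04/Jun/2012:10:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "AgentB"
EOF

# Count hits per user agent for the suspect range: the UA string is the
# 6th double-quote-delimited field of a combined-format log line.
grep '^216\.8\.179\.' access.log \
  | awk -F'"' '{print $6}' \
  | sort | uniq -c | sort -rn
```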

I wonder when someone will use actual traffic to make a weighted UA list for their bot? Or maybe they already have? :)

wilderness

12:56 am on Jun 5, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This is a server farm.
216.8.128.0 - 216.8.191.255

Why give them access at all?

btherl

1:39 am on Jun 5, 2012 (gmt 0)

10+ Year Member



They are an ISP as well, so blocking the entire /18 isn't the way to go. We get legitimate traffic from there.

wilderness

2:30 am on Jun 5, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{REMOTE_ADDR} ^216\.8\.1(2[89]|[3-8][0-9]|9[01])\.
RewriteCond %{REMOTE_ADDR} ^!216\.8\.ISP
RewriteRule .* - [F]
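For reference, the alternation in the first condition, 1(2[89]|[3-8][0-9]|9[01]), is meant to match third octets 128 through 191, i.e. the whole 216.8.128.0/18. A quick shell sanity check (a sketch, using grep -E):

```shell
# Print every third-octet value from 0-255 that the range regex matches;
# it should emit exactly 128 through 191.
for i in $(seq 0 255); do
  if echo "216.8.$i.1" | grep -Eq '^216\.8\.1(2[89]|[3-8][0-9]|9[01])\.'; then
    echo "$i"
  fi
done
```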

keyplyr

7:08 am on Jun 5, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




I block Next Dimension, which is a colo and hosting company:

216.8.176.0 - 216.8.179.255
216.8.176.0/22

Is it a bot?

Probably not in the sense that other actual SE bots are. There are many GET tools that switch UA. The user just loads up the fields with whatever UAs he/she likes.

wilderness

11:51 am on Jun 5, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



FWIW, the correct syntax is:

RewriteCond %{REMOTE_ADDR} !^216\.8\.ISP
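Put together with the range condition from the earlier post, the corrected block would read as follows; ISP remains a placeholder for the excepted range's actual octets:

```apache
# Deny 216.8.128.0/18, excepting the ISP's sub-range.
RewriteCond %{REMOTE_ADDR} ^216\.8\.1(2[89]|[3-8][0-9]|9[01])\.
RewriteCond %{REMOTE_ADDR} !^216\.8\.ISP
RewriteRule .* - [F]
```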

dupres01

1:33 pm on Jun 10, 2012 (gmt 0)

10+ Year Member



"RewriteCond %{REMOTE_ADDR} !^216\.8\.ISP"

i think i like this solution, but i don't fully understand it. can you explain the use of .ISP? Thanks.

wilderness

2:30 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't know which range you wanted an exception for, so I used the letters ISP as a placeholder.

The leading exclamation mark negates the condition; used together with the condition for the larger range, this line allows access for that range.

wilderness

2:58 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



FWIW, using mod_rewrite to deny IPs and UAs is more efficient and more versatile than using mod_authz.

Mod_rewrite allows the webmaster to create multiple conditions that are not CPU-intensive on the server, unless (of course) your sites are primarily large and database-driven.

g1smd

4:14 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For this line:

RewriteCond %{REMOTE_ADDR} !^216\.8\.ISP

replace ISP with the relevant digits for that one ISP's range.

[edited by: g1smd at 4:36 pm (utc) on Jun 10, 2012]

wilderness

4:28 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If this is the range you want excepted and allowed access:
216.8.179.xx

Then the lines would read:

# deny the /18 range, with a single-IP exception
RewriteCond %{REMOTE_ADDR} ^216\.8\.1(2[89]|[3-8][0-9]|9[01])\.
RewriteCond %{REMOTE_ADDR} !^216\.8\.179\.xx$
RewriteRule .* - [F]

keyplyr

4:42 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month






FWIW, using mod_rewrite to deny IPs and UAs is more efficient and more versatile than using mod_authz.

Mod_rewrite allows the webmaster to create multiple conditions that are not CPU-intensive on the server, unless (of course) your sites are primarily large and database-driven.


Actually, while efficient and more versatile, using mod_rewrite to control IP and UA access with multiple conditions is much more server-intensive than a simple statement with mod_authz.

I used to use mod_rewrite to block all IPs. I now use mod_authz (with a few exceptions where I still use mod_rewrite).
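For comparison, a minimal mod_authz_host sketch of the same deny-with-exception idea, in Apache 2.2 syntax; the /24 exception range here is illustrative, not the poster's actual range:

```apache
# Deny the Next Dimension /22 but let one ISP range back in.
# With Order Deny,Allow, a matching Allow overrides a matching Deny.
Order Deny,Allow
Deny from 216.8.176.0/22
Allow from 216.8.179.0/24
```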

wilderness

4:47 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To expand on the versatility of mod_rewrite for custom solutions:

You could require that multiple conditions be met, while still allowing the finely-tuned exception range.

1) Header; I'm not real sharp on headers and someone else may correct this. It's just used here as an example of implementing multiple conditions.
2) UA contains "crap"

3) Require conditions 1 & 2 and deny from the larger range

4) The excepted IP is exempt from all conditions.

# 1) Accept-Encoding header does not contain "gzip, deflate"
RewriteCond %{HTTP:Accept-Encoding} !gzip,\ deflate
# 2) UA contains "crap" (case-insensitive)
RewriteCond %{HTTP_USER_AGENT} crap [NC]
# 3) IP is within 216.8.128.0/18 ...
RewriteCond %{REMOTE_ADDR} ^216\.8\.1(2[89]|[3-8][0-9]|9[01])\.
# 4) ... but not the excepted address
RewriteCond %{REMOTE_ADDR} !^216\.8\.179\.xx$
RewriteRule .* - [F]

wilderness

4:56 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Actually, while efficient and more versatile, using mod_rewrite to control IP and UA access with multiple conditions is much more server-intensive than a simple statement with mod_authz.


Mod_rewrite allows the webmaster to create multiple conditions that are not CPU-intensive on the server, unless (of course) your sites are primarily large and database-driven.


keyplyr,
I'm positive that I've had a larger htaccess (more lines) in place than any other participant in this forum, and for more than a decade.

My sites are simple.
No scripts, no Java, no PHP, and no MySQL.

With the exception of some occasional syntax typos, my large htaccess doesn't run up against my host's CPU limits, nor are my pages loading slowly.

As with all other website issues, each webmaster must determine what is beneficial or detrimental to their own site(s).
There's no one-size-fits-all.

I've recently explored a unique method for a custom solution (thanks to Key_Master), and although the solution functions as intended, it doesn't offer the same versatility as something similar in mod_rewrite.

g1smd

5:31 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



With no PHP or MySQL, you'll have plenty of spare processing power to accommodate larger-than-usual mod_rewrite processing.

Seriously, I couldn't get by without PHP - even for basic sites it's used for "including" common content and for various customisations.

wilderness

6:05 pm on Jun 10, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



g1smd,
My focus as a webmaster has required expanding in directions of learning that I never anticipated.
Even today, I will stick with deprecated html, rather than taking the time to grasp a new method.

Besides, widget folks couldn't care less ;)