YesupBot - No soup for you!

Bot or real visitor?

         

caribguy

10:27 pm on Nov 27, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Other than the obvious 86-ing, can someone enlighten me about Clicksor / YesupBot?

It attempted to log in using credentials from an Optimum Online user who registered in April with a (verified) Yahoo email address. The attempt failed only because the login URL it used was all lower case. The account has now been disabled.

UA: "Mozilla/5.0 (compatible; YesupBot/1.0; +http://www.yesup.net/bot.html)" The content on their robots page is rather terse: "Web crawler by Clicksor.com advertising network. If you have any questions, please feel free to contact publisher@clicksor.com"

The UA generated:
4 hits from 3 ip addresses in the 38.99.186.nnn range on [26/Nov/2008:14:25:nn -0600]
1 more from the same range on [27/Nov/2008:02:23:08 -0600]
another 15, also same range on [27/Nov/2008:10:3x:xx -0600]
and at the same time, one hit from 66.48.78.nnn

Note: There is no advertising -at all- on this website.

wilderness

4:27 pm on Nov 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



4 hits from 3 ip addresses in the 38.99.186.nnn range

Do yourself a favor and deny all of this provider's [google.com] ranges.

caribguy

7:13 pm on Nov 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



All of 38.0.0.0/8 ?

I've got this now, the all pop no fizzle bot:
^(38\.99\.13\.12[1-6]¦38\.99\.44\.10[1-6]

another well known pest from the same range:
38\.105\.83\.([0-9]¦[1-9][0-9]¦1[0-1][0-9]¦12[0-7])

and the latest critter:
¦38\.99\.186\.([0-9]¦[1-9][0-9]¦1[0-1][0-9]¦12[0-7])
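A minimal sketch in Python for checking what that last-octet alternation covers, assuming the forum's broken-bar characters are replaced with real pipes. The trailing `$` is an addition that is not in the fragments above; without an end anchor, the `[0-9]` alternative alone matches the first digit of any last octet. The names `LAST_OCTET` and `NET` are made up for illustration.

```python
import ipaddress
import re

# Last-octet alternation from the post, with real pipes and an added "$".
LAST_OCTET = re.compile(
    r"^38\.99\.186\.([0-9]|[1-9][0-9]|1[0-1][0-9]|12[0-7])$"
)

# The same span written as a CIDR block (lower half of the /24).
NET = ipaddress.ip_network("38.99.186.0/25")

# Collect every last octet each approach accepts.
regex_hits = {n for n in range(256) if LAST_OCTET.match(f"38.99.186.{n}")}
cidr_hits = {
    n for n in range(256)
    if ipaddress.ip_address(f"38.99.186.{n}") in NET
}

print(regex_hits == cidr_hits)         # both cover last octets 0 through 127
print(min(regex_hits), max(regex_hits))
```

With the end anchor in place, the alternation and 38.99.186.0/25 describe the same 128 addresses.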

Edit: never mind :) Should have used the search first...
[google.com...]

Thanks Wilderness!

caribguy

9:32 pm on Nov 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Spoke too soon! Looks like not all of PSI is bad, these are legitimate visitors for my sites:

38.101.20.0/24 - Tampa General Hospital
38.105.67.128/26 - Washingtonian Magazine

Any suggestions to improve on the following illegible puke? Is this even valid: 10[02-46-9] ?

RewriteCond %{REMOTE_ADDR} ^(38\.([0-9]¦[1-9][0-9]¦1[02-46-9][0-9]¦2[0-4][0-9]¦25[0-5])) [OR] # 38. except 38.101 and 38.105
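A rough check of that pattern in Python, assuming real pipes in place of the forum's broken bars and assuming Python's `re` treats these constructs the same way as the regex engine behind mod_rewrite. It looks at two things: whether a class like `[02-46-9]` compiles at all, and which second octets the pattern would skip. The `anchored` variant with a trailing `\.` is a hypothetical tweak, not something from the post.

```python
import re

# [02-46-9] reads as: literal 0, range 2-4, range 6-9.
digit_class = re.compile(r"^[02-46-9]$")
accepted = [d for d in "0123456789" if digit_class.match(d)]
print(accepted)

# Pattern as posted: nothing after the second octet, so the single-digit
# alternative [0-9] can match just the first digit of "101".
posted = re.compile(
    r"^(38\.([0-9]|[1-9][0-9]|1[02-46-9][0-9]|2[0-4][0-9]|25[0-5]))"
)
print(bool(posted.search("38.101.20.5")))

# Hypothetical variant with a trailing "\." so the whole octet must match.
anchored = re.compile(
    r"^38\.([0-9]|[1-9][0-9]|1[02-46-9][0-9]|2[0-4][0-9]|25[0-5])\."
)
missed = [n for n in range(256) if not anchored.search(f"38.{n}.0.1")]
print(missed)
```

In this sketch the class sits on the tens digit, so the anchored variant skips 110-119 and 150-159 rather than 101 and 105, and the pattern as posted skips nothing at all.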

and the next RewriteCond to exclude 38.101 and 38.105 not part of the above /24 and /26
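For comparison, the same intent (block 38.0.0.0/8 but let the two subnets above through) can be sketched with CIDR arithmetic instead of octet regexes. This is only an illustration in Python using the standard `ipaddress` module, not an Apache configuration; `should_block`, `BLOCKED` and `ALLOWED` are made-up names.

```python
import ipaddress

# The whole range under discussion.
BLOCKED = ipaddress.ip_network("38.0.0.0/8")

# Subnets named in the post as legitimate visitors.
ALLOWED = [
    ipaddress.ip_network("38.101.20.0/24"),    # Tampa General Hospital
    ipaddress.ip_network("38.105.67.128/26"),  # Washingtonian Magazine
]


def should_block(ip: str) -> bool:
    """Return True when ip is inside 38/8 and not in an allowed subnet."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in ALLOWED):
        return False
    return addr in BLOCKED


for ip in ("38.99.186.5", "38.101.20.7", "38.105.67.130", "38.105.67.10"):
    print(ip, should_block(ip))
```

Checking the allow list first keeps the exceptions readable, and the /26 boundary (last octets 128 through 191) is handled by the library rather than by a hand-built alternation.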

wilderness

12:14 am on Nov 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



38.101.20.0/24 - Tampa General Hospital
38.105.67.128/26 - Washingtonian Magazine

Neither of the above "names" has registered subnets through ARIN, or in the Class A (38) range.

Any suggestions to improve on the following illegible puke? Is this even valid: 10[02-46-9] ?

Not sure what this is? It is NOT valid syntax.
Perhaps you could provide more detail as to which IPs you're attempting to deal with?

exclude 38.101 and 38.105

#Deny 38-all and allow 101 & 105
RewriteCond %{REMOTE_ADDR} ^38\.([0-9]¦[1-9][0-9]¦100¦10[234]¦1[6-9][0-9]¦2[0-5][0-9])\.

Please NOTE: corrections are required before use, because the forum breaks pipe characters. Replace each broken bar (¦) with a real pipe (|).

As an aside, you need to determine WHY this character is being inserted in your forum entries and/or your text syntax. It's not valid and will prevent your syntax from functioning.

edited by wilderness.
Removed leading and incorrect parentheses.

wilderness

1:01 am on Nov 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've butchered this.
My sincere apologies.

Correct line should read:

#Deny 38-all and allow 101 & 105
RewriteCond %{REMOTE_ADDR} ^38\.([0-9]¦[1-9][0-9]¦100¦10[234]¦10[6-9]¦1[1-9][0-9]¦2[0-5][0-9])\.
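A quick sanity check of the corrected line, sketched in Python with the broken bars replaced by real pipes. It assumes Python's `re` handles this pattern the same way as the regex engine used by mod_rewrite; `is_denied` is a hypothetical helper name.

```python
import re

# Corrected pattern from the post, with real pipe characters.
PATTERN = re.compile(
    r"^38\.([0-9]|[1-9][0-9]|100|10[234]|10[6-9]|1[1-9][0-9]|2[0-5][0-9])\."
)


def is_denied(ip: str) -> bool:
    """Return True when the pattern matches the address string."""
    return PATTERN.search(ip) is not None


# Second octets in 0..255 that the pattern leaves alone.
allowed = [n for n in range(256) if not is_denied(f"38.{n}.0.1")]
print(allowed)
```

In this check the only second octets left unmatched are 101 and 105, which is what the comment line promises. Note that it lets through all of 38.101.x.x and 38.105.x.x, not just the /24 and /26 mentioned earlier.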

dstiles

1:24 am on Nov 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Most stuff from 38.*.*.* is at least annoying if not dangerous but I've allowed two IPs in the range 38.113.234.* for VOYAGER/KOSMIX which seems to be useful for a few of my UK sites.

keyplyr

9:53 am on Nov 30, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've got a site in the PSI range and even I block much of it.

caribguy

3:16 am on Dec 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks again!

You're right about those not being visible through ARIN; I had to query rwhois.cogentco.com directly.

Meanwhile, I've found a bunch of others in the 38.n.n.n range that I don't want to be blocked: BBC World, a Top 10 US law firm, a school district, etc. Seems I've got some work cut out for myself...

incrediBILL

4:04 am on Dec 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



a school district

Slightly off topic but I loathe school districts...

Some little twerp hit my site for the same page about 14K times in about 15 min the other day and started a cascade failure with CPU intensive pages because of all the resources being used.

Server started burping and coughing, sent out some alarms, but was still wheezing along when I managed to get to the console and block their IP in the firewall.

Sometimes even Apache burns too many resources during a DDOS, and I'm running a dual XEON box, sheesh!

[edited by: incrediBILL at 4:05 am (utc) on Dec. 3, 2008]

dstiles

8:40 pm on Dec 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's difficult to deal with schools on our server. Some web sites need the traffic but we have a couple of non-school sites that are hit by schools as part of the curriculum.

Most are behind firewalls or proxies run by a handful of industry-oriented ISPs. Many of them are reasonably well behaved but we've had to implement site-specific blocks on a few high-speed scrapers that seem to take every page going, even when blocked.

Not sure if this is deliberate policy by the ISPs managing the proxies or under the control of the users - there are some damn canny kids out there now!

Samizdata

9:10 pm on Dec 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not sure if this is deliberate policy

At the risk of straying even further off-topic, don't forget to factor in the filtering software that a lot of these institutions will be using - they can be more out of control than the kids.

...