Forum Moderators: open
Went straight to a non cached page. No images, no robots. Caught 403's from denied ranges.
218.166.52.190 - - [03/Aug/2006:20:15:55 -0700] "GET /mypage.html HTTP/1.0" 403 - "-" "MVAClient"
60.248.164.58 - - [03/Aug/2006:20:15:59 -0700] "GET /SamePage.html HTTP/1.0" 403 - "-" "MVAClient"
60.248.166.136 - - [03/Aug/2006:20:16:15 -0700] "GET /SamePage.html HTTP/1.0" 403 - "-" "-"
60.248.165.226 - - [03/Aug/2006:20:16:24 -0700] "GET /SamePage.html HTTP/1.0" 403 - "-" "-"
Interesting reading.
Approximate MVA for client-server systems with nonpreemptive priority [csdl2.computer.org]
[edited by: volatilegx at 2:10 pm (utc) on Aug. 4, 2006]
[edit reason] fixed side scrolling URL [/edit]
The browser agents they use are:
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Mozilla/4.0 (compatible; Trend Micro tmdr 1.2-1003)
My initial discussion of this can be found at [lunarforums.com...] The posts are somewhat long and were difficult to format, so they are not duplicated here.
My system checks out clean for spyware, except for one thing: I recently started using Trend Micro's PC-cillin Internet Security 2006, which has an AntiFraud Toolbar. I have always avoided such IE toolbars because most of them have "spy" reputations. This one would not report itself as spyware, but maybe it is doing some research on its own after all. I have emailed Trend to inquire if these IPs are related to them, and have not yet received a reply (only 1 day so far).
My current hypothesis is that the toolbar is recording the URLs I visit (including the "secret" ones on my site) and later visiting them via these crawlers. This is now almost the only reasonable explanation of how someone can find out about these secret files. Also, preliminary evidence is that these IPs do not come after secret files that I create and then never visit via HTTP. That test is ongoing.
So although I can't identify MVAClient for you, it looks to me like these IP addresses, whether malicious or not, are certainly not being "up front" about whatever it is they're up to. And they never bother with robots.txt. They're not really crawling anyway; they know exactly which files they want, and go straight for them.
[edited by: SteveWh at 4:35 am (utc) on Aug. 9, 2006]
60.248.0.0 - 60.248.255.255
This particular IP range in HINET isn't the worst I've seen, but HINET has some dangerous high-speed scrapers, which is why they are on my 'list', if you know what I mean.
Here are a couple of recent sightings in this range:
60.248.9.114 "NutchCVS/0.7.1 (Nutch; [lucene.apache.org...] nutch-agent@lucene.apache.org)"
60.248.166.136 "larbin_2.6.3 larbin2.6.3@unspecified.mail"
60.248.165.226 "larbin_2.6.3 larbin2.6.3@unspecified.mail"
60.248.166.136 "MVAClient"
60.248.165.34 "MVAClient"
Note that larbin and MVAClient have both crawled from 60.248.166.136
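A quick way to confirm that the sightings above all fall inside the 60.248.0.0 - 60.248.255.255 block is a few lines with Python's stdlib ipaddress module (just a sketch for checking log entries against a range):

```python
import ipaddress

# The HINET block quoted above, expressed as CIDR (a /16 spans x.x.0.0 - x.x.255.255)
hinet = ipaddress.ip_network("60.248.0.0/16")

# IPs pulled from the log excerpts in this thread
sightings = ["60.248.9.114", "60.248.166.136", "60.248.165.226", "60.248.165.34"]
for ip in sightings:
    print(ip, "in range:", ipaddress.ip_address(ip) in hinet)
```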
They also tried to hack my SSH from that range:
sshd:
Authentication Failures:
unknown (60-248-81-124.hinet-ip.hinet.net): 55 Time(s)
root (60-248-81-124.hinet-ip.hinet.net): 14 Time(s)
alias (60-248-81-124.hinet-ip.hinet.net): 1 Time(s)
apache (60-248-81-124.hinet-ip.hinet.net): 1 Time(s)
ftp (60-248-81-124.hinet-ip.hinet.net): 1 Time(s)
mysql (60-248-81-124.hinet-ip.hinet.net): 1 Time(s)
named (60-248-81-124.hinet-ip.hinet.net): 1 Time(s)
postgres (60-248-81-124.hinet-ip.hinet.net): 1 Time(s)
Not nice.
... deny the entire Class A.
If by that you mean "deny from 60." - that is a very dangerous general recommendation.
I live in Australia and some blocks under 60. are allocated to Australian businesses - notably Telstra clients (the national Telco). We would be cutting off some clients if we denied that whole block. This may also be true for NZ and Pacific island countries.
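For a sense of scale, here's a sketch (again using Python's stdlib ipaddress module) of how much wider "deny from 60." is than blocking just the range reported in this thread:

```python
import ipaddress

whole_a = ipaddress.ip_network("60.0.0.0/8")        # what a blanket "deny from 60." matches
hinet_block = ipaddress.ip_network("60.248.0.0/16")  # the HINET range actually at issue

# The /8 is 256 times larger and sweeps in unrelated Oceanic allocations
print(whole_a.num_addresses)      # 16777216 addresses
print(hinet_block.num_addresses)  # 65536 addresses
```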
All this was to allow access to approximately 5% of my overall visitors.
At that time I did not even explore (or perhaps I was unable to locate) any Oceanic numbers in the 60-Class.
Today, the only changes that I'm making on my end (regarding Oceanic IP's) is to DENY ranges.
Each webmaster decides on their own what is beneficial or detrimental to their own websites.
Had I the three weeks I spent a few years back to do over?
I would NOT make that effort today.
Rather, I'd wait for people within the widget industry to contact me and then make access adjustments.
Of course, if that's the region of the world in which you reside?
Then your outlook is entirely different than mine.
Don
Then your outlook is entirely different than mine.
I fully understand and respect your personal outlook. I was pointing out that making an unqualified suggestion to block that Class A range is rather a dangerous thing to do. People from all over the world read this forum and sometimes quite blindly act on the advice.
With regard to blocking whole countries, on one of our sites I serve up different PHP includes after the incoming IP has been vetted by Maxmind GeoIP country lite (free version). Australia and NZ get one version (99.9% of customers come from there) and the rest of the world gets a different version.
As the IP blocks change very regularly, Maxmind issue a new DAT file once per month. It is a refined solution for what it appears you tried to do by hand.
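The vetting logic described above might look roughly like the following sketch. Note that lookup_country() here is a stand-in stub, not Maxmind's actual API, and the AU range used is purely illustrative, not a real allocation:

```python
import ipaddress

# Hypothetical stand-in for a GeoIP country database; real code would
# query Maxmind's monthly-updated data file instead of this list.
OCEANIA_DEMO = [ipaddress.ip_network("60.224.0.0/12")]  # illustrative AU block only

def lookup_country(ip: str) -> str:
    # Stub: returns "AU" for the demo range, "OTHER" for everything else
    addr = ipaddress.ip_address(ip)
    return "AU" if any(addr in net for net in OCEANIA_DEMO) else "OTHER"

def choose_include(ip: str) -> str:
    # AU/NZ visitors get the local version of the page; the rest of the
    # world gets a different one, mirroring the PHP-include scheme above
    return "local.inc.php" if lookup_country(ip) in ("AU", "NZ") else "world.inc.php"

print(choose_include("60.224.1.1"))
print(choose_include("8.8.8.8"))
```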
[edited by: Mokita at 3:16 am (utc) on Aug. 10, 2006]
Not sure why you spent weeks sorting it out as the allocations are identified here:
[ftp.apnic.net...]
They ID the country each allocation belongs to, so with a little data crunching you'll know which ranges belong to which countries and can block them with ease.
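The "data crunching" can be small. Here's a sketch that converts one line of the delegated-file format (registry|cc|type|start|count|date|status) into a country code and CIDR block. The sample line is illustrative, and the conversion assumes the address count is a power of two; records that aren't would need splitting into several CIDRs:

```python
# Illustrative line in the APNIC delegated-file format (not a verified record)
line = "apnic|TW|ipv4|60.248.0.0|65536|20031211|allocated"

def to_cidr(record: str):
    registry, cc, kind, start, count, *_ = record.split("|")
    if kind != "ipv4":
        return None  # skip asn/ipv6 records
    n = int(count)
    # For a power-of-two count, prefix length = 32 - log2(count)
    prefix = 32 - (n.bit_length() - 1)
    return cc, f"{start}/{prefix}"

print(to_cidr(line))
```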
I was pointing out that making an unqualified suggestion to block that Class A range is rather a dangerous thing to do. People from all over the world read this forum and sometimes quite blindly act on the advice.
Mokita,
If people from "all over the world" are silly enough to blindly take my suggestions without
spending the time to go through the archives of Webmaster World and determining that I'm quite severe in restricting visitors to non-North American IP ranges (and even more unforgiving with NA IP ranges), then they deserve what they get.
"They" would not purchase any item without surveying it's worthiness!
So why should "they" make any less of an effort at SESI and take either you, myself or any other particiant as written word.
I learned some things about bots and htaccess prior to joining SESI.
And I still learn today.
However, I never make changes to my websites or my websites' administration without exploring the consequences.