XX.XXX.69.189 - [R_HST] - [H_REF] - msnbot/1.0 (+http://search.msn.com/msnbot.htm)
That IP/UA ignored all robots.txt instructions applicable to all bots, as well as msnbot-specific instructions and "MSN Search Web Crawler and Site Indexing" Site Owner specs [search.msn.com].
Because my robots.txt is usually meticulously adhered to when IPs/hosts are Microsoft-related, I'm concerned that the IP/host spoofed the UA, intentionally overrode its rules, or both.
I blocked the closest IP via mod_rewrite (XX.XXX.69.) and sent a Cease-and-Desist to abuse@, etc., but now I'm very wary of allowing the UA "msnbot" at all unless I also restrict it to msn.com or its IPs. (An overreaction, perhaps, but they got everything that wasn't nailed down or already UA-blocked by mod_rewrite.)
Thoughts? Ever see a spoofed msnbot?
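One way to restrict the "msnbot" UA to Microsoft hosts, as floated above, is a forward-confirmed reverse DNS check: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname back and confirm it returns the same IP. Below is a minimal Python sketch. The function name, its parameters, and the ".search.msn.com" suffix are assumptions for illustration, not something confirmed in this thread.

```python
import socket

def is_verified_crawler(ip, allowed_suffixes=(".search.msn.com",),
                        reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check (hypothetical helper).

    allowed_suffixes is an assumption; adjust it to whatever the
    crawler's operator documents. The two lookup parameters exist
    only so the logic can be exercised without network access.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: IP -> hostname (PTR record).
    try:
        host = reverse_lookup(ip)
    except OSError:
        return False

    # Step 2: the hostname must sit under an allowed domain.
    host = host.rstrip(".").lower()
    if not any(host.endswith(suffix) for suffix in allowed_suffixes):
        return False

    # Step 3: hostname -> IPs, which must include the original IP.
    try:
        addresses = forward_lookup(host)
    except OSError:
        return False
    return ip in addresses
```

A UA string alone proves nothing, so a check like this would run only for requests that claim to be msnbot, and the result would normally be cached per IP to avoid a DNS round trip on every hit.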
63.223.69.189
That's part of Beyond The Network America, Inc. (63.216.0.0 - 63.223.255.255; btnaccess.com), a wholly owned subsidiary of PCCW Limited / PCCW Global (pccw.com). From what I understand, blocking any of those IPs is a bit like taking out a chunk of the world at the knees.
I don't expect a rapid response, if any. (My Cc'd C&D resulted in autoresponders from supportamerica at pccwbtn.com and abuseresponse at btnaccess.com.)
FWIW, the size of the non-Microsoft mother ship prompted me to wonder if msnbot's name had been spoofed or if what hit me was a legit, if rudely re-engineered, msnbot running under a corporate license.
Not that it matters, really, that I'm newly wary of the name, because an abusive bot by any name is a block-worthy pain.
From what I understand, blocking any of those IPs is a bit like taking out a chunk of the world at the knees.
I'm not so sure that I agree with your analysis.
Go to ARIN and type in the following: " > 63.216." (minus the quotes). Be sure to leave the blank space between the > and 63.
You'll see that some of their ranges are provided to colocators.
bull (Jan) also provided the following on November 7 of last year:
205.177.72.206 - - [07/Nov/2005:02:16:23 +0100] "GET /foobar.htm HTTP/1.1" 403 - "-" "discovery/0.5libwww-perl/5.803"
Best, Don
I did the ARIN lookup and didn't see any 63.223. IP ranges. Also, I don't know what "bull (Jan)" is or refers to, sorry. (As you might imagine, permutations of "bull" via Google and Wikipedia generate all kinds of sites and meanings:)
At the risk of veering waaay off-topic...
I was just told that blocking, say, 63.223., would add up to a heckuva lot of addresses because it's pretty much a math thing --
63.223.69.X: X = (0-255) => 256 addresses
63.223.XX.X: XX = (0-255) x X = (0-255) => 256 x 256 => 65,536 addresses (max.)
(Right?)
-- and in a global company, 65,536 addresses could add up to countries.
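The arithmetic above can be double-checked with Python's standard ipaddress module. This is only a sketch: the CIDR strings are my own rendering of "63.223.69.X" and "63.223.XX.X", and address_count is a hypothetical helper name.

```python
import ipaddress

def address_count(cidr):
    """Return how many addresses a CIDR block covers."""
    return ipaddress.ip_network(cidr).num_addresses

# 63.223.69.X   -> one octet free  -> /24
# 63.223.XX.X   -> two octets free -> /16
for cidr in ("63.223.69.0/24", "63.223.0.0/16"):
    print(cidr, address_count(cidr))
```

The counts are upper bounds on what a block rule touches; how many of those addresses are actually in use is a separate question.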
For example, an upstream SysAdmin recently mis-typed an Asian block and accidentally firewalled chunks of Asia, Australia and New Zealand.
So getting back to bot IDs...
Presuming the bot that hit us was fake-named, I've got some .htaccess files to update. Shoot.
I did the ARIN lookup and didn't see any 63.223. IP ranges.
What I previously advised you to check returns a "subnet delegation" from ARIN for that provider's ranges, which begin at 63.216 and continue until the subnet delegations from ARIN reach the end of the provider's range.
Perhaps the remainder of their customers use dynamic IPs instead of fixed IP ranges?
In any event, 63.216.xxx.xxx to 63.223.xxx.xxx is a fairly large range not to have any customers shown in a network that obviously sells.
Especially today, when major internet providers in North America are placing great urgency on dividing formerly large IP ranges into smaller localized ranges.
Also, I don't know what "bull (Jan)" is or refers to, sorry.
Jan is a participant in this forum. His screen name is "bull".
Nobody in this forum is capable of advising you what is best for your website(s).
ONLY you have the capability of deciding what is beneficial and what is detrimental.
All I was saying is that what you perceived as a "large range"
63.216.xxx.xxx to 63.223.xxx.xxx
is really not that large.
It all depends on the market capability you may possibly derive from the range.
That's part of Beyond The Network America, Inc. (63.216.0.0 - 63.223.255.255; btnaccess.com)
deny from 63.216.0.0/13
But wait ... maybe ... is it a cloaking hunt?
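For anyone wondering whether the /13 in that deny line matches the quoted allocation, here is a quick sanity check using Python's standard ipaddress module. It is a sketch only; the addresses are the ones quoted in this thread, and the variable names are mine.

```python
import ipaddress

# Allocation as quoted above: 63.216.0.0 - 63.223.255.255
first = ipaddress.ip_address("63.216.0.0")
last = ipaddress.ip_address("63.223.255.255")

# Collapse the start/end pair into the smallest list of CIDR blocks.
blocks = list(ipaddress.summarize_address_range(first, last))

# The block named in the deny line.
deny = ipaddress.ip_network("63.216.0.0/13")

print(blocks)                                         # CIDR form of the range
print(deny.num_addresses)                             # addresses covered by the rule
print(ipaddress.ip_address("63.223.69.189") in deny)  # is the offending IP covered?
```

If blocks comes back as the single network in the deny line, the rule covers exactly the quoted allocation and nothing outside it.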
The purpose of my post was simply to report/describe a bot apparently spoofing "msnbot" and inquire if Microsoft ever licensed same.
I guess that last part is still unknown.
The purpose of my post was simply to report/describe a bot apparently spoofing "msnbot" and inquire if Microsoft ever licensed same. I guess that last part is still unknown.
MSN Search Siteowner Support: [support.msn.com]
I guess that last part is still unknown.
Good luck with any confirmation from MS!
In early 2003, when somebody from MS began crawling sites anonymously ( [webmasterworld.com...] ), the participants in this forum never did verify their identity.
There were a couple of other similar threads regarding MS around that time as well.