Forum Moderators: Ocean10000 & incrediBILL

MSN's Stealth Missions

     
3:40 pm on Oct 8, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Reports of stealth and abuse by MSN/Bing --

MSN's many cloaked bots. [webmasterworld.com...]
MSN's many cloaked bots. Again. [webmasterworld.com...]

-- keep getting a little long in the page count so here we go, again, after a brief recap of top problems...

1.) Cloaked / bare (no rDNS) IPs from (partial listing):

65.52.
65.54.
65.55.
157.55.
207.46.

2.) Atypical hit patterns like this now-common 'no UA, no robots.txt, no referrer, 11-hits-to-same-file' visit from 65.52.33.73:

15:45:09/dir/filename.html
15:45:20/dir/filename.html
15:45:31/dir/filename.html
15:45:42/dir/filename.html
15:45:53/dir/filename.html
15:46:03/dir/filename.html
15:46:14/dir/filename.html
15:46:25/dir/filename.html
15:46:35/dir/filename.html
15:46:46/dir/filename.html
15:46:57/dir/filename.html

3.) Atypical, 'unofficial' UAs from .search.msn.com domains akin to this morning's visit from:

msnbot-207-46-204-157.search.msn.com
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)

robots.txt? NO

Cloaked UAs include:

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Win64; x64; Trident/4.0)
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; WOW64; Trident/5.0)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648)

robots.txt? NO

And last but not least, the ongoing oddity:

msnbot/2.0b (+http://search.msn.com/msnbot.htm)._
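For anyone who wants to flag the pattern in item 2 automatically, here is a minimal sketch. It assumes you have already pulled the hits for a single IP out of your log as (time, path) pairs; the function name, the 11-hit threshold and the 15-second gap are illustrative choices based on the sample above, not anything MSN documents.

```python
from datetime import datetime, timedelta

def find_repeat_bursts(hits, min_hits=11, max_gap=timedelta(seconds=15)):
    """Return (path, count) for each run of at least min_hits consecutive
    requests to the same path, with successive requests at most max_gap apart.

    hits: list of ("HH:MM:SS", path) tuples for ONE IP, in log order
    (hypothetical input format; adapt to your own log parser).
    """
    bursts = []
    run_path, run_len, prev_time = None, 0, None
    for stamp, path in hits:
        t = datetime.strptime(stamp, "%H:%M:%S")
        same_run = (
            path == run_path
            and prev_time is not None
            and timedelta(0) <= t - prev_time <= max_gap
        )
        if same_run:
            run_len += 1
        else:
            # Close the previous run before starting a new one.
            if run_len >= min_hits:
                bursts.append((run_path, run_len))
            run_path, run_len = path, 1
        prev_time = t
    if run_len >= min_hits:
        bursts.append((run_path, run_len))
    return bursts
```

Fed the eleven timestamps listed under item 2, this should report a single burst for /dir/filename.html. Lower min_hits if you would rather catch shorter runs.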
7:36 pm on Oct 8, 2011 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



A while ago bingdude visited the bing forum hereabouts and I mentioned this. He promised to get it looked into. No recent activity from him so we can only assume bing has begun taking on the google policy of popping in once and then departing for ever. :(
7:46 pm on Oct 8, 2011 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I was hoping this lot was going to be resolved several months ago.

I am very close to blocking the whole lot for good. When I have some spare time the trigger will be pulled.
6:58 am on Oct 9, 2011 (gmt 0)

5+ Year Member



I banned the ones ending in )._ ages ago.

And I am already gradually banning the ones with 'unofficial' UAs, starting with my high-volume sites.

Haven't decided about the "no rDNS" IPs yet.
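For the "no rDNS" question, one common approach is a forward-confirmed reverse DNS check: look up the PTR record, confirm the hostname sits under the expected domain, then resolve that hostname forward and confirm it maps back to the same IP. The sketch below assumes the .search.msn.com suffix seen in the hostnames reported in this thread; the function name and the injectable lookup parameters are my own, added so the logic can be tried without touching live DNS.

```python
import socket

def is_verified_msn_host(ip, reverse=None, forward=None, suffix=".search.msn.com"):
    """Forward-confirmed reverse DNS check for a single IPv4 address.

    reverse / forward are optional lookup callables (hypothetical hooks for
    testing); by default the standard socket resolver is used.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)
    except OSError:
        return False  # bare IP: no PTR record at all
    if not host.lower().endswith(suffix):
        return False  # rDNS exists but is not under the expected domain
    try:
        return ip in forward(host)  # hostname must resolve back to the IP
    except OSError:
        return False
```

A bare IP fails at the first step, which is the case under discussion. Whether to block on a False result is a policy call; the check only tells you the visitor cannot prove it belongs to that domain.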
3:51 pm on Oct 17, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



During last night's wee hours, a cloaked IP and basically a scraper UA:

207.46.92.16
Wget/1.10.2

robots.txt? NO

Absolutely not okay.
8:58 pm on Oct 18, 2011 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Further to my posting 8th Oct above:

On that day I stickied bingdude asking him to reappear. Seems like my genie-invocation spell failed: no reply from him nor has he been seen in the Bing forum for a long time. Rapped knuckles for being helpful?

Gone the way of all the google visitors. Sad, I was beginning to have hope! :(
11:05 pm on Oct 18, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Thank you for trying. But you know, in the major-SE scheme of things, I reckon we're but fleas on the rumps of elephants: insignificant, annoying, and dependent on the ride.
10:45 am on Oct 23, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



157.55.196.249
Firefox 7.0

robots.txt? NO
6:45 pm on Oct 24, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Ten minutes ago, again w/ Wget from a kin IP of the last Wget (207.46.92.16):

207.46.92.17
Wget/1.10.2

robots.txt? NO

Anyone else seeing any repeatedly/intentionally rogue UAs?
9:07 pm on Oct 24, 2011 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Yes, I see variations of those all the time and have for years. I dismissed them long ago and they're filtered from what I actually spend time following up on; maybe naively, but nonetheless.

The flags I used to research would take 2-3 hours every morning. I now filter the usual suspects (defined or not) and only spend time on the actual threats. Got it down to about an hour now :)
7:30 pm on Nov 7, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



In a twitter swarm:

65.52.21.72
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)

robots.txt? NO
4:30 am on Nov 9, 2011 (gmt 0)



I'm getting the stealth visits too from microsoft:

207.46.204.162
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)

robots.txt? NO

and

157.55.112.207
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)

robots.txt? NO

Not acting anything like a bot.
4:26 am on Nov 13, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



This just in. Note the (in)famous UA. Am amazed they're still using it:

msnbot-157-55-39-84.search.msn.com
msnbot/2.0b (+http://search.msn.com/msnbot.htm)._

robots.txt? Yes
8:28 pm on Nov 13, 2011 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I still have a block on that UA. I thought they would have fixed it along with the DNS update but hadn't yet checked it. Ah, well.
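A block on that UA can key on the stray trailing characters rather than the whole string. This is only a sketch: the function name is made up, and it assumes the "._" after the closing parenthesis is the distinguishing feature, as reported above.

```python
def has_trailing_junk(ua, junk="._"):
    """True when the user-agent ends with ')' followed by the stray characters."""
    return ua.rstrip().endswith(")" + junk)
```

It matches msnbot/2.0b (+http://search.msn.com/msnbot.htm)._ but not the same string without the trailing "._", so a later fix on their side would pass through.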
10:26 pm on Nov 14, 2011 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



They're hammering my server right now using the following UA:
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ (KHTML, like Gecko)"

More details here: [webmasterworld.com...]
2:18 pm on Nov 15, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Exact same IP and UA as reported on 11-07 above, but not post-tweet this time. I give.

65.52.21.72
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)

robots.txt? NO
7:00 pm on Nov 15, 2011 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Has anyone noticed the irony of AppleWebKit coming from MS's search.msn.com IPs?

Just thought I'd point it out in case someone wasn't paying attention ;)
8:03 pm on Nov 15, 2011 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month




Has anyone noticed the irony of AppleWebKit coming from MS's search.msn.com IPs?

It's joined the ranks of Mozilla. AppleWebKit is even used in the UA string of Android (their arch rival.) Guess it comes down to who was there first gets to name the mountain.
7:38 pm on Nov 16, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sneaky little thing MSN, from a visit today

1n:24:33 /robots.txt - 157.55.17.192 - Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
1n:25:05 /dir1/page1.asp - 157.55.17.192 - Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
1n:25:07 /dir1/page1.asp - 207.46.204.164 - Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534~~(KHTML, like Gecko)

comes as bingbot and gets the page, two seconds later comes as "regular UA" and gets nothing.

Instead of using their resources on frivolities they should crawl as a proper bot and get people's web sites indexed.

(ps : ~~ = double space)
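The bot-then-browser sequence above can be picked out of a log with a small pass like the one below. It is a sketch under assumptions: hits are already parsed into (seconds, path, ip, ua) tuples in time order, "bingbot" in the UA marks the declared crawler, and the 5-second window is arbitrary. None of these names come from a real log tool.

```python
def find_shadow_fetches(hits, window=5):
    """Find non-bingbot requests that repeat a bingbot fetch of the same path
    from a different IP within `window` seconds.

    hits: list of (seconds, path, ip, ua) tuples in time order
    (hypothetical format; seconds can be any monotonically increasing clock).
    Returns a list of (path, bot_ip, follower_ip, delay_seconds).
    """
    shadows = []
    last_bot_fetch = {}  # path -> (time, ip) of the most recent bingbot hit
    for t, path, ip, ua in hits:
        if "bingbot" in ua.lower():
            last_bot_fetch[path] = (t, ip)
        elif path in last_bot_fetch:
            bot_t, bot_ip = last_bot_fetch[path]
            if 0 <= t - bot_t <= window and ip != bot_ip:
                shadows.append((path, bot_ip, ip, t - bot_t))
    return shadows
```

On the three lines quoted above it would pair the 157.55.17.192 fetch of /dir1/page1.asp with the 207.46.204.164 follow-up two seconds later. It does not prove the two visits are related; it only surfaces candidates worth a look.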
8:54 pm on Nov 16, 2011 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I suspect the webkit UAs are doing something like google's web preview OR checking for viruses OR... Who knows? It's probably bot-ish but not pure bot.

Pity bingdude won't visit here. He's back in the Bing forum at present but for how long, who knows?
2:24 am on Dec 31, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Beats heck outta me what MSN was doing today with this laughable UA:

Mozilla/4.0 (compatible

msnbot-157-55-17-117.search.msn.com [projecthoneypot.org...]

1n:28:33 /dir/filename.html [302'd to...]
1n:28:34 /botbait/ [403]

robots.txt? NO
3:13 am on Dec 31, 2011 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



@ Pfui

Yup, I reported that a couple days ago:

[webmasterworld.com...]
2:47 pm on Jan 9, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



'No UA, no robots.txt, no referrer, 11-hits-to-same-file' visits from Microsoft's Dynamic Hosting IPs now:

NetName: MICROSOFT-DYNAMIC-HOSTING
NetRange: 70.37.0.0 - 70.37.191.255
CIDR: 70.37.0.0/17, 70.37.128.0/18

During the first week of Jan., two days apart:

70.37.161.240
-
04:19:33 /dir/filename20.html
04:19:45 /dir/filename20.html
04:19:56 /dir/filename20.html
04:20:07 /dir/filename20.html
04:20:19 /dir/filename20.html
04:20:30 /dir/filename20.html
04:20:42 /dir/filename20.html
04:20:53 /dir/filename20.html
04:21:04 /dir/filename20.html
04:21:16 /dir/filename20.html
04:21:27 /dir/filename20.html


70.37.162.57
-
21:35:43 /dir/filename41.html
21:35:54 /dir/filename41.html
21:36:06 /dir/filename41.html
21:36:17 /dir/filename41.html
21:36:29 /dir/filename41.html
21:36:40 /dir/filename41.html
21:36:51 /dir/filename41.html
21:37:03 /dir/filename41.html
21:37:14 /dir/filename41.html
21:37:25 /dir/filename41.html
21:37:37 /dir/filename41.html
8:01 pm on Jan 9, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



The MS equivalent of AWS? I've had the range 70.37.0.0 - 70.37.191.255 blocked for two years now.
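For anyone blocking that range in code rather than in the firewall, Python's standard ipaddress module can turn the start and end addresses into CIDR blocks and test membership. The is_blocked helper is an illustrative name, not part of any library.

```python
import ipaddress

# Range as reported in the whois data quoted above.
first = ipaddress.ip_address("70.37.0.0")
last = ipaddress.ip_address("70.37.191.255")

# Collapses the range into the smallest set of CIDR blocks:
# 70.37.0.0/17 and 70.37.128.0/18, matching the whois CIDR line.
blocks = list(ipaddress.summarize_address_range(first, last))

def is_blocked(ip, networks=blocks):
    """True when ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

Both 70.37.161.240 and 70.37.162.57 from the earlier post fall inside the second block.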
12:22 am on Jan 10, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



If only blocks stopped them from coming...
10:45 pm on Jan 11, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Oh, come on:

65.55.67.169
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; MSN 9.0;MSN 9.1;MSN 9.6;MSN 10.0;MSN 10.2; MSNbMSNI; MSNmen-us; MSNcIA)

robots.txt? NO
Referrer? YES

The referrer was legit. But the hit from a bare-IP MSN IP? Beats heck outta me. Employee, maybe -- 403'd because MSN plays fast and loose with its hordes, and hoards, of cloaked bots.
8:01 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Came across a new (to me) MS IP range today...

204.231.192.0 - 204.231.223.255

One of the IPs was used as a proxy for an unspecified forwarding IP so the range could include proxies or it could be a "broadband" range (with local proxy/firewall). I could get no DNS information about the range, only the whois data.
4:48 am on Feb 20, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Continuing the theme of wtf-ness:

I'd assumed that in the course of January [webmasterworld.com] I got to know all the major robot players. Today during a routine check of Bing/MS IPs, which normally results in dead silence*, I ran smack into a pile of msnbots.

Nothing new about msnbot/2.0b-- and that's just the point. Its owners [onlinehelp.microsoft.com] say it's been put out to pasture, replaced by the bingbot.** The specialized msnbot-media is still on the job, but I had to go all the way back to May of 2011 for the last vanilla msnbot. What's up? Does the msnbot know something about the Social Security system that it's not telling? Was the MSN retirement package not all that it expected?

In the middle of the msnbots-- did it think it could hide?-- was a whole slew of msnbot-NewsBlogs (their plural). They too have been around for years; they're mentioned in assorted WebmasterWorld threads. I have never met one before. (Never = since April 2011 when I started saving raw logs.)

They made a total of 16 successful requests. Half were for robots.txt, always taken in pairs. The other half were for...

Let me backtrack here. For a long time I had one unusually fat file that was inordinately popular with the wrong kind of robots. It also got the occasional search-engine hit, most of them from humans who were clearly looking for something else. Wasted time and bandwidth on all sides. A couple weeks back I cut off the first 5% of the file and saved it under the name of the original fat version. The old one got tucked away behind a new name, a nofollow link and a noindex meta tag. If humans want to read the whole thing they're welcome. Robots can jolly well go on a diet.

The newly arrived blogbot read this slimmed-down file eight separate times.

The newly pulled-from-retirement msnbot puttered around here and there-- including a single serving of robots.txt-- presumably hoping I wouldn't notice when it, too, read the slim file twice... followed by the fat file.

Well, hey. It's not google. It doesn't have to pay attention to the "nofollow" directive. And the file's already indexed, so it's not like it's seeing anything it hasn't seen dozens of times before.


* Figure of speech. It's really the computer's "Bzzt!" sound meaning "Nope, nothing here." The bingbot and the msnbot-media have already been filtered out; the plainclothes bot is blocked.
** They also say, quote, "Bing does not share IP addresses for our crawlers." I'll trade you a 65\.5[2-5]\. for a 157\.(5[4-9]|60). Anyone got a spare 207\.46\.?
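The three patterns in that last footnote can be folded into one anchored expression for filtering log lines by IP. This is a sketch only: the function name is made up, and I have added a trailing \. after the 157 group so that it cannot match inside a longer octet, which the pattern as written above does not do.

```python
import re

# Prefixes taken from the footnote: 65.52-65.55, 157.54-157.60, 207.46.
MS_PREFIX = re.compile(r"^(?:65\.5[2-5]\.|157\.(?:5[4-9]|60)\.|207\.46\.)")

def looks_like_ms_range(ip):
    """True when the dotted-quad string starts with one of the listed prefixes."""
    return MS_PREFIX.match(ip) is not None
```

This is plain prefix matching on the text of the address, so it is only as accurate as the prefix list; it says nothing about who currently holds those ranges.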
 
