Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
MSN Bot
wilderness




msg:4475684
 8:22 pm on Jul 14, 2012 (gmt 0)

Is this still a valid version?

msnbot/2.0b

 

lucy24




msg:4475717
 12:08 am on Jul 15, 2012 (gmt 0)

Do you have a sitemap? That seems to be all they use it for.

wilderness




msg:4475751
 2:15 am on Jul 15, 2012 (gmt 0)

lucy,
many thanks.
Generally speaking, that may be their procedure; however, the bot has followed a Facebook link (grabbing that link and two other related pages, and then proceeding to the sitemap).

It should be noted that these three initial pages are not themselves listed in the sitemap; only their parent pages are.

I do have another sitemap (listing each individual page on my site), which I submitted in Google Webmaster Tools. The other major SEs offer fee-based submissions, unlike Google.

FWIW, I don't participate in Facebook, nor do I allow Facebook referrals. (The referring links that Facebook provides are useless and to a generic page rather than the actual page.) Facebook is simply not beneficial to my site.
Additionally, any visitor that links to one of my pages on Facebook gets their IP denied for future visits.

My situation (content) is quite unique and I would NOT suggest these practices for other webmasters.

incrediBILL




msg:4475804
 6:08 am on Jul 15, 2012 (gmt 0)

I've noted one recent occurrence of that user agent, from a single IP, and it didn't ask for a sitemap.

I'm almost of the opinion that it's pretty safe to ignore it these days and only accept Bingbot, but since it doesn't cost any extra to support both, I still do.

Additionally, any visitor that links to one of my pages on Facebook gets their IP denied for future visits.


That's hysterical.

However, I'm unclear on why anyone would deny Facebook traffic as social media can often drive better traffic than a search engine so you're in theory just irritating valid customers.

Fascinating topic, you should start a thread about it in the Facebook forum [webmasterworld.com] as this is a perspective I'm sure would be quite controversial.

Kind of like how I block all Twitter bots (leeches), except those from Twitter itself, but I don't block Twitter users as that would truly defeat the purpose.

keyplyr




msg:4475812
 8:17 am on Jul 15, 2012 (gmt 0)



I agree with Bill. I get double (often triple) digit human visits daily from FB. I even place the occasional incoming link myself, although I try to keep a quiet presence there and don't waste my time with it much.

My theory is to use social media but not be a part of it.

I actually keep a more consistent, active presence on Twitter although my tweets are scheduled from a cron with the occasional manual tweet. Great proliferation for my advert customers.

lucy24




msg:4475839
 12:59 pm on Jul 15, 2012 (gmt 0)

People who get their recommendations from Facebook are welcome to visit, but they can jolly well do it without hotlinking. Only problem is that when you don't let the facebookexternalhotlink robot pick up pictures, it keeps pounding on your door hour after hour like a blasted missionary. They give up eventually, but boy is it annoying.

I was thinking of rewriting it to my administrative gif. At 1x1 px transparent, it would be pretty hard for the original recommender to select one for future hotlinking. Unless, ahem, they randomly happened to click on just the right pixel in a seemingly empty screen. And it involves fewer bytes than either the no-hotlinks image or a curt 403.

But the human facebook users seem to be perfectly willing and able to recommend a site even without benefit of a little hotlinked picture.

wilderness




msg:4475848
 1:50 pm on Jul 15, 2012 (gmt 0)

However, I'm unclear on why anyone would deny Facebook traffic as social media can often drive better traffic than a search engine so you're in theory just irritating valid customers.


I agree with Bill. I get double (often triple) digit human visits daily from FB.


You guys must be extra-terrestrial?

How do you locate the actual page on FB that is linking to your page?
Without this insight (referring page) how do you determine if the link is detrimental or beneficial?

you should start a thread about it in the Facebook forum


Bill,
I've previously explained that I've no interest in FB or its traffic, why would I care to start a thread in that forum?
I may as well go over to the Apache Forum and start a thread on Joomla, in which I've no interest either.

wilderness




msg:4475850
 1:58 pm on Jul 15, 2012 (gmt 0)

People who get their recommendations from Facebook are welcome to visit, but they can jolly well do it without hotlinking.


lucy,
Just for clarification, I wasn't referring to image hot-linking (I haven't allowed image hot-linking or image caching in more than a decade; e.g., msnbot-media from another thread), but rather to standard in-line URL linking.

keyplyr




msg:4475869
 3:59 pm on Jul 15, 2012 (gmt 0)

facebookexternalhotlink robot pick up pictures

There is no "facebookexternalhotlink robot" and if you truly believe this then you have not done your homework.

Just for clarification Lucy (and I'm positive I have outlined this in a prior thread you participated in) the FB platform does not allow any member to post a file not found in their account - can't be done, cannot hot-link to a file on another server.

Now, the FB member could come to your web site on their own using a web-browser and steal a file, save it to their machine, then upload it to their FB account, but how is that different from any other thief and how does it have anything to do with FB?

Dislike FB all you like, but let's be accurate here.

keyplyr




msg:4475870
 4:22 pm on Jul 15, 2012 (gmt 0)

Also, in response to the OT:

msnbot/2.0b (+http://search.msn.com/msnbot.htm)

It does full crawls on 2 of my sites every single day. On the other sites I mostly see bingbot, but I still see msnbot/2.0b fetching files other than sitemap.xml.

wilderness




msg:4475871
 4:31 pm on Jul 15, 2012 (gmt 0)

Many thanks, keyplyr.

I had a nagging recollection that the beta (b) version was no longer valid.

incrediBILL




msg:4475882
 7:23 pm on Jul 15, 2012 (gmt 0)

I've previously explained that I've no interest in FB or its traffic, why would I care to start a thread in that forum?


LOL. fair enough.

However, whether YOU use Facebook or not, it sounds like people who visit your site do use it. I figured a thread about why to block Facebook traffic, with your POV, would be interesting.

There is no "facebookexternalhotlink robot" and if you truly believe this then you have not done your homework.


I think it's this she refers to:

USER AGENT: "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
IP: 66.220.149.248
IP: 66.220.153.244
IP: 69.171.234.5
IP: 69.171.234.6

They also have something that goes by "Facebook share follower"

Back to the OT:

I wouldn't dismiss msnbot yet, as long as it still does full trip DNS validation :)

FWIW, the MS rep for msnbot at SES San Jose in '06 was the first to jump up and say they would support full trip DNS validation, followed by the ASK guy, putting the pressure on Google to comply as well.
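For readers unfamiliar with the term, "full trip DNS validation" is usually taken to mean: reverse-resolve the visiting IP to a hostname, check that the hostname belongs to the crawler's domain, then forward-resolve that hostname and confirm it maps back to the same IP. Here is a minimal Python sketch of that idea. The function name `full_trip_dns_ok`, the injectable `reverse`/`forward` parameters, and the `.search.msn.com` suffix are assumptions for illustration, not anything posted in this thread; check the suffix against the rDNS names in your own logs before relying on it.

```python
import socket

# Hostname suffix assumed for Microsoft's crawlers (an assumption, verify it yourself).
ALLOWED_SUFFIXES = (".search.msn.com",)


def full_trip_dns_ok(ip, suffixes=ALLOWED_SUFFIXES,
                     reverse=socket.gethostbyaddr,
                     forward=socket.gethostbyname_ex):
    """Return True if ip -> hostname -> ip round-trips within an allowed domain."""
    # Step 1: reverse lookup (PTR). A missing record raises an OSError subclass.
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False

    # Step 2: the hostname must end in one of the allowed suffixes.
    if not hostname.lower().rstrip(".").endswith(suffixes):
        return False

    # Step 3: forward lookup must include the original IP.
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

The resolver functions are passed in as parameters so the logic can be exercised without touching the network; in normal use you would call `full_trip_dns_ok(remote_ip)` and cache the answer, since two DNS lookups per request is slow.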

dstiles




msg:4475909
 8:38 pm on Jul 15, 2012 (gmt 0)

I think your IPs are a bit sparse, Bill. I have well over 2 dozen RANGES never mind IPs. I have now reduced them to:

66.220.144.0 - 66.220.159.255
69.63.176.0 - 69.63.191.255
69.171.224.0 - 69.171.255.255

Many of the IPs are not bot-ish but I found I was continually adding IPs so I enabled the full ranges. No obvious adverse effect so far and in any case I parse the bot UA. :)
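The "full range plus UA check" approach described above could be sketched roughly as follows in Python. The three ranges are the ones listed in this post; the names `FACEBOOK_RANGES`, `in_listed_ranges` and `looks_like_facebook_fetcher` are made up for illustration, and matching on the substring `facebookexternalhit` is an assumption based on the user agent quoted earlier in the thread.

```python
import ipaddress

# Ranges as posted above (first address, last address).
FACEBOOK_RANGES = [
    ("66.220.144.0", "66.220.159.255"),
    ("69.63.176.0", "69.63.191.255"),
    ("69.171.224.0", "69.171.255.255"),
]

# Convert each first-last pair into one or more CIDR networks.
NETWORKS = [
    net
    for first, last in FACEBOOK_RANGES
    for net in ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
]


def in_listed_ranges(ip):
    """True if the IP falls inside any of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NETWORKS)


def looks_like_facebook_fetcher(ip, user_agent):
    """Require both a listed IP and the expected UA substring (assumed)."""
    return in_listed_ranges(ip) and "facebookexternalhit" in user_agent.lower()
```

All four IPs Bill posted fall inside these ranges, so a call such as `in_listed_ranges("66.220.149.248")` returns `True`.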

lucy24




msg:4475938
 11:29 pm on Jul 15, 2012 (gmt 0)

I'm positive I have outlined this in a prior thread you participated in

Ah, that was you. I remember someone giving an impassioned defense of FB. But earlier, someone else laid out the exact mechanism of how FB allows members to select an image for linking to a page.

Bottom line: I look at my logs. With no blocking or rewriting, the pattern is:

One ordinary human visit.
One matching visit by facebook within a few minutes after the human visit. (FB members apparently have short memories and must race to recommend the page before they forget.)
Multiple hits to a single image from that page, with some specific FB page as referer, scattered over the following days.
Occasional human visits to the complete page, with the "Danger! You are leaving FB!" page as referer.

That leaves me in the position of "Who am I gonna believe, you or my lyin' eyes?"

:: wandering off to find the one human I know who is on FB, so we can re-enact the process ::

incrediBILL




msg:4475940
 11:36 pm on Jul 15, 2012 (gmt 0)

I think your IPs are a bit sparse, Bill.


I didn't do a combined domain report, that was just the IPs hitting the site I was working on at the time I posted. I'm sure there's a lot more but they don't seem to hit that particular site for whatever reason.

dstiles




msg:4489509
 4:40 pm on Aug 29, 2012 (gmt 0)

I note a thread on similar lines denoting the 131.253.nn.nnn range being used for msn media bot. Today I got a swathe of bingbots from that range (131.253.46.0 - 131.253.47.255). It's an odd range because of the NetName in DNS and the incomplete lower-end IP (I would expect it to begin at .16.0) (see below) but the IPs used seem legit.

Full range: 131.253.21.0 - 131.253.47.255
CIDR: 131.253.32.0/20, 131.253.22.0/23, 131.253.24.0/21, 131.253.21.0/24
NetName: NTCSIS-NET
RegDate: 1989-02-13
Updated: 2011-06-22
OrgName: Microsoft Corp

Bot ranges, culled from MS's DNS, listed below. NOTE: Not all IPs in each range are bot rDNS but the ranges are sufficiently large that I ignored the discrepancies. The IPs were grep'd using the word msnbot - MS still do not seem to be using bingbot in DNS.

131.253.24.0 - 131.253.27.255
131.253.36.128 - 131.253.36.255
131.253.38.0 - 131.253.38.255
131.253.46.0 - 131.253.47.255
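If you prefer CIDR notation for firewall or .htaccess rules, first-last ranges like these can be converted with Python's standard `ipaddress` module. This is only a sketch: the ranges are the ones listed above, and the names `MSNBOT_RANGES` and `range_to_cidrs` are hypothetical.

```python
import ipaddress

# Bot ranges as posted above (first address, last address).
MSNBOT_RANGES = [
    ("131.253.24.0", "131.253.27.255"),
    ("131.253.36.128", "131.253.36.255"),
    ("131.253.38.0", "131.253.38.255"),
    ("131.253.46.0", "131.253.47.255"),
]


def range_to_cidrs(first, last):
    """Return the minimal list of CIDR blocks covering first..last inclusive."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(net) for net in networks]


for first, last in MSNBOT_RANGES:
    print(first, "-", last, "=>", ", ".join(range_to_cidrs(first, last)))
```

Running the same function on the full allocation (131.253.21.0 - 131.253.47.255) gives the four CIDR blocks shown in the whois output, which is a handy cross-check that the range and the CIDR list agree.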

keyplyr




msg:4489521
 6:23 pm on Aug 29, 2012 (gmt 0)



Ah, that was you. I remember someone giving an impassioned defense of FB. But earlier, someone else laid out the exact mechanism of how FB allows members to select an image for linking to a page.


That was me also. Read it again and you may realize that is still not hotlinking.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved