
Search Engine Spider and User Agent Identification Forum

    
the plainclothes bingbot
lucy24

Msg#: 4604030 posted 7:04 am on Aug 22, 2013 (gmt 0)

#1 Quick heads-up for anyone who has been identifying the plainclothes bingbot in something like this form:

RewriteCond %{REMOTE_ADDR} ^(65\.5[2-5]|131\.253\.[2-4]\d|157\.(5[4-9]|60)|207\.46)\.
RewriteCond %{HTTP_USER_AGENT} MSIE\ \d\.0;\ Windows\ NT

Redmond has finally (!) caved in and allowed their robot to use MSIE 10. So from here on it's

MSIE\ \d\d?\.0
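To see why the one-digit pattern stops matching once the version number reaches 10, here is a minimal Python sketch (not the Apache configuration itself). It assumes the rest of the condition, `;\ Windows\ NT`, stays as in the original RewriteCond. The helper name `matches` and the shortened MSIE 7 string are made up for illustration; the MSIE 10 string is the one quoted further down the thread.

```python
import re

# Patterns taken from the post; "\ " is an escaped space, as in the RewriteCond.
# Assumption: the ";\ Windows\ NT" tail is unchanged in the updated condition.
OLD_UA = re.compile(r"MSIE\ \d\.0;\ Windows\ NT")     # exactly one version digit
NEW_UA = re.compile(r"MSIE\ \d\d?\.0;\ Windows\ NT")  # one or two version digits


def matches(pattern, user_agent):
    """Unanchored search, like a RewriteCond pattern without ^ or $."""
    return pattern.search(user_agent) is not None


# Hypothetical shortened MSIE 7 string, plus the MSIE 10 string from the thread.
msie7 = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
msie10 = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"

for ua in (msie7, msie10):
    print("old:", matches(OLD_UA, ua), "new:", matches(NEW_UA, ua), "|", ua)
```

The old pattern fails on MSIE 10 because `\d` consumes the `1` and the following `\.` then meets `0` instead of a dot; the optional second `\d?` is what fixes that.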

#2 In the course of looking into this, I discovered that this mysterybot* seems to have some connection with social media. (Which one belongs to Microsoft? I forget :() This is by no means its only job, but each time I get a flurry of preliminary visits from places like Twitter and Facebook, the plainclothes bingbot is right there too.


* Every time someone thinks they've figured out what it does, it pulls some entirely new trick out of its hat, wiping out the latest hypothesis.

 

wilderness

Msg#: 4604030 posted 12:52 pm on Aug 23, 2013 (gmt 0)

Hey lucy,
How goes it?

Is that the complete UA?

RewriteCond %{HTTP_USER_AGENT} MSIE\ \d\.0;\ Windows\ NT

I've had an "ends with NT" and a couple of other "ends with" in place for a long while.

lucy24

Msg#: 4604030 posted 1:37 pm on Aug 23, 2013 (gmt 0)

No, it's just enough to put in a RewriteRule: "Comes from a Microsoft IP but appears to be a regular browser". No anchors.

:: shuffling papers ::

This latest one was, in full,

Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)

(MSIE 10 is the first one to admit to Mozilla 5, isn't it?)

:: further shuffling ::

A few more random UAs (the _ means double space, a characteristic of the group):

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; _ SLCC1; _ .NET CLR 1.1.4322; _ .NET CLR 2.0.50727; _ .NET CLR 3.0.04506.648)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; _ SV1; _ .NET CLR 1.1.4322; _ .NET CLR 2.0.40607; _ .NET CLR 3.0.30729; _ .NET CLR 3.5.30729; _ MS-RTC LM 8)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; _ SLCC1; _ .NET CLR 1.1.4325; _ .NET CLR 2.0.50727; _ .NET CLR 3.0.30729; _ .NET CLR 3.5.30729; _ InfoPath.2)

et cetera. They're pretty humanoid. There's also an MSIE 8, but it only seems to request BingSiteAuth.xml, so it's exempt from RewriteRules. I don't think I've met an MSIE 9.
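Putting the two conditions together, a rough Python sketch of the same test might look like this. The regexes are the ones from the first post (with the two-digit MSIE update). The function name and the sample IP addresses are hypothetical, chosen only to fall inside or outside the quoted prefixes, and the `_` placeholders in the UA are left exactly as quoted in the thread.

```python
import re

# IP prefix pattern and UA pattern from the first post (UA with the MSIE 10 update).
IP_RE = re.compile(r"^(65\.5[2-5]|131\.253\.[2-4]\d|157\.(5[4-9]|60)|207\.46)\.")
UA_RE = re.compile(r"MSIE\ \d\d?\.0;\ Windows\ NT")


def looks_like_plainclothes_bingbot(remote_addr, user_agent):
    """True when the IP falls in the quoted ranges AND the UA looks like plain MSIE."""
    return bool(IP_RE.search(remote_addr)) and bool(UA_RE.search(user_agent))


# UA strings quoted in the thread ("_" is the poster's placeholder, kept as-is).
UA_MSIE10 = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"
UA_MSIE7 = ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; _ SLCC1; "
            "_ .NET CLR 1.1.4322; _ .NET CLR 2.0.50727; _ .NET CLR 3.0.04506.648)")
# The publicly documented, self-identifying bingbot UA, for contrast.
UA_BINGBOT = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

# Hypothetical addresses: the first two sit inside the quoted prefixes, the last does not.
samples = [
    ("157.55.39.1", UA_MSIE10),
    ("65.52.0.1", UA_MSIE7),
    ("157.55.39.1", UA_BINGBOT),
    ("54.166.105.24", UA_MSIE10),
]

for ip, ua in samples:
    print(looks_like_plainclothes_bingbot(ip, ua), ip, ua[:45])
```

Both conditions must hold, mirroring the implicit AND between consecutive RewriteCond lines; a self-identified bingbot or an MSIE visitor from outside these ranges is left alone.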

wilderness

Msg#: 4604030 posted 1:59 pm on Aug 23, 2013 (gmt 0)

Seems to me that I've a Rewrite in place which prevents this kind of crap (Plain Janes) from MS bot ranges.
