Forum Moderators: Ocean10000 & incrediBILL

Now seeing Bingbot

Testing the waters a few days early

3:12 am on Sep 29, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Spotted in the logs today.

207.46.195.227 Tue Sep 28 20:24:15 2010 "GET /widgets.html HTTP/1.1" 200 4321 "-" "Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)" Connection="Keep-Alive" Accept="*/*" Accept-Encoding="gzip, deflate"

Jim
3:39 pm on Sept 29, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


About 90 minutes ago:

msnbot-207-46-195-242.search.msn.com
Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)

robots.txt? NO

That was four minutes after:

msnbot-207-46-13-51.search.msn.com
msnbot/2.0b (+http://search.msn.com/msnbot.htm)

robots.txt? NO
6:46 pm on Sept 29, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:July 13, 2010
posts: 119
votes: 0


How do you keep track of what and when the bots are visiting? Which software?
6:56 pm on Sept 29, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 6, 2002
posts:1825
votes: 21


You don't need any software; the Linux grep command is your friend.
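
For anyone who would rather script it than grep by hand, here is a minimal sketch of the same idea in Python: count the log lines that mention each crawler token. The token list and the sample lines are assumptions for illustration (the user-agents are the ones quoted in this thread); this is not anyone's actual setup.

```python
from collections import Counter

# Hypothetical token list; adjust to the crawlers you care about.
BOT_TOKENS = ("bingbot", "msnbot")

def count_bot_hits(lines, tokens=BOT_TOKENS):
    """Count log lines containing each token (case-insensitive substring match)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for token in tokens:
            if token in lowered:
                counts[token] += 1
    return counts

# Illustrative lines only, built from user-agents quoted in this thread.
sample = [
    '207.46.195.227 "GET /widgets.html HTTP/1.1" 200 '
    '"Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)"',
    '207.46.13.51 "GET /widgets.html HTTP/1.1" 200 '
    '"msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
]
print(dict(count_bot_hits(sample)))
```

To run it against a real log, pass an open file object instead of the sample list; the function only needs something it can iterate over line by line.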
7:00 pm on Sept 29, 2010 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14622
votes: 88


I'm seeing bingbot today as well, but only 13 visits as opposed to msnbot's 3757 visits by noon.
7:11 pm on Sept 29, 2010 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3091
votes: 2


First hit around midnight GMT. Wasn't sure of the full UA so only trapped part of it. More difficult to spot in logs now: used to be easy when the bot name was at the start of the UA.
7:12 pm on Sept 29, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member

joined:Sept 29, 2010
posts: 1258
votes: 0


I'm seeing a few visits today from bingbot as well, yet MSNbot hit me over 2322 times. Still using both I assume...or maybe bingbot is still in beta?
7:34 pm on Sept 29, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Aside to Globetrotter: If you use Unix and can install/execute CGI scripts...

I tried paid-for, gazillion-stats scripts like Summary [summary.net...] and found they provided literally too much data. (Plus server-based updating was a pain.)

So instead I use little Perl scripts that incorporate the 'tail' command and quickly show the last 500 or so lines of my access, error, and mod_rewrite logfiles as web pages I then easily read in any browser. I also use another script that formats the same data by Host/IP, UA, files hit, referrer, etc.
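
The scripts described above are Perl; the following is only a rough Python sketch of the same "tail the log and show it as a web page" idea, not the poster's code. The function names and the default of 500 lines are assumptions taken from the description.

```python
import html
from collections import deque

def tail_lines(lines, n=500):
    """Return the last n lines of any iterable without keeping the rest in memory."""
    return list(deque(lines, maxlen=n))

def tail_file(path, n=500):
    """Read the last n lines of a log file (path is whatever your server uses)."""
    with open(path, errors="replace") as log:
        return [line.rstrip("\n") for line in tail_lines(log, n)]

def render_page(lines, title="Log tail"):
    """Wrap the lines in a bare HTML page, escaping them so log content cannot inject markup."""
    body = "\n".join(html.escape(line) for line in lines)
    return (
        f"<html><head><title>{html.escape(title)}</title></head>"
        f"<body><pre>{body}</pre></body></html>"
    )
```

Escaping matters here because user-agent and referrer fields are attacker-controlled text; printing them raw into a page you open in a browser is asking for trouble.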

The small scripts began with a nondescript error log 'tail' script which I then customized and modified to tail other logs and also match site layouts. The original log tail script I use is no longer available but here's an example of another free one: [perlscriptsjavascripts.com...]

The more complicated script is "Web Activity" by Matt Kruse. (I use an older version.) It's free and also customizable (by hand; do not mess around with scripts if you're new to Perl.) [mattkruse.com...] I depend on that script. I'd feel blind without it.

(Last but not least, I use Google Webmaster Tools, and Google Analytics.)

For additional ideas, check the other forums here. Lots of people will have lots of info about this, that and the other stats programs.
7:39 pm on Sept 29, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5408
votes: 2


How do you keep track of what and when the bots are visiting?


Raw logs are the best, however many find them cumbersome.
11:30 pm on Sept 29, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Bingbot/2.0 seems to be following msnbot/2.0b around, just warming up to take over the job on or about Friday (as scheduled). I don't see any bingbot/2.0 requests for robots.txt either, so I assume that it's basing its crawling on msnbot/2.0b's robots.txt fetches.

That makes sense, since bingbot/2.0 is probably identical or at least very similar to msnbot/2.0b, but without the "b" for beta. If it was a whole new crawler, they'd more likely have named it bingbot/1.0 -- there'd be no sense in carrying the "2.0" forward if it was new.

Jim
8:30 am on Sept 30, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 16, 2004
posts:1341
votes: 0



Bingbot, the Sequel
[bing.com...]
1:02 pm on Sept 30, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Dangbot, the Sequel:

- Despite years-long site verification via meta tag on a "default webpage," Bing's Webmaster Tools now says, "Site ownership has not been verified."

- Despite still using the exact same verification code in the exact same tag (re)given by Bing Tools, .search.msn.com only requests "BingSiteAuth.xml", not the default page.

- Despite performing routine rDNS lookups and okaying bare IPs to confirm .search.msn.com and then limit same to certain MSN UAs (msnbot, now bingbot, etc.; none of their misc. junk), the Bingbot verification UA is -- wait for it --

msnbot-[yada-yada].search.msn.com
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Win64; x64; Trident/4.0)

- Despite uploading the danged .xml file even though I have the danged meta tag and re-re-reparsing .htaccess for over an hour punching holes, I'll be danged if I still can't (RE)verify.

Moral of the story: Make sure you're (still) verified. Good luck.
8:38 pm on Sept 30, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


You'll need to get a new verification key I believe, regardless of whether you use the meta-tag or .xml file method.

In all fairness, this is not a bingbot thing, but rather a result of switching from MSN Webmaster Tools to Bing Webmaster Tools.

However, someone should point out to them that if they're going to go through all of this hassle --and put us through it as well-- then they should have named all of this stuff "MSbot" or "MicrosoftBot" and dispensed with the "cute branded name" for Webmaster-facing resources in favor of one that probably won't ever have to change again...

As it is now, we've got msnbot, bingbot, and Yahoo! Slurp all crawling for essentially one index, and no information on how long we'll have to support all these user-agents.

Jim
9:47 pm on Sept 30, 2010 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14622
votes: 88


They did a half-baked job of switching to Bingbot IMO, because full round-trip DNS verification still shows the same MSNBOT!

Example:

207.46.13.42 - "Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)"

Rev DNS of 207.46.195.106 is msnbot-207-46-195-106.search.msn.com.


How confusing is that?

Ill thought out, ill conceived, making changes for no other purpose than branding, messing with webmasters making them waste time on meaningless updates.

Then you still have them using MSNBOT as the name in reverse DNS so you're still checking for MSNBOT and BINGBOT together!
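
For reference, the round-trip (forward-confirmed reverse DNS) check being discussed can be sketched roughly as below. This is a sketch under assumptions, not anyone's production code: the function name is made up, and the `.search.msn.com` suffix is simply what the hostnames in this thread show. The resolver functions are parameters so the logic can be exercised without touching the network.

```python
import socket

def is_verified_msn_crawler(ip, suffix=".search.msn.com",
                            reverse=socket.gethostbyaddr,
                            forward=socket.gethostbyname_ex):
    """Round-trip DNS check: IP -> hostname -> IPs must include the original IP."""
    try:
        hostname = reverse(ip)[0]          # reverse lookup (PTR)
    except OSError:
        return False                       # bare IP: no reverse DNS at all
    # The hostname must END with the expected suffix; a substring match
    # would accept names like "x.search.msn.com.evil.example".
    if not hostname.lower().rstrip(".").endswith(suffix):
        return False
    try:
        addresses = forward(hostname)[2]   # forward lookup (A records)
    except OSError:
        return False
    return ip in addresses
```

Note that the check keys on the hostname suffix, not on "msnbot" or "bingbot" appearing in the name, so it keeps working whichever label ends up in the reverse DNS. It does, of course, reject any hit from an IP with no reverse DNS.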

What a big fat hairy mess for no particular reason and the only thing I'm thankful for is we didn't go down this same path with a Livebot in the middle of it all!

The only problem with jdMorgan's suggestion of MicrosoftBot, which I like, is that the entire internet search unit isn't poised to easily sell if the crawler identifies itself as either Microsoft or even MSN for that matter.

Just a thought ;)
10:35 pm on Sept 30, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Jim: Thus far, on at least two sites, the old msnbot and 'new' bingbot 32-alphanumeric keys/tags/codes are identical. And even though I repeatedly tell Tools I use the tag, bingbot (not msnbot) looks for "BingSiteAuth.xml".
1:11 am on Oct 1, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5408
votes: 2


Ill thought out


Same as when they began in 2003.

You'd think (at least normal people would) that MS would have learned from those previous mistakes?
Sheesh. . . .
7:26 pm on Oct 8, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


And now, to further illustrate their QA program, they've apparently deployed another bingbot user-agent:

Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)


Note the addition of a semicolon following "bingbot/2.0".
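
One way to avoid chasing punctuation changes like this is to match on the crawler token rather than on the full user-agent string. A minimal sketch in Python follows; the pattern and function name are assumptions for illustration, and a UA match alone proves nothing about who is really sending the request.

```python
import re

# Match the "name/version" token only, so whatever punctuation follows it
# (space, semicolon, closing parenthesis) does not matter.
CRAWLER_TOKEN = re.compile(r"\b(bingbot|msnbot)/(\d+\.\d+[a-z]?)", re.IGNORECASE)

def crawler_token(user_agent):
    """Return (name, version) for a bingbot/msnbot UA, or None if neither token is present."""
    match = CRAWLER_TOKEN.search(user_agent)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2)

# The two bingbot variants seen so far, plus the older msnbot string.
for ua in (
    "Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "msnbot/2.0b (+http://search.msn.com/msnbot.htm)",
):
    print(crawler_token(ua))
```

Both bingbot variants come back as the same (name, version) pair, so a rule written this way would not need updating for the semicolon.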

Jim
10:41 pm on Oct 8, 2010 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3091
votes: 2


Which is more correct than the semi-less one but we'll have to see which wins. :)
12:39 am on Oct 9, 2010 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9063
votes: 2


Has anyone seen bingbot fetch robots.txt yet? I'm still seeing bingbot simply following on from a hit by msnbot on the robots.txt.
3:15 am on Oct 9, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


That's what I see so far as well. The robots.txt fetcher is probably a different program, and its user-agent string probably hasn't been updated yet.

Jim
7:54 pm on Oct 10, 2010 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3091
votes: 2


Since the first appearance of the semi-colon there has only been a single instance without it here, and that was within the first few minutes. I'm switching to the semi UA.
1:19 pm on Oct 15, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 24, 2002
posts:894
votes: 0


Two sites, for the first time: one visit each from

Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

which asked for robots.txt and the home page.
6:34 pm on Oct 18, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Dear MSN coders: Even if your bots may be 'sharing' robots.txt, it's long past time to get your UA act together, and stop cloaking, too. [webmasterworld.com...] E.g., a mere three minutes apart:

msnbot-207-46-204-170.search.msn.com
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)
10/18 01:49:19
(URI: root)

msnbot-207-46-12-238.search.msn.com
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
10/18 01:46:43
(URI: robots.txt)
12:51 am on Oct 22, 2010 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14622
votes: 88


The bingbot user-agent is dominant now: I've had about 5K hits from it so far today and only 108 from the deprecated msnbot.
3:00 am on Oct 22, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Yes, and as Staffa pointed out on the 15th, I'm now seeing the bingbot UA fetching robots.txt. I'm looking forward to shortening/simplifying my robots.txt file (and the code that produces it), so I'm hoping they'll pull the plug on msnbot soon...

Jim
2:23 am on Nov 1, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Bingbot just hit from a bare (no rDNS) MSN IP...

MSN's many cloaked bots. Again. [webmasterworld.com...]
(link to #msg4224716 may be iffy)
3:00 am on Nov 1, 2010 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:5805
votes: 64


Bingbot just hit from a bare (no rDNS) MSN IP...


Thanks for the heads-up Pfui. Guess I was asleep at the wheel and blocked this one.

157.55.16.231 - - [30/Oct/2010:01:50:32 -0700] "GET www.example.com HTTP/1.1" 403 479 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
10:50 pm on Nov 1, 2010 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3091
votes: 2


Yep, got those too!

Hits in the general range 157.55.16.0 - 157.55.18.255 seem to be hitting with bingbot (total 13 IPs so far), all blocked.
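
If you want to test addresses against that range in a script, here is a small sketch using Python's standard `ipaddress` module. The range is only what was observed in these logs, not an official list of Bing addresses, and the function name is made up for the example.

```python
import ipaddress

# Observed range from the logs above (an observation, not an official allocation).
FIRST = ipaddress.IPv4Address("157.55.16.0")
LAST = ipaddress.IPv4Address("157.55.18.255")

# Collapse the inclusive range into the smallest set of CIDR blocks.
OBSERVED_NETWORKS = list(ipaddress.summarize_address_range(FIRST, LAST))

def in_observed_range(ip):
    """True if the address falls inside any of the observed blocks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in OBSERVED_NETWORKS)

print([str(network) for network in OBSERVED_NETWORKS])
# → ['157.55.16.0/23', '157.55.18.0/24']
```

The CIDR form is handy if your block or allow rules take network prefixes rather than start and end addresses.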
12:41 am on Nov 2, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


@keyplyr: Are you allowing Bing from bare MSN IPs, not just rDNS-confirmable .search.msn.com addresses?
6:06 am on Nov 2, 2010 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:5805
votes: 64


@Pfui I wasn't, but am now. Who knew?