
Forum Moderators: Ocean10000 & incrediBILL

Now seeing Bingbot

Testing the waters a few days early

   
3:12 am on Sep 29, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Spotted in the logs today.

207.46.195.227 Tue Sep 28 20:24:15 2010 "GET /widgets.html HTTP/1.1" 200 4321 "-" "Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)" Connection="Keep-Alive" Accept="*/*" Accept-Encoding="gzip, deflate"

Jim
3:39 pm on Sep 29, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



About 90 minutes ago:

msnbot-207-46-195-242.search.msn.com
Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)

robots.txt? NO

That was four minutes after:

msnbot-207-46-13-51.search.msn.com
msnbot/2.0b (+http://search.msn.com/msnbot.htm)

robots.txt? NO
6:46 pm on Sep 29, 2010 (gmt 0)



How do you keep track of what and when the bots are visiting? Which software?
6:56 pm on Sep 29, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You don't need any software; the Linux grep command is your friend.
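For anyone who would rather script the same tally than chain grep commands, here is a minimal sketch in Python. It assumes the user-agent string appears somewhere in each log line, as in the log excerpts quoted in this thread; the function name `count_bot_hits` and the pattern are illustrative, not part of any standard tool.

```python
import re
from collections import Counter

# Hypothetical pattern: matches the crawler tokens discussed in this thread,
# e.g. "bingbot/2.0" or "msnbot/2.0b", anywhere in a log line.
BOT_PATTERN = re.compile(r"(bingbot|msnbot)/[\w.]+", re.IGNORECASE)


def count_bot_hits(lines):
    """Tally hits per crawler name across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = BOT_PATTERN.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts
```

To run it against a real log, pass an open file, for example `count_bot_hits(open("access.log", errors="replace"))`, where the log path is whatever your server uses.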
7:00 pm on Sep 29, 2010 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I'm seeing bingbot today as well, but only 13 visits as opposed to msnbot's 3757 visits by noon.
7:11 pm on Sep 29, 2010 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



First hit around midnight GMT. Wasn't sure of the full UA so only trapped part of it. More difficult to spot in logs now: used to be easy when the bot name was at the start of the UA.
7:12 pm on Sep 29, 2010 (gmt 0)

WebmasterWorld Senior Member



I'm seeing a few visits today from bingbot as well, yet MSNbot hit me over 2322 times. Still using both I assume...or maybe bingbot is still in beta?
7:34 pm on Sep 29, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Aside to Globetrotter: If you use Unix and can install/execute CGI scripts...

I tried paid-for, gazillion-stats scripts like Summary [summary.net...] and found they provided literally too much data. (Plus server-based updating was a pain.)

So instead I use little Perl scripts that incorporate the 'tail' command and quickly show the last 500 or so lines of my access, error, and mod_rewrite logfiles as web pages I then easily read in any browser. I also use another script that formats the same data by Host/IP, UA, files hit, referrer, etc.
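The "last 500 or so lines" idea is easy to sketch outside Perl too. The Python version below is only an illustration of the technique, not the scripts described above; `tail_lines` is a made-up name, and it takes any iterable of lines (such as an open log file).

```python
from collections import deque


def tail_lines(lines, count=500):
    """Return the last `count` lines of an iterable, like the Unix tail command."""
    # A bounded deque discards older lines as newer ones arrive,
    # so memory use stays proportional to `count`, not to the log size.
    return list(deque(lines, maxlen=count))
```

Wrapping the result in a small CGI or web page is left out here; the function only does the tailing.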

The small scripts began with a nondescript error log 'tail' script which I then customized and modified to tail other logs and also match site layouts. The original log tail script I use is no longer available but here's an example of another free one: [perlscriptsjavascripts.com...]

The more complicated script is "Web Activity" by Matt Kruse. (I use an older version.) It's free and also customizable (by hand; do not mess around with scripts if you're new to Perl.) [mattkruse.com...] I depend on that script. I'd feel blind without it.

(Last but not least, I use Google Webmaster Tools, and Google Analytics.)

For additional ideas, check the other forums here. Lots of people will have lots of info about this, that and the other stats programs.
7:39 pm on Sep 29, 2010 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



How do you keep track of what and when the bots are visiting?


Raw logs are the best, however many find them cumbersome.
11:30 pm on Sep 29, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Bingbot/2.0 seems to be following msnbot/2.0b around, just warming up to take over the job on or about Friday (as scheduled). I don't see any bingbot/2.0 requests for robots.txt either, so I assume that it's basing its crawling on msnbot/2.0b's robots.txt fetches.

That makes sense, since bingbot/2.0 is probably identical or at least very similar to msnbot/2.0b, but without the "b" for beta. If it was a whole new crawler, they'd more likely have named it bingbot/1.0 -- there'd be no sense in carrying the "2.0" forward if it was new.

Jim
8:30 am on Sep 30, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




Bingbot, the Sequel
[bing.com...]
1:02 pm on Sep 30, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Dangbot, the Sequel:

- Despite years-long site verification via meta tag on a "default webpage," Bing's Webmaster Tools now says, "Site ownership has not been verified."

- Despite still using the exact same verification code in the exact same tag (re)given by Bing Tools, .search.msn.com only requests "BingSiteAuth.xml", not the default page.

- Despite performing routine rDNS lookups and okaying bare IPs to confirm .search.msn.com and then limit same to certain MSN UAs (msnbot, now bingbot, etc.; none of their misc. junk), the Bingbot verification UA is -- wait for it --

msnbot-[yada-yada].search.msn.com
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Win64; x64; Trident/4.0)

- Despite uploading the danged .xml file even though I have the danged meta tag and re-re-reparsing .htaccess for over an hour punching holes, I'll be danged if I still can't (RE)verify.

Moral of the story: Make sure you're (still) verified. Good luck.
8:38 pm on Sep 30, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



You'll need to get a new verification key I believe, regardless of whether you use the meta-tag or .xml file method.

In all fairness, this is not a bingbot thing, but rather a result of switching from MSN Webmaster Tools to Bing Webmaster Tools.

However, someone should point out to them that if they're going to go through all of this hassle --and put us through it as well-- then they should have named all of this stuff "MSbot" or "MicrosoftBot" and dispensed with the "cute branded name" for Webmaster-facing resources in favor of one that probably won't ever have to change again...

As it is now, we've got msnbot, bingbot, and Yahoo! Slurp all crawling for essentially one index, and no information on how long we'll have to support all these user-agents.

Jim
9:47 pm on Sep 30, 2010 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



They did a half-baked job of switching to Bingbot IMO, because the full round-trip DNS verification still returns the same MSNBOT name!

Example:

207.46.13.42 - "Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)"

Rev DNS of 207.46.195.106 is msnbot-207-46-195-106.search.msn.com.


How confusing is that?

Ill thought out, ill conceived, making changes for no other purpose than branding, messing with webmasters making them waste time on meaningless updates.

Then you still have them using MSNBOT as the name in reverse DNS so you're still checking for MSNBOT and BINGBOT together!

What a big fat hairy mess for no particular reason and the only thing I'm thankful for is we didn't go down this same path with a Livebot in the middle of it all!

The only problem with jdMorgan's suggestion of MicrosoftBot, which I like, is that the entire internet search unit isn't poised to easily sell if the crawler identifies itself as either Microsoft or even MSN for that matter.

Just a thought ;)
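The full round-trip DNS check mentioned above can be sketched roughly as follows in Python. This is an illustration under assumptions: the trusted suffix `.search.msn.com` is taken from the hostnames quoted in this thread, the function name is made up, and the lookup functions are passed in as parameters only so the logic can be exercised without live DNS.

```python
import socket

# Assumption: genuine crawler hostnames end with this suffix, per the thread.
TRUSTED_SUFFIX = ".search.msn.com"


def is_verified_msn_crawler(ip,
                            reverse_lookup=socket.gethostbyaddr,
                            forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS: IP -> hostname -> IPs must include the IP."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False  # bare IP with no rDNS record
    # Check the end of the name so "x.search.msn.com.evil.example" is rejected.
    if not hostname.lower().rstrip(".").endswith(TRUSTED_SUFFIX):
        return False
    try:
        forward_ips = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in forward_ips
```

Note that this check knows nothing about the user-agent: it passes for msnbot and bingbot alike, which is exactly the overlap complained about above.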
10:35 pm on Sep 30, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Jim: Thus far, on at least two sites, the old msnbot and 'new' bingbot 32-alphanumeric keys/tags/codes are identical. And even though I repeatedly tell Tools I use the tag, bingbot (not msnbot) looks for "BingSiteAuth.xml".
1:11 am on Oct 1, 2010 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Ill thought out


Same as when they began in 2003.

You'd think (at least normal people would) that MS would have learned from those previous mistakes?
Sheesh. . . .
7:26 pm on Oct 8, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



And now, to further illustrate their QA program, they've apparently deployed another bingbot user-agent:

Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)


Note the addition of a semicolon following "bingbot/2.0".

Jim
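For anyone matching the full UA string, one way to accept both variants (with and without the semicolon) is an optional character in a regular expression. The Python sketch below is only an example; the pattern is built from the two UA strings quoted in this thread, and `bingbot_version` is a hypothetical helper name.

```python
import re

# Matches the bingbot UA with or without the semicolon after the version number.
BINGBOT_UA = re.compile(
    r"^Mozilla/5\.0 \(compatible; bingbot/(\d+\.\d+);? "
    r"\+http://www\.bing\.com/bingbot\.htm\)$"
)


def bingbot_version(user_agent):
    """Return the bingbot version string, or None if the UA is not a bingbot UA."""
    match = BINGBOT_UA.match(user_agent)
    return match.group(1) if match else None
```

A UA match alone proves nothing about who sent the request, so it is best combined with an IP or rDNS check.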
10:41 pm on Oct 8, 2010 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Which is more correct than the semi-less one but we'll have to see which wins. :)
12:39 am on Oct 9, 2010 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Has anyone seen bingbot fetch robots.txt yet? I'm still seeing bingbot simply following on from a hit by msnbot on the robots.txt.
3:15 am on Oct 9, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



That's what I see so far as well. The robots.txt fetcher is probably a different program, and its user-agent string probably hasn't been updated yet.

Jim
7:54 pm on Oct 10, 2010 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Since the first appearance of the semi-colon there has only been a single instance without it here, and that was within the first few minutes. I'm switching to the semi UA.
1:19 pm on Oct 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Two sites, first time one visit each with

Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

asked for robots.txt and home page
6:34 pm on Oct 18, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Dear MSN coders: Even if your bots may be 'sharing' robots.txt, it's long past time to get your UA act together, and stop cloaking, too. [webmasterworld.com...] E.g., a mere three minutes apart:

msnbot-207-46-204-170.search.msn.com
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)
10/18 01:49:19
(URI: root)

msnbot-207-46-12-238.search.msn.com
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
10/18 01:46:43
(URI: robots.txt)
12:51 am on Oct 22, 2010 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Bingbot user-agent crawling is more dominant now: I got about 5K hits from it so far today, and only 108 from the deprecated msnbot.
3:00 am on Oct 22, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yes, and as Staffa pointed out on the 15th, I'm now seeing the bingbot UA fetching robots.txt. I'm looking forward to shortening/simplifying my robots.txt file (and the code that produces it), so I'm hoping they'll pull the plug on msnbot soon...

Jim
2:23 am on Nov 1, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Bingbot just hit from a bare (no rDNS) MSN IP...

MSN's many cloaked bots. Again. [webmasterworld.com...]
(link to #msg4224716 may be iffy)
3:00 am on Nov 1, 2010 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Bingbot just hit from a bare (no rDNS) MSN IP...


Thanks for the heads-up Pfui. Guess I was asleep at the wheel and blocked this one.

157.55.16.231 - - [30/Oct/2010:01:50:32 -0700] "GET www.example.com HTTP/1.1" 403 479 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
10:50 pm on Nov 1, 2010 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Yep, got those too!

Hits in the general range 157.55.16.0 - 157.55.18.255 seem to be hitting with bingbot (total 13 IPs so far), all blocked.
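A range like that is easy to test for in code. The Python sketch below uses the standard `ipaddress` module; the start and end addresses are the ones reported above, and whether that range is complete or still in use is an open question. The names `in_reported_range` and `CIDR_BLOCKS` are illustrative.

```python
import ipaddress

# Range as reported in this thread; treat it as an observation, not an official list.
RANGE_START = ipaddress.IPv4Address("157.55.16.0")
RANGE_END = ipaddress.IPv4Address("157.55.18.255")


def in_reported_range(ip):
    """True if the dotted-quad string `ip` falls inside the reported range."""
    return RANGE_START <= ipaddress.IPv4Address(ip) <= RANGE_END


# The same range expressed as CIDR blocks, the form most allow/deny rules expect.
CIDR_BLOCKS = list(ipaddress.summarize_address_range(RANGE_START, RANGE_END))
```

Printing `CIDR_BLOCKS` gives the networks to paste into an allow or deny rule, depending on which way you decide to treat these hits.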
12:41 am on Nov 2, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



@keyplyr: Are you allowing Bing from bare MSN IPs, not just rDNS-confirmable .search.msn.com addresses?
6:06 am on Nov 2, 2010 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



@Pfui I wasn't, but am now. Who knew?
This 31 message thread spans 2 pages: 31
 
