Bing Search Engine News Forum

    
robots.txt mm-mm yum
lucy24

Msg#: 4407204 posted 7:37 am on Jan 16, 2012 (gmt 0)

Has anyone else noticed an absolutely morbid taste for robots.txt in the bingbot? Overall about 3/4 of their pickups are robots.txt. In the last few days I've found stretches where they get 20 or more copies of robots.txt for every one real page.

Looking back, they've always had a taste for robots.txt. Early last year-- I pulled some random chunks of logs to check-- it ran about half and half. But now it's getting to the point where, if the bingbot were a human, I'd ask if they were feeling all right and suggest that they should see a doctor. Of, ahem, one kind or the other.

I had to draw a quick graph to confirm one last hunch. Bing is crawling my site a lot more than they did a year ago. But they're not picking up more pages. That's remained pretty constant. What they're doing is making more and more requests for robots.txt.
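For anyone who wants to run the same check on their own logs, here is a minimal sketch of the robots.txt-versus-pages tally. It assumes the Apache/Nginx "combined" log format and identifies bingbot by a substring match on the user-agent field; the function name and the sample lines are made up for illustration and are not from the original post.

```python
import re
from collections import Counter

# Assumption: Apache/Nginx "combined" log format. Adjust the pattern for other layouts.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

BINGBOT_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"


def bingbot_request_counts(lines):
    """Count bingbot requests for /robots.txt versus everything else."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Skip unparseable lines and requests from other user agents.
        if not match or "bingbot" not in match.group("agent").lower():
            continue
        kind = "robots" if match.group("path") == "/robots.txt" else "pages"
        counts[kind] += 1
    return counts


# Hypothetical sample lines (documentation IP range), for illustration only.
SAMPLE = [
    f'192.0.2.1 - - [16/Jan/2012:07:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 312 "-" "{BINGBOT_UA}"',
    f'192.0.2.2 - - [16/Jan/2012:07:00:09 +0000] "GET /robots.txt HTTP/1.1" 200 312 "-" "{BINGBOT_UA}"',
    f'192.0.2.3 - - [16/Jan/2012:07:01:30 +0000] "GET /robots.txt HTTP/1.1" 200 312 "-" "{BINGBOT_UA}"',
    f'192.0.2.1 - - [16/Jan/2012:07:02:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "{BINGBOT_UA}"',
    '192.0.2.9 - - [16/Jan/2012:07:03:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "SomeBrowser/1.0"',
]

counts = bingbot_request_counts(SAMPLE)
print(counts["robots"], counts["pages"])
```

Running it over log chunks from different months gives the two series for the graph: page requests and robots.txt requests, counted separately.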

Dear bingbot. Is there anything we can do to help you?

 

tangor

Msg#: 4407204 posted 8:30 pm on Jan 16, 2012 (gmt 0)

I've been seeing the same thing in recent years... I put that down to the sheer number of distributed IPs they are using. It looks like the back end is failing to communicate with the front-line crawlers.

But, since my robots.txt is only a few hundred bytes (I only allow five in, all others out) it has not been an issue.
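A short allowlist-style robots.txt of the kind described here might look like the sketch below, checked with Python's standard `urllib.robotparser`. The post does not name the five allowed crawlers, so `bingbot` and `examplebot` are placeholders, and the `allowed` helper is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: an allowlist layout with named crawlers let in and everyone else kept out.
# An empty "Disallow:" means "nothing is disallowed" for that user agent.
ROBOTS_TXT = """\
User-agent: bingbot
Disallow:

User-agent: examplebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path="/some-page.html"):
    """Return True if the given user-agent token may fetch the path."""
    return parser.can_fetch(agent, path)


for agent in ("bingbot", "examplebot", "someotherbot"):
    print(agent, allowed(agent))
```

A crawler with its own `User-agent` block uses that block. Any crawler without one falls through to the `User-agent: *` rules, which here shut it out.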

bingdude



 
Msg#: 4407204 posted 7:34 pm on Jan 31, 2012 (gmt 0)

So the complaint is that we're hitting the robots.txt file too much?

Guide to indexing in Bing:
[onlinehelp.microsoft.com ]

Submitting discreet URLs to Bing:
[onlinehelp.microsoft.com ]

lucy24

Msg#: 4407204 posted 10:38 pm on Jan 31, 2012 (gmt 0)

Well, your current record is 102 (one hundred two) consecutive successful requests for robots.txt. This seems a bit over the top :) Is it bing's subtle way of telling me that they don't like the robots.txt and keep hoping it will change? Is bingbot offended because it's lumped under the generic "Hey you!" of "User-Agent: *" instead of getting its very own set of rules?

I hope you meant "discrete". The idea of a discreet URL does make an interesting mental picture, though.

Quick run through the logs confirms that it isn't having any trouble finding pages. Including :: cough, cough :: a fair number of explicit "index.html" that sneaked into various internal links. It's just the infatuation with robots.txt that makes it all look so skewed.
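A "consecutive requests" record like the one mentioned above can be measured with a short run-length check. This is only a sketch: it assumes you already have the bingbot request paths in time order, and the function name and sample list are made up for illustration.

```python
from itertools import groupby


def longest_robots_streak(paths):
    """Return the length of the longest run of consecutive '/robots.txt' requests."""
    longest = 0
    # groupby splits the sequence into runs of robots.txt and non-robots.txt requests.
    for is_robots, run in groupby(paths, key=lambda p: p == "/robots.txt"):
        if is_robots:
            longest = max(longest, sum(1 for _ in run))
    return longest


# Hypothetical request paths from one crawler, in time order.
paths = [
    "/", "/robots.txt", "/robots.txt", "/index.html",
    "/robots.txt", "/robots.txt", "/robots.txt", "/about.html",
]
print(longest_robots_streak(paths))
```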

bingdude



 
Msg#: 4407204 posted 12:15 am on Feb 1, 2012 (gmt 0)

First off, let's raise a glass here. Lucy posted the 17,000th post in this forum! :)

Now, this could also be an indication that we like your site. ;) Bingbot does follow the robots.txt directives, which means it does access the file when it visits the website. So... frequent visiting could see us accessing the file frequently... You want we should move along...? ;) (Kidding!)

I'd be concerned if the robots.txt file was the ONLY file we checked out on the site and if we stopped indexing pages. That would be an issue.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved