

Search Engine Spider and User Agent Identification Forum

    
Bing Bot hiccup
wilderness




msg:4690173
 5:51 pm on Jul 23, 2014 (gmt 0)

126 requests for robots.txt in just under four minutes.
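A burst like this can be measured straight from the access log. The following is a minimal sketch, assuming the server writes Apache/nginx "combined" log format and that the crawler identifies itself with "bingbot" in its user-agent string; the function names `robots_hits` and `max_hits_in_window` are made up for illustration and do not come from the thread.

```python
import re
from datetime import datetime, timedelta

# Combined Log Format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def robots_hits(lines, agent_substring="bingbot"):
    """Return sorted timestamps of /robots.txt requests from a matching user agent."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        if m.group("path") != "/robots.txt":
            continue
        if agent_substring.lower() not in m.group("agent").lower():
            continue
        hits.append(datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z"))
    return sorted(hits)


def max_hits_in_window(hits, window=timedelta(minutes=4)):
    """Largest number of hits that fall inside any sliding window of the given length."""
    best = 0
    start = 0
    for end in range(len(hits)):
        # Advance the left edge until the window spans at most `window`.
        while hits[end] - hits[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best
```

Feeding it the lines of a log file, for example `max_hits_in_window(robots_hits(open("access.log")))`, gives the peak number of robots.txt requests in any four-minute span. The user-agent match is only a substring test, so it does not verify that the requests really came from Bing.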

 

jmccormac




msg:4690873
 8:28 pm on Jul 26, 2014 (gmt 0)

Good job you don't run a large website with sitemaps, or the muppets at Bing would be trying to download the entire set of sitemap files each day.

Regards...jmcc

wilderness




msg:4690874
 8:37 pm on Jul 26, 2014 (gmt 0)

jmc,
I do have a custom sitemap; however, Google is the only one that will use it without a fee.

Bing does crawl the entire site fairly often; however, it has failed to pick up page editions in a specific section that have been in place for more than a year.

jmccormac




msg:4690875
 8:45 pm on Jul 26, 2014 (gmt 0)

The sitemaps for the site here are about 3.5GB. Bing's idiocy in not understanding the basic protocol means that they try to download non-updated files each day. Google seems to be a lot better and does pay more attention to the protocol. When it comes to some areas of search engine work, Bing seems to be run more by dilettantes than professionals. They also seem to use a number of IPs for hitting the sitemaps that tend to operate largely independently. From what I remember, the robots.txt requests from Bing tend to appear in bursts so it could be a number of these IPs hitting for the robots.txt file in succession.

Regards...jmcc
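The post does not say which part of the protocol is being ignored. One plausible reading is the optional `<lastmod>` element that the sitemaps.org protocol allows on each entry of a sitemap index. Under that assumption, a well-behaved crawler could skip unchanged files roughly as sketched below; the function names and the idea of a stored `last_fetch` timestamp are illustrative, not a description of how Bing or Google actually work.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(text):
    """Parse a W3C datetime; date-only or naive values are treated as UTC."""
    dt = datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


def sitemaps_to_fetch(index_xml, last_fetch):
    """Return sitemap URLs from a sitemap index that changed after last_fetch.

    Entries without <lastmod> are always returned, because nothing
    indicates whether they changed.
    """
    root = ET.fromstring(index_xml)
    urls = []
    for entry in root.findall("sm:sitemap", NS):
        loc = entry.findtext("sm:loc", namespaces=NS)
        if loc is None:
            continue  # malformed entry, nothing to fetch
        lastmod = entry.findtext("sm:lastmod", namespaces=NS)
        if lastmod is None or parse_lastmod(lastmod) > last_fetch:
            urls.append(loc.strip())
    return urls
```

With sitemaps totalling about 3.5GB, a filter like this would limit daily downloads to the files whose `<lastmod>` moved since the previous crawl. An HTTP conditional request (`If-Modified-Since`) would be another way to get a similar saving.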

keyplyr




msg:4690911
 10:43 pm on Jul 26, 2014 (gmt 0)

On one site I manage, with 250 static HTML pages (including a sitemap.xml), bingbot and msnbot each request every single page four times per day.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved