To all the suggestions that Bing should "do more": I agree.
Since this is a topic on observations of search engine and crawler behavior: a while back I tracked what they were doing.
Google basically does back-end prioritization: scrape everything, index almost everything into tiers of databases with different access priorities, and mainly prioritize at search time what to actually show. That is, on the back end of the data stream.
Bing, with less infrastructure investment, does front-end prioritization. They "pick" which sites/pages to even load, based on ideas about the site overall and maybe guesses from neighboring pages that they have already seen. They index only a tiny fraction of the Internet compared to Google. It could literally take them many years to find everything, if they would even try.
For example, I have an old site with a lot of distinct, unique product-related pages. At one point, when I was tracking the engines' behavior, Google had 30,000+ of the site's pages indexed and still ran like crazy to pick up more. Bing, on the other hand, then had an enormous total of 38 (yes, thirty-eight) pages in its index. They had obviously decided not to be too interested in looking; they still visit (re-visit), but not in a systematic way to get new content.
I just checked both a minute ago, and Google estimates it now has 44,300 pages, while Bing has managed to get all the way up to exactly 41. :)
Across all sites, I see the same thing. Bing takes forever to get started (you add a site to their Webmaster Tools, and it can take weeks before the first visit). It then takes forever to crawl, and lifts only a fraction of a site, unless maybe there is a potential for revenue for them. You cannot use knowledge you do not have access to.
Which leads me to the one thing about prioritization methods (back or front) that will always be true:
If a search engine has never loaded, let alone indexed, a page, it by definition cannot show it to its users. It cannot find what it does not know about.
That tells you what it is worth to users who happen to search on topics Bing does not care about: they have no chance of finding what they are looking for.
So, if you search a front-end-prioritizing SE for mainstream topics, you are sure to find something (but you don't know what you are missing, because the SE never checked it out). If you search for more obscure topics, there is little chance that it will be able to help you. Again, the SE cannot search what it never lifted.
A back-end-prioritizing SE such as Google, having scraped every nook and cranny of the web, has the ability to search even all the strange, obscure topics, those that show up with not a single ad around them when you search. It even has the ability to change its topical prioritization on the fly, should something that was of no "interest" yesterday suddenly become important tomorrow, maybe because of some major global event, or someone old/famous dying and suddenly making every obscure fan page relevant.
But operating that way takes a LOT of infrastructure.
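To make the difference concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the `Page` record, the `site_score` guess, the `crawl_budget` number and the example URLs are my own assumptions, not anything Bing or Google have published. It only shows the principle that a page which was never loaded can never be returned, no matter how good the query-time ranking is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str
    site_score: float  # hypothetical up-front guess at how "worthwhile" the site is
    text: str


def front_end_index(pages, crawl_budget):
    """Front-end prioritization: decide BEFORE fetching which pages to load.

    Only the top `crawl_budget` pages by site_score are ever loaded and indexed.
    """
    chosen = sorted(pages, key=lambda p: p.site_score, reverse=True)[:crawl_budget]
    return {p.url: p.text for p in chosen}


def back_end_index(pages):
    """Back-end prioritization: load and index everything; choose at search time."""
    return {p.url: p.text for p in pages}


def search(index, term):
    """Return the URLs of indexed pages whose text contains the term."""
    return sorted(url for url, text in index.items() if term in text.lower())


# A three-page "web": two mainstream pages and one obscure one.
WEB = [
    Page("big-shop.example/tv", 0.9, "popular television deals"),
    Page("big-shop.example/phone", 0.8, "popular phone deals"),
    Page("old-site.example/widget-17", 0.1, "rare vintage widget part"),
]

front = front_end_index(WEB, crawl_budget=2)
back = back_end_index(WEB)

print(search(front, "popular"))  # mainstream topic: found
print(search(front, "vintage"))  # obscure topic: never loaded, so nothing
print(search(back, "vintage"))   # obscure topic: found
```

The price of the back-end approach is visible even here: `back_end_index` has to store every page, while `front_end_index` stores only what fits its budget.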