Forum Moderators: mack
The MSN beta search contains approximately 40% of the pages from my site, whereas Google contains 80-90%.
I checked my main competitors, and the same ratio applies more or less. What about you?
Although the new MSN is doing a great job adding fresh content, they can't expect to break out if they don't catch up with Google on the number of pages indexed per site.
Presumably that will sort itself out in time.
Do you have highly dynamic content that may be hard for MSNbot to interpret (or session variables?)
D.
they can't expect to break out if they don't catch up with Google on the number of pages indexed per site.
That's a myth perpetuated by webmasters who spend far too much time doing site: searches. Matching the total number of pages indexed on a given site has nothing to do with whether or not MSN will produce an engine of equal quality.
The reality is that regardless of how big your site is, the majority of your search engine traffic comes from a relatively small percentage of your total pages. All MSN has to do is make sure the pages they do keep include the ones that actually show up in SERPS. All the rest are simply taking up space.
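The claim above is the familiar traffic-concentration pattern. A minimal sketch with hypothetical per-page visit counts (the numbers are made up for illustration, not from any real site) shows how to check what fraction of pages accounts for most of the traffic:

```python
# Hypothetical per-page search-traffic counts for a small site.
page_visits = sorted([1200, 800, 500, 90, 40, 20, 10, 5, 3, 2], reverse=True)

total = sum(page_visits)
running = 0
# Walk down from the busiest page until we cover 80% of all traffic.
for i, visits in enumerate(page_visits, start=1):
    running += visits
    if running / total >= 0.8:
        print(f"Top {i} of {len(page_visits)} pages ({i / len(page_visits):.0%}) "
              f"account for {running / total:.0%} of traffic")
        break
```

With these sample numbers, 3 of 10 pages carry roughly 94% of the traffic, which is the shape of distribution the post is describing.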
The reality is that regardless of how big your site is, the majority of your search engine traffic comes from a relatively small percentage of your total pages. All MSN has to do is make sure the pages they do keep include the ones that actually show up in SERPS. All the rest are simply taking up space.
I don't like this argument.
Although some of my pages don't rank highly and have low popularity, they nevertheless contain unique keywords that will certainly interest a minority of users, and those pages should not be thrown away.
This is precisely why I liked to use google: it would help me find THE precise keywords that I was looking for, hidden on a website lost in the dark.
This is what differentiates a search engine from a directory.
Indexing all the pages takes up more space? Well, that's not my problem; it's theirs. In my opinion, the best search engine is the one that provides the most complete index and the most relevant results.
So if only the highest-ranking pages should be included, then let's have MSN index only the 20% “important” pages of our sites and see how that competes with Google.
Brakkar
MSN has significant problems ranking deep pages from authority sites, while it favors front pages from useless network garbage. Search engines should be seeking to dig deeply into quality domains, and if they ignore anything, ignore the endless ten page template bits of nothing that just link to affiliate parents.
The whole issue should be addressed from a different perspective, which is MSN's great failing... quality content. If a search engine recognizes a site, not a page, as being a quality resource, it should dig deeply. If there are no signs of quality, it shouldn't make as much effort to index.
MSN crawls a lot but can't (yet) manage to index the first 1,000 pages of basically anything. Google, on the other hand, manages this routinely. At the same time, MSN has a LOT of content indexed that Google (or Yahoo) doesn't. Unfortunately, a lot of that is session-ID-type stuff.
MSNbot is pretty good, but this problem of indexing networks of piffle while missing half of 1,000-page authority sites is an unfortunate circumstance that MSN should work to correct.
MSN crawls a lot but can't (yet) manage to index the first 1,000 pages of basically anything.
For one of my sites, I have around 1,800 pages indexed in MSN beta, but it's still barely scratching at the surface (and is a tiny proportion of the pages crawled). Google has a little over 300,000 pages from the same site indexed, and is sending traffic to thousands of different pages on the site each day (because they contain unique search terms).
Another older site of mine has 2,600 pages in MSN beta, but over 60,000 in Google.
So I agree: the missing pages result in many missed search terms, and therefore a less useful index.
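Using the indexed-page counts quoted above (1,800 vs. 300,000 and 2,600 vs. 60,000), a quick sketch makes the coverage gap concrete:

```python
# Indexed-page counts for the two sites mentioned in the post.
sites = {
    "newer site": {"msn_beta": 1_800, "google": 300_000},
    "older site": {"msn_beta": 2_600, "google": 60_000},
}

# Express MSN beta's index size as a fraction of Google's for each site.
for name, counts in sites.items():
    ratio = counts["msn_beta"] / counts["google"]
    print(f"{name}: MSN beta holds {ratio:.1%} of Google's page count")
```

That works out to well under 5% coverage relative to Google for both sites, which is why so many long-tail search terms go missing.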
It seems MSN is going to spring a major surprise, with all crawled pages indexed, when they launch their new engine.
I would also like to note that MSN's crawlers have crawled all our sites completely over and over a million times ;)
Come on Bill Gates, hurry up...
That test, as used by many webmasters, goes: "less than n% of my visitors use browser X, so it is uneconomic to support it."
As applied to search engines, it would go: "less than n% of searches would find this page, so it is uneconomic to include it."
Makes perfect sense both ways, surely?