Welcome to WebmasterWorld Guest from 188.8.131.52
Seems to me that Matt's recent message confirms my theory. We're either all a bunch of moaning idiots with low quality sites with a few innapropriate, spammy links scattered here and there...or...
The more I think about it the more convinced I am that the missing pages problem is being caused by a Backlink/PR issue (see Msg #15).
Tying together all of the evidence from my own experience, and that of others gleaned from the forums, erroneous or out-of-date backlinks would explain all of the missing pages.
The erroneous, or simply out-of-date, backlink information (which we cannot see) leads to insufficient PR (which we cannot see) and hence deep pages are not indexed.
We all know that a "link:www.mysite.com" does not show you the complete picture. But, since Big Daddy, it now shows just a tiny proportion of backlinks. Way less, than it used to show before Big Daddy. Why? Because either the backlink index hasn't been updated (and now dates back to mid 2005), or else because it has been updated, but the update process is buggy. Only a small handful of Google employees know which of these two possibilities is the case.
We know that the missing pages problem cannot be due to any kind of duplicate content filter, as some people are suggesting. If this were the case, then effected sites would see a proportion of their pages disapear. Some would lose 10%, some would lose 40%, and some would lose 95%. But that's not what we see. We see sites losing the vast majority of their pages or else losing no pages at all. The reason effected sites lose such high percentages of their pages is because of the hierarchical nature of a site. The number of pages increases with depth, and the artificially low PRs (based on innacurate and/or out-of-date backlink data) prevents the deeper content from being indexed.
The fact that Big Daddy was kick-started from an index dating back to the middle of last year, not only explains why the backlink data might be stale, but it also explains why ancient pages keep popping up on various data centres.
As further evidence: try a "link:www.mysite.com" and compare it to a search for "www.mysite.com". In my case, the "link:" search shows just 6 results, only one of which is external to my site. The one external backlink probably pre-dates when Big Daddy's index was seeded. The "www.mysite.com" search, on the other hand, finds hundreds of results representing hundreds of internal and external backlinks. Why aren't these showing up in the "link:" search? Is it because "link:" searches are well known for not showing you the complete picture? Or, has that well-known fact simply been obscuring the true cause of all of the problems? Namely, that the backlinks are simply missing from Google's backlink index.
[edited by: tedster at 8:25 pm (utc) on May 17, 2006]
I believe Google will fix all problems if any ,and the communication that Google has with webmasters is unique in the SE world.Do you really expect that MSN or Yahoo will put a Google Guy or Matt or Vanesa and lately Adam Lasnik to communicate with the webmaster community?
MSN and Yahoo don't need liaison officers dude, because they put staff aside to answer queries directly, and I for one have always received replies back offering at least hints as to what might be wrong or what needs to be done to sort out the problem. It would be a sad day if they ever put on liaison officers of the calibre of MC.
We now know that the site: operator is a bit buggy [webmasterworld.com], but that doesn't account for all the vanishing pages, because they are also dropping from the regular SERPs.
Because this thread is becoming more like a continually open chatroom rather than a focused topic discussion, I think it's time to lock it up.