Many (but not all) Pages Dropped from Index


rainborick - 6:40 pm on Jun 7, 2004 (gmt 0)


I have a client whose site is going through this, but there are a couple of things he's done that might be causing it. First, he had canonicalization problems - what I call the Dreaded Missing WWW's. Two versions of his main URL got into the index, one with the "www." subdomain prefix and one without it. That's being alleviated with 301 redirects. The second is that they had begun a mirror site with a completely different URL. When I first checked a couple of weeks ago, I swear the mirror wasn't in the index, and I had the client take the site down immediately, but there are a couple of pages in there now with URL-only/partially-indexed entries as of this morning. I think the Googlebot might have found the mirror's URL by following my Toolbar activities, because there are no links to the site out there that I can find.

My point is that I bet a lot of people are running into problems when Google finds what it thinks is duplicate content, even though the webmaster is not being overtly deceitful. If your host allows access to your site with or without the "www.", it's a good idea to get a 301 redirect set up right away. The acid test is to check the "site:" command with both forms of your URL to see what Google has in the index. I've had a couple of clients get caught by this, and the 301s and Google's own reconciliation system do get you back in the ballgame in a few weeks. And if you've got any other URLs with similar content, you need to get those cleaned up, too, of course.
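
For anyone who wants to see the redirect logic spelled out, here's a rough sketch in Python of what the 301 rule amounts to. This is only an illustration, not what my client's server actually runs: the host name "www.example.com" is a placeholder, and the function names canonical_url and redirect_for are made up for this post. In practice you'd normally do this in your web server's own configuration.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, preferred_host="www.example.com"):
    """Return the url rewritten onto the preferred host.

    Assumes plain URLs with no port or credentials in the host part.
    "www.example.com" is a placeholder, not a real site from this thread.
    """
    parts = urlsplit(url)
    # Work out the bare domain so both "example.com" and "www.example.com" match.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    host = (parts.hostname or "").lower()
    if host in (bare, "www." + bare):
        parts = parts._replace(netloc=preferred_host)
    # URLs on unrelated hosts (a mirror, say) are returned untouched.
    return urlunsplit(parts)


def redirect_for(url, preferred_host="www.example.com"):
    """Return (301, target) if the url should be redirected, else None."""
    target = canonical_url(url, preferred_host)
    return None if target == url else (301, target)


if __name__ == "__main__":
    for u in ("http://example.com/page.html", "http://www.example.com/page.html"):
        print(u, "->", redirect_for(u))
```

Note that the sketch only handles the www / non-www pair. A mirror on a completely different URL is left alone by this code and still has to be taken down or redirected separately.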


Thread source: http://www.webmasterworld.com/google_archive/24287.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com