|Many (but not all) Pages Dropped from Index|
Index Page Dropped from Index but still PR 5
| 4:01 am on Jun 7, 2004 (gmt 0)|
Many, but not all, of the roughly 1,000 pages in my Yahoo! Store have been dropped from the Google index entirely as of the last update a few days back.
For example, the index page, which is still a PR 5 and shows over 100 back links, will not rank for anything, including long quotes from the page. A search for the domain (minus the .com) does not rank our Web site anywhere, whereas it has ranked number 1 for four years.
This is causing us considerable harm and I can find no reason for it. The only idea I have is that it might be related to the fact that our site (as all Yahoo! Stores) has a store.yahoo & shop.store.yahoo URL in addition to our vanity domain. Google generally just filters out the Yahoo! domains, but some might suggest that the duplicate domains (although beyond my control) may cause the problem.
If anyone has any information I would greatly appreciate it.
At this point I am hoping that it is just a Google burp and that the pages will be back in the index within a couple of weeks.
| 6:02 pm on Jun 7, 2004 (gmt 0)|
Has anyone else had a domain experience a major but not complete drop out of the Google index, while maintaining PR and back links?
| 6:19 pm on Jun 7, 2004 (gmt 0)|
Yes, this has been happening to us for months now.
Most of the sites we manage are PR5, some PR6, but positioning for all is essentially nonexistent.
Backlinks are showing fine.
Most sites have lots of unique and useful content, yet for some reason Google doesn't want to actually position them any more.
Beyond confused as to what they want....
| 6:26 pm on Jun 7, 2004 (gmt 0)|
Your circumstances may be different. Our drop in Google referrals and Google rankings seems to be caused totally by Google dropping many of our pages from the index. These pages then will not rank for anything. So, in reality we are not dropping in rankings; the pages that once ranked are simply no longer in the index.
Let me know if your pages were dropped from the index completely, or if they were just thrown to the bottom of the rankings. Also, let me know if they have come back at all.
| 6:40 pm on Jun 7, 2004 (gmt 0)|
I have a client whose site is going through this, but there are a couple of things he's done that might be causing it. First, he had canonicalization problems - what I call the Dreaded Missing WWW's. Two versions of his main URL got into the index, once with the www. subdomain prefix and once without it. That's being alleviated with 301 redirects. The second is that they had begun a mirror site with a completely different URL. When I first checked a couple of weeks ago, I swear the mirror wasn't in the index, and I had the client take the site down immediately, but there are a couple of pages in there now with URL-only/partially-indexed entries as of this morning. I think the Googlebot might have found the mirror's URL by following my Toolbar activities because there are no links to the site out there that I can find.
My point is that I bet a lot of people are running into problems when Google finds what it thinks is duplicate content, even though the webmaster is not being overtly deceitful. If your host allows access to your site with or without the "www." it's a good idea to get a 301 redirect set up right away. The acid test is to check the "site:" command with both forms of your URL to see what Google has in the index. I've had a couple of clients get caught by this, and the 301's and Google's own reconciliation system do get you back in the ballgame in a few weeks. And if you've got any other URLs with similar content, you need to get those cleaned up, too, of course.
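To make the 301 idea above concrete, here is a minimal sketch of the decision a server would make for each request. It assumes the www form is the preferred one; `CANONICAL_HOST`, `canonical_redirect`, and the example.com names are hypothetical, not anything from a real host's setup. On Apache the same effect is normally achieved in the server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the "www." form is the one you want in the index.
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return (301, target_url) if the request should be redirected, else None.

    Only the bare and www variants of the same domain are considered;
    requests for unrelated hosts are left alone. Any port in the original
    URL is dropped in this sketch.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    variants = {bare, "www." + bare}
    if host in variants and host != canonical_host:
        target = urlunsplit(
            (parts.scheme, canonical_host, parts.path or "/", parts.query, parts.fragment)
        )
        return 301, target
    return None


if __name__ == "__main__":
    print(canonical_redirect("http://example.com/page.html?x=1"))
    print(canonical_redirect("http://www.example.com/page.html"))
```

The point of a permanent (301) redirect rather than a temporary one is that it tells the crawler which of the two forms is the real address, so the two versions can be reconciled into one.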
| 7:25 pm on Jun 7, 2004 (gmt 0)|
Duplicate content may be my problem too, but due to my host (Yahoo! Store) there is no solution. Every Yahoo! Store will have one's vanity domain, the store.yahoo domain and the shop.store.yahoo domain.
Further, though, I am getting different numbers for www and without www:
I try to keep all incoming links going to our vanity domain. In the past it seems that Google has detected the multiple domains and decided to let the one with the most PR/backlinks be in the index. The vanity domain has far more, but there are some external sites that link to the other domains.
I think the only thing I can do is try to get those linking to the other domains to link to the vanity domain.
| 8:16 pm on Jun 7, 2004 (gmt 0)|
If Google finds duplicate content, doesn't Google normally still include one of the pages?
| 12:36 am on Jun 8, 2004 (gmt 0)|
With regard to the canonicalization problem mentioned, if this were the case wouldn't the PR/backlinks drop and not increase?
We run an Apache server and I am wondering now if the configuration may have something to do with this.
| 1:14 pm on Jun 8, 2004 (gmt 0)|
I don't think Google is dropping sites for having duplicate content appearing at both example.com and www.example.com. If so, they would drop most of their index (although it seems like they have lately... doesn't it?). www is a server, not a subdomain. It is typical these days for both www.example.com and example.com to resolve to the same server for web requests. Unless you have dramatically different results when searching google using the site: qualifier, I can't imagine that this is the problem.
Personally, I think that the latest filter to catch a lot of sites has to do with crosslinking multiple sites at the same or similar IP addresses, or owned by the same owner. While this used to be OK, it isn't anymore. I'm not sure whether it is only reciprocal links between your own sites that are no longer allowed, or any links at all. While this filter certainly gets rid of many spammy networks of sites, it has also deep-sixed many legitimate sites from the Google index.
I could be wrong... but we are all just groping in the dark, looking for the light switch. They sure do make it a fun hobby, don't they?
| 1:26 pm on Jun 8, 2004 (gmt 0)|
|it might be related to the fact that our site (as all Yahoo! Stores) has a store.yahoo & shop.store.yahoo URL in addition to our vanity domain. Google generally just filters out the Yahoo! domains, but some might suggest that the duplicate domains (although beyond my control) may cause the problem. |
This wouldn't be a factor. I have three domains all pointing to the same web site. Many people do. It's having duplicate SITES that's the problem.
| 4:49 pm on Jun 8, 2004 (gmt 0)|
|Your circumstances may be different. Our drop in Google referrals and Google rankings seems to be caused totally by Google dropping many of our pages from the index. These pages then will not rank for anything. So, in reality we are not dropping in rankings, just the pages that once ranked are no longer in the index. |
I've had the same problem. Thousands of pages dropped from the index / visible Google cache. That happened about 1 month ago and coincided with the time we added an iframe (that was to hold HTML ads, so the SEs wouldn't have to download basically unchanged content every time we changed the ads).
I had excluded the HTML iframe holding the ads via robots.txt.
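For anyone wanting to double-check an exclusion like this, a rough sketch using Python's standard `urllib.robotparser` follows. The `/ads/` path, the example.com URLs, and the `is_allowed` helper are assumptions for illustration only; the actual robots.txt in question is not shown in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of the ad iframe directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ads/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The iframe source is blocked, the article pages are not.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/ads/frame.html"))
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/archive/story.html"))
```

This only confirms what the file says to a standards-following parser. It says nothing about how a search engine treats a page whose embedded frame it is not allowed to fetch.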
Googlebot continued spidering, but 1000s of pages were dropped from the "visible" index (including visible cache). Yet when Googlebot visited a "dropped" page again, it produced an HTTP 304, i.e. the page was stored somewhere at Google, just not shown in the public index.
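The reasoning behind the 304 observation: a server normally answers 304 Not Modified only when the client sends a conditional request (for example with an If-Modified-Since header), which suggests the client already holds a copy from an earlier fetch. A minimal sketch of that server-side decision, with a hypothetical `response_status` helper and made-up dates:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def response_status(last_modified, if_modified_since=None):
    """Return 304 if the page is unchanged since the client's copy, else 200.

    last_modified is a timezone-aware datetime for the page on disk;
    if_modified_since is the raw request header value, or None if absent.
    """
    if if_modified_since is None:
        return 200  # unconditional request: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it and send the full page
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


if __name__ == "__main__":
    page_changed = datetime(2004, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(response_status(page_changed, "Mon, 07 Jun 2004 04:01:00 GMT"))
    print(response_status(page_changed))
```

So a 304 in the logs is reasonable evidence that the crawler kept some record of the page, though it does not show where or how that copy is stored.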
Only the higher-ranking pages (>PR3) were not dropped.
Btw the site is large, with lots of unique content (it's a fulltext archive of a newspaper). No duplicate content, each page (excluding html tags) is 5-50KB, no affiliate / amazon / etc links, nothing.
It has hundreds of real backlinks, going back in time to 1995.
Googlebot kept spidering the site all this time, but the visible cache (ie of pages not dropped) kept the "old" (pre iframe) copy of the pages.
It seems to be coming back gradually. And Googlebot is spidering 1000s of pages per day.
My theory is that the robots-excluded iframe triggered some flag for manual/human review, which took about 1 week or so.
| 7:27 pm on Jun 8, 2004 (gmt 0)|
|This wouldn't be a factor. I have three domains all pointing to the same web site. Many people do. It's having duplicate SITES that's the problem. |
It should not be a problem, but a whitehat's Web pages are nonetheless gone, and this is the best theory for now.
If Google identifies pages by their URLs, then multiple domains for one site are no different from a perfectly duplicate site with a different domain name. In either case Google will try to make sure that the duplicate sites are not awarded duplicate rankings, so Google normally will block the additional domains. The theory, then, is that in this case they blocked all of the domains, for some reason.
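To illustrate why URL-keyed pages on several domains look like duplicates, here is a toy sketch that groups URLs by a hash of their content. This is not Google's algorithm, which is not public; `group_by_content` and every hostname below are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_by_content(pages):
    """Group URLs whose (whitespace- and case-normalized) bodies are identical.

    pages maps URL -> page text. Returns a list of sorted URL lists,
    one per group that contains more than one URL.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        normalized = " ".join(body.split()).lower()
        digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    # Hypothetical: one store page reachable under three hostnames.
    pages = {
        "http://www.example.com/item1.html": "Tie-dye shirt, $15",
        "http://store.example.net/shop-id/item1.html": "Tie-dye shirt, $15",
        "http://shop.store.example.net/shop-id/item1.html": "Tie-dye  shirt, $15",
        "http://www.example.com/item2.html": "Incense sampler, $8",
    }
    for group in group_by_content(pages):
        print(group)
```

A system working this way cannot tell three domains serving one site from three copied sites. It sees three URLs with one body and has to pick which, if any, to keep.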
Indeed, many people do have multiple domains for the same site. Mine may have caught some attention because some high PR sites started linking to store.yahoo domain. So, the PR of the store.yahoo domain was getting relatively close to as high as the vanity domain. Maybe this made it look to the filter like we were trying to get both domains to rank.
The main reason the duplicate content filter seems most plausible is that the update came with a change in the way Google views our different domains. Before the update each domain had its own PR and its own back links. After the update, Google has combined them all together. When a search is done for store.yahoo.com/hippygift-com/ (in order to check backlinks), it goes to the vanitydomain.com web page information. Now when I check for back links for a particular page, it does not matter which of the three domains was linked to; they all show. This was never the case before.
| 7:36 pm on Jun 8, 2004 (gmt 0)|
Google replied to an email I sent:
|Hi James, |
Thank you for your note. We searched for your index page and found that it
is currently included in our search results. To see the results of our
search, please visit the following link:
[CANNED RESPONSE PART DELETED]
The Google Team
I responded explaining that although the pages do come up on Google to show web page information, they do not show for any rankings whatsoever. I hope they will reply again.