If they are truly not duplicates and you think they are useful to your readers then you should not let Google influence your site decisions. Keep them! Just my opinion :)
birdman's advice is good advice.
if you know for sure they are not duplicates, leave them
MC hinted towards another "data push" later this month, perhaps you should wait for that.
if that doesn't sort it, do all the usual checks: IBLs (inbound links), number of URL parameters (if dynamic), checking for dup content online, etc.
IMHO leave them, let google sort its own problems out ;o)
Had the same question.
Noticed some of the pages were old redirects for renamed pages or outdated, and annoyingly ones I didn't want in the index any more.
Just seemed as though saint G was trying to tell me something!
So I've just deleted some, updated others, and added the deleted URLs to the robots.txt disallow list.
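For anyone doing the same, here is a rough sketch of how that disallow list could be generated from a list of deleted paths. The function name and the example paths are made up for illustration, and it assumes one group of rules that applies to every crawler.

```python
def build_robots_txt(deleted_paths):
    """Build a robots.txt body that disallows each deleted path for all crawlers."""
    lines = ["User-agent: *"]
    # Sort and de-duplicate so the file is stable and easy to review by eye.
    lines += [f"Disallow: {path}" for path in sorted(set(deleted_paths))]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical paths, not taken from any real site.
    print(build_robots_txt(["/old/renamed-page.html", "/old/outdated.html"]))
```

Note that a Disallow line only asks crawlers not to fetch the URL; as others in this thread report, it does not guarantee the old entry drops out of the index.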
On a related tangent: it would be nice if there was some sort of clue, G.
And erm ... if it isn't linked from the site navigation then obviously I no longer want that page to be available to customers. But then I suppose I should build sites for search engines instead.
Question: If I were to find a duplicate page that is also in the supplemental index, should I delete it and throw a 404 page?
Why wait to see if you have duplicate content that is indexed? Make sure that you don't have any duplicate content at all.
|I'm now considering deleting all of the supplemental pages that show for a site: command. |
Not only should you not delete supplementals (needless to stress, IF they're valuable), but IMO you shouldn't delete good pages that dropped out of the cache for some reason either.
I lost half the site some time ago (I think it was in April) - I mean completely off the index, not even supplemental.
The pages that were off were both "good and bad". However, there were some "bad" that remained in the index all the way through. In a post a couple months ago I was mourning about my excellent pages being lost.
I did a clean-up and completely deleted unnecessary stuff - I must have deleted around 80 pages - some of them were PR4 pages, btw, all others PR3 and most of them (ironically) were in the main index!
I DIDN'T DELETE any of the pages that contained valuable, new content and that were off the index at that time.
May I add that during the "process" some pages re-appeared as supplemental.
Now, ALL my excellent pages are back in the main index and most deleted are no longer cached. Not one supplemental and only 2 in the "omitted" results. Meaning, I presently show 611 pages in the main index. The pages that were off the index during that time, even the very good ones lost their PR and are now showing PR0 (from PR3), but they seem to have started re-appearing in the serps regardless of the low PR.
The home page is still not doing well on my two main keywords but seems to be progressing - I hope that the re-inclusion of the 300 pages will affect future rankings. I also think that lost PR will re-appear in the next update provided the pages remain cached :)
May I also add that some pages located 3-4 clicks away from the home page that had disappeared previously are back into the main index too.
I am rather optimistic.
I've been suffering the supp blues for about 4 months now. Just this morning I see some major light at the end of the tunnel. I have a 2000 page site, recently google has been indexing 1200 with about 250 not supplemental. As of this morning the google index is now reporting 550, but there are NO supps. I'm hoping this is a good sign... and traffic is up significantly.
IMHO Google is too unstable right now to take such a drastic action.
if there were duplicates in the supplemental index, did you leave them?
I got tired of waiting for Google to get their act together and disallowed just Google in robots.txt to clean up their mess.
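A Google-only disallow can be sanity-checked before uploading it. This is only a sketch using Python's standard `urllib.robotparser`; the robots.txt body, the example URL and the `is_allowed` helper are assumptions for illustration, not the poster's actual file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: block Googlebot everywhere, leave other crawlers alone.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/page.html"  # hypothetical URL
    for agent in ("Googlebot", "Slurp"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

This only confirms what the file asks crawlers to do; it says nothing about how quickly, or whether, already-listed supplemental results get cleaned up.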
It has been a week and the messy supplementals are still there.
Google is broken. I say broken because if many of my sites have this issue, I assume others do as well. Therefore to me it means that Google has far fewer pages cached correctly in their index. I am using Yahoo lately until Google fixes this major issue.
There seem to be a few different takes on this. My problem may be that my content has been duplicated elsewhere and is sitting on 20 or even 100 scraper sites that are also supplemental... or whatever. How can you tell?
It's about 1200 html pages, so if I am to delete them, what is the best way? Do I set up a standard error page with a redirect of some sort?
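One way to get a rough answer to "how can you tell" is to compare your page text against a suspected copy. The sketch below uses word shingles and Jaccard overlap, which is a common near-duplicate measure; it is not a claim about how Google detects duplicates, and the function names and shingle size are arbitrary choices for illustration.

```python
def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in the text."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=5):
    """Jaccard overlap of the two shingle sets: 0.0 (unrelated) to 1.0 (identical)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    mine = "our hand made widgets are built to last a lifetime"
    theirs = "our hand made widgets are built to last a lifetime"
    print(similarity(mine, theirs))
```

You would still need to find the candidate copies first, for example by searching for a distinctive sentence from the page in quotes.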
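On the mechanics of that question, a minimal sketch of the decision a server could make per request: pages that moved get a 301 to the new location, deleted pages get a real 404 status, and everything else is served as normal. The lookup tables and paths are hypothetical. The usual advice, as far as I know, is that the error page itself should carry the 404 status rather than redirect to a page that answers 200.

```python
from http import HTTPStatus

# Hypothetical lookup tables; a real site would build these from its own records.
MOVED = {"/old-widgets.html": "/widgets.html"}
DELETED = {"/duplicate-page.html"}


def response_for(path):
    """Return (status, location) for moved or deleted paths, or None to serve normally."""
    if path in MOVED:
        # Permanent redirect so visitors and crawlers land on the new URL.
        return HTTPStatus.MOVED_PERMANENTLY, MOVED[path]
    if path in DELETED:
        # A real 404 status, with no redirect target.
        return HTTPStatus.NOT_FOUND, None
    return None


if __name__ == "__main__":
    for p in ("/old-widgets.html", "/duplicate-page.html", "/index.html"):
        print(p, response_for(p))
```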
Don't bother - I have a bunch of pages I moved to their own site on June 1. It's only been 6 weeks, but the majority of the pages are supplemental on the old site. They rank on the new one, but the 301 hasn't created a deletion from the index for the pages on the old site.
I assume this would mean that you deleting your pages would not change their supplemental status :(
Google seems to think reporting the past status of pages to Joe Surfer is important - I can't understand why.
If you are moving to a new domain, or simply just redirecting to www on an existing domain, then just make sure that all of the wanted URLs are getting properly indexed.
The fact that old URLs, the ones that are now redirected, still show up as Supplemental Results is irrelevant - that is what Google does. They show the old URLs for a year or two until they eventually realise they are worthless.
As long as someone hitting one of those old results gets redirected to the correct URL for where the content really is, the average surfer will never realise that those URLs are old ones.
[edited by: g1smd at 12:20 am (utc) on July 17, 2006]
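The redirect rule described above can be sketched as a small function: any request that arrives on a host other than the canonical one gets sent to the same path on the canonical host. The host name and function name here are placeholders for illustration; in practice this would normally live in the web server's configuration and be sent as a 301.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # placeholder for the host you want indexed


def redirect_target(url):
    """Return the canonical URL to redirect to, or None if the URL is already canonical."""
    parts = urlsplit(url)
    if parts.netloc.lower() == CANONICAL_HOST:
        return None
    # Keep path, query and fragment so each old URL maps to its own new URL.
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(redirect_target("http://example.com/page.html?id=3"))
    print(redirect_target("http://www.example.com/page.html"))
```

Mapping each old URL to its matching new URL, rather than sending everything to the home page, is what keeps a visitor who clicks a stale result on the content they were looking for.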
Regardless of anything else, deleting supplemental pages does nothing in terms of Google. They will still be listed as Supplemental indefinitely... and any new pages with that content will be duplicates of the deleted supplemental. (Sheesh, even just typing that makes me wonder what the hell they are doing down at the plex.)
Well, if I work on the assumption that the content has been copied and as such is no longer useful to me in terms of search because Google is handing me a penalty, then deleting it seems to be the right course of action. If I consider the content to be for the users only, then it stays, but in effect my users have dropped off due to the search issues...