Forum Moderators: Robert Charlton & goodroi
If they don't do any harm I'm inclined to just leave them. There doesn't seem to be much we can do once they are marked supplemental.
I did find that if I leave links to pages I've just deleted, it seems to get rid of the pages for good. I hate to have to link to nonexistent pages, but it seems to be the only thing that works.
But once pages go supplemental it's too late.
They investigate.
When they discover it is a genuine 404 they HAVE to remove it because its use is not authorised by the copyright owner.
Assuming YOU are the copyright owner, and YOU don't want it in G's supplemental index.
G honour this in my experience.
Be sure the old URL does return a genuine 404 first.
Simple and safe eh?
If G do refuse, sue them and retire.
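One way to check that "genuine 404" point is just to look at the raw status code rather than trusting what the error page says. This is a minimal sketch using Python's standard library; the hostname and path in the usage comment are placeholders, not anything from this thread:

```python
# Minimal sketch: confirm a removed URL returns a real 404 status code,
# not a "soft 404" (a page that says "not found" but responds 200).
import http.client

def check_status(host, path, port=80):
    """Return the HTTP status code the server sends for `path`."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()

# Usage (placeholder URL):
# if check_status("www.example.com", "/deleted-page.html") == 404:
#     print("genuine 404 - safe to ask for removal")
```

A browser or a curl -I would tell you the same thing; the point is to check the status line, not the page body.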
Stu2's question has nothing to do with copyrights, DMCA complaints or the like. It has to do with URLs of his own site that he himself has removed via the removal tool.
Stu2, I wonder the same thing. I removed URLs recently that serve up a 404. The 404 is still in place and I intend to keep it that way. However, I wonder if I am going to have to go through the hassle of getting them removed again in 6 months, or if they will stay out of the index for good since they all lead to 404s. My guess is that at the end of 6 months I'm going to be dealing with the problem again, because as we all know, if you remove something from Google via the removal tool, even if you block access to it via robots.txt, serve a true 404 page, or block it via a robots meta tag, when the 6 months is up, Google will show it once again.
If not, it's worth trying to make a link to them while they return 410, and leaving this link up for many months.
Another way I'd try is to place a 301 from these URLs to the most related section of the site, put a link to them, and leave it for many months.
It appears to me that 301s sometimes really do work, contrary to opinions on WW that they don't. Indeed, it works extremely slowly, but still, after a long, long time, Google sometimes learns that the page is an outdated link and drops it from the index.
For example, one of my sites has three language versions, and for almost a year the main domain has displayed the English version (it used to cloak by ACCEPT_LANGUAGE and hostname, but I removed this so as not to risk violating Google's guidelines). There was a URL forcing the English version, 'mydomain.com/en/', and I put a 301 on it, leading to 'mydomain.com', to avoid duplicate content, but I still have links to /en/. There is no problem at all; Google never shows /en/ in the index as supplemental.
I have other sites where I can't get rid of supplementals and use the URL Console to solve it quickly, but of course they return after every six months. I think the solution is to leave a link from a high-PR page to the old URLs to ensure the 301 is crawled often. But you can't put many links on a high-PR page only to force Google to drop supplemental pages, because a bunch of outdated links is not necessarily the best thing to put on such a page.
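For what it's worth, the two responses discussed above (410 for pages gone for good, 301 to the most related section for pages that moved) can be sketched like this. It's a toy illustration using Python's built-in http.server with made-up URL maps; in practice you'd configure the same logic in Apache, IIS, or whatever serves the site:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

GONE = {"/old-article.html"}   # deleted for good -> 410 "Gone"
MOVED = {"/en/": "/"}          # moved -> 301 to the related section

class CleanupHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in GONE:
            self.send_response(410)    # stronger "it's gone" hint than 404
            self.end_headers()
        elif self.path in MOVED:
            self.send_response(301)    # permanent redirect
            self.send_header("Location", MOVED[self.path])
            self.end_headers()
        else:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")

    def log_message(self, *args):      # keep the console quiet
        pass

# To run it locally:
# HTTPServer(("", 8000), CleanupHandler).serve_forever()
```

The distinction matters: 404 says "not found right now", 410 says "gone on purpose", and 301 says "permanently moved here", which is the signal the posts above are trying to get Google to act on.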
If you put a new page up at that URL, then that new content will show as a normally indexed page, but for any words that were unique to the old version of the page, that page will continue to show up as a supplemental result forevermore. It does this even if the cached page is brand new. That is, the URL continues to show up as a supplemental result for old content that is no longer on the page, and no longer in the cache. This behaviour has been around for nearly two years.
Even if the URL has a 301 redirect to somewhere else, Google continues to rank the URL for the content that used to be on that page, and displays the URL as a supplemental result forever, too.
Looking at the test DC [64.233.179.104] they seem to have fixed both of those errant behaviours for some results, but by no means for all of them.
You can't.
I started this thread,
[webmasterworld.com...]
but the bottom line is you can't get rid of them. What you can do is various things to make it obvious that these pages no longer exist... 301, keep up links to the 301... and someday, before we all die, Google might actually begin to obey 301s properly.
I am waiting to see if there is some other factor involved, or whether Google simply has a new method, or new data, or has dumped the old data, or whether they have just "hidden" it. That is what they normally do: they rarely delete something completely, they just hide it from public view, and then it suddenly reappears again many months later, for no apparent reason.
Stu2's question has nothing to do with copyrights, DMCA complaints or the like. It has to do with URLs of his own site that he himself has removed via the removal tool.
With respect ledfish, you missed it :)
Read my post carefully... and think, then you will see that it IS the answer to this problem.
To get the resurfaced URL out of the G supplemental index, file a DMCA with them, citing G as the infringer.
It works, and is safe.
I started this thread,
[webmasterworld.com...]
but the bottom line is you can't get rid of them.
That's the one I was looking for. Thanks. It's only a hobby site, so it's not so important. Just untidy. Here's to hoping Google fixes this before hell freezes over :)
It's only a hobby site, so it's not so important. Just untidy.
It's unimportant as long as Google doesn't decide they are duplicate content. That was my problem. I did some reorganization and moved some articles to new URLs. But Google kept the old URLs as supplemental, and a search showed them listed for the same keywords as if they were still online. The result made it look like I had two copies of each article.
What I would like to know is whether this could cause a duplicate content penalty. Does anyone know, or have theories on this?
Waggle the wheels of your wagon to get out of the rut your minds are stuck in.
It's like watching a puppy chase its tail. :)
Funny, cute, but a terrible waste of energy.
angon, puppies are cute; I think Google makes the supplemental pages just for them. I just ignore supplementals, after of course putting in proper 301s, and haven't really ever seen that issue on any site I've done, including full site rewrites, all-new URLs, etc. Of course, creating those 301s without missing anything is very difficult. My guess is that at least 90% of attempts to do this fail to completely update each and every page, thus creating dupe pages in the process.
Since most, not all, IIS installations do not support rewriting natively, most, not all, IIS sites that undergo such site navigation/structure changes, will have significant supplemental issues, of course.
The 301s, if put in place correctly from day one of the change, seem to solve any issues in that area. While I believe everyone who sees and has these problems, I can't duplicate them myself, no matter how big or small the site is that gets rewritten.
Of course, that's ignoring a certain class of supplementals: URLs blocked in robots.txt but still added to the site's URL index. Those exist as supplementals but don't matter at all in terms of ranking. My personal theory on those is that Google just shows them to bug WebmasterWorld Google forum posters, since nobody else will really see them in most cases - most, not all...
The main trick, from what I've seen, is to have the rewrites constructed before the changes go up, then to put the update and the new rewrites up live at the same exact time, that seems to create a very smooth transition, no matter how big the change is.
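That pre-launch step ("not missing anything") amounts to a completeness check on the redirect map: before the new structure goes live, confirm every old URL has a 301 target. A minimal sketch, with invented URLs purely for illustration:

```python
# Sketch of a pre-launch completeness check: every old URL should have
# a 301 target in the redirect map before the new structure goes live.
# All URLs below are invented examples.

def missing_redirects(old_urls, redirect_map):
    """Return the old URLs that still have no 301 target."""
    return sorted(u for u in old_urls if u not in redirect_map)

old_urls = ["/a.html", "/b.html", "/c.html"]
redirect_map = {"/a.html": "/articles/a/", "/b.html": "/articles/b/"}

print(missing_redirects(old_urls, redirect_map))  # ['/c.html']
```

The old URL list could come from server logs or a crawl of the old site; an empty result means the map covers everything, which is the condition for the smooth switchover described above.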