Forum Moderators: Robert Charlton & goodroi


Old URLs are back as supplemental - 6 months after using the removal tool


stu2

2:20 am on Dec 25, 2005 (gmt 0)

10+ Year Member



As expected, they're back as supplemental results. I can't now find the thread on how to get rid of them permanently. IIRC, I need to recreate the pages and get them back into the main index by linking to them from my other pages. Will inclusion in my sitemap do this, since every page has a link to my sitemap? Then, when they are back in the main index, what do I do? 301 them? To where, any valid page? 404 them? 410 them? Can all these new pages I create be the same, something like "hello world", or do they have to have real content? These pages currently return a 410, but it hasn't helped AFAIK.
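Before deciding between 301, 404, and 410, it helps to confirm what the old URLs actually return. A minimal sketch, using a throwaway local server in place of the real site (the paths and the server here are placeholders for illustration, not anyone's actual setup):

```python
# Sketch: verify the status code an old URL actually returns.
# A throwaway local server stands in for the real site; the
# paths below are placeholders.
import http.server
import threading
import urllib.error
import urllib.request

class OldUrlHandler(http.server.BaseHTTPRequestHandler):
    GONE = {"/old-page.html"}  # removed for good -> 410 "Gone"

    def do_GET(self):
        # 410 signals permanent removal; 404 may be temporary.
        self.send_response(410 if self.path in self.GONE else 404)
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), OldUrlHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

def status_of(url):
    try:
        return urllib.request.urlopen(url).status
    except urllib.error.HTTPError as e:
        return e.code

gone_status = status_of(base + "/old-page.html")
missing_status = status_of(base + "/never-existed.html")
print(gone_status, missing_status)  # 410 404
server.shutdown()
```

Against a live site, the same `status_of` check works on the real URLs directly, which is the quickest way to confirm "these pages currently return a 410" before drawing any conclusions.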

annej

6:47 am on Dec 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd like to know if these old URLs can hurt us. For example, are they ever seen by Google as duplicate content?

If they don't do any harm, I'm inclined to just leave them. There doesn't seem to be much we can do once they are marked supplemental.

I did find that if I leave links to pages I've just deleted, it seems to get rid of the pages for good. I hate to have to link to nonexistent pages, but it seems to be the only thing that works.

But once pages go supplemental, it's too late.

Angonasec

10:26 am on Dec 25, 2005 (gmt 0)



A simpler way that certainly works is to send G a DMCA complaint.

They investigate.

When they discover it is a genuine 404 they HAVE to remove it because its use is not authorised by the copyright owner.

Assuming YOU are the copyright owner, and YOU don't want it in G's supplemental index.

G honour this in my experience.

Be sure the old url does return a genuine 404 first.

Simple and safe eh?

If G do refuse, sue them and retire.

Ledfish

2:43 pm on Dec 25, 2005 (gmt 0)

10+ Year Member



Angonasec

Stu2's question has nothing to do with copyrights, DMCA complaints or the like. It has to do with URLs of his own site that he himself has removed via the removal tool.

Stu2, I wonder the same thing. I recently removed URLs that serve up a 404. The 404 is still in place and I intend to keep it that way, but I wonder if I am going to have to go through the hassle of getting them removed again in 6 months, or if they will stay out of the index for good since they all lead to 404s. My guess is that at the end of 6 months I'm going to be dealing with the problem again, because as we all know, if you remove something from Google via the removal tool, even if you block access to it via robots.txt, serve a true 404 page, or block via a robots meta tag, when the 6 months is up, Google will show it once again.
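For reference, the two blocking methods mentioned above look like this (the directory name is a placeholder); as noted, neither keeps a URL-Console removal from expiring:

```
# robots.txt -- blocks crawling of the removed section
User-agent: *
Disallow: /old-section/
```

or, per page, a robots meta tag in the document head:

```html
<meta name="robots" content="noindex">
```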

Wizard

7:31 pm on Dec 25, 2005 (gmt 0)

10+ Year Member



I have experienced the same problem and tried the URL Console again, but only after making the old URLs return 410 instead of 404, and I wonder if this will make any difference.

If not, it's worth trying to link to them while they return 410 and leaving those links up for many months.

Another way I'd try is to place a 301 from these URLs to the most related section of the site, put a link to them, and leave it for many months.

It appears to me that a 301 sometimes really works, contrary to opinions on WW that it doesn't. Indeed, it works extremely slowly, but still, after a long, long time, Google sometimes learns that the page is an outdated link and drops it from the index.

For example, one of my sites has three language versions, and for almost a year the main domain has displayed the English version (it used to cloak by ACCEPT_LANGUAGE and hostname, but I removed this so as not to risk violating Google's guidelines). There was a URL forcing the English version, 'mydomain.com/en/', and I put a 301 on it, leading to 'mydomain.com', to avoid duplicate content, but I still have links to /en/. There is no problem at all; Google never shows /en/ in the index as supplemental.

I have other sites where I can't get rid of supplementals, and I use the URL Console to solve it quickly, but of course they return after every six months. I think the solution is to leave a link from a high-PR page to the old URLs to ensure the 301 is crawled often. But you can't put many links on a high-PR page only to force Google to drop supplemental pages, because a bunch of outdated links is not necessarily the best thing to put on such a page.
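The 301-to-the-most-related-section idea above can be sketched with a toy server; the old and new paths here are made up for illustration, and a real site would serve these redirects from its web server configuration instead:

```python
# Sketch: 301 retired URLs to the most closely related live section.
# The mapping is hypothetical; real redirects belong in the web
# server's own configuration.
import http.client
import http.server
import threading

REDIRECTS = {
    "/en/": "/",                       # old forced-English URL -> main page
    "/old-article.html": "/articles/",
}

class RedirectHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in REDIRECTS:
            self.send_response(301)    # permanent move
            self.send_header("Location", REDIRECTS[self.path])
        else:
            self.send_response(200)
        self.end_headers()

    def log_message(self, *args):      # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# http.client does not follow redirects, so the raw reply is visible.
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/en/")
resp = conn.getresponse()
code, location = resp.status, resp.getheader("Location")
print(code, location)  # 301 /
conn.close()
server.shutdown()
```

Checking the raw response without following the redirect is the useful part: it confirms the server really answers 301 with the intended Location, which is what a crawler sees.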

g1smd

8:22 pm on Dec 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Once a page is supplemental it will continue to rank forever for the old search terms for words that used to be on that page.

If you put a new page up at that URL, then the new content will show as a normally indexed page, but for any words that were unique to the old version of the page, that URL will continue to show up as a supplemental result forevermore. It does this even if the cached page is brand new. That is, the URL continues to show up as a supplemental result for old content that is no longer on the page and no longer in the cache. This behaviour has been around for nearly two years.

Even if the URL has a 301 redirect to somewhere else, Google continues to rank the URL for the content that used to be on that page, and displays the URL as a supplemental result forever too.

Looking at the test DC [64.233.179.104] they seem to have fixed both of those errant behaviours for some results, but by no means for all of them.

g1smd

8:28 pm on Dec 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



All stuff removed using the URL Console gets returned to the index after 90 or 180 days, even if the page no longer exists, or redirects.

There is no mechanism at Google to check whether the URL should return to the index or be dropped: it is simply automatically re-added every time.

steveb

8:52 pm on Dec 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"how to get rid of them permanently"

You can't.

I started this thread,
[webmasterworld.com...]
but the bottom line is you can't get rid of them. What you can do are the various things that make it obvious that these pages no longer exist... 301 them, keep up links to the 301s... and someday before we all die, Google might actually begin to obey 301s properly.

g1smd

9:08 pm on Dec 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I see that they might have fixed this in the last few days [on the 64.233.179.104 test DC only] for a small number of the test URLs that I have been monitoring for the past two years.

I am waiting to see if there is some other factor involved, or whether Google simply has a new method, or new data, or has dumped the old data. Or perhaps they have just "hidden" it, as that is what they normally do: they rarely delete something completely, they just hide it from public view, and then it suddenly reappears many months later, for no apparent reason.

Angonasec

11:23 pm on Dec 25, 2005 (gmt 0)



Stu2's question has nothing to do with Copyrights, DMCA complaints or the like. It has to do with URLS of his own site that he himself has removed via the removal tool

With respect ledfish, you missed it :)

Read my post carefully... and think, then you will see that it IS the answer to this problem.

To get the resurfaced URL out of the G supplemental index, file a DMCA with them, citing G as the infringer.

It works, and is safe.

stu2

11:53 am on Dec 26, 2005 (gmt 0)

10+ Year Member



I started this thread,
[webmasterworld.com...]
but the bottom line is you can't get rid of them.

That's the one I was looking for. Thanks. It's only a hobby site, so it's not so important. Just untidy. Here's to hoping Google fixes this before hell freezes over :)

annej

3:09 pm on Dec 26, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's only a hobby site, so it's not so important. Just untidy.

It's unimportant as long as Google doesn't decide they are duplicate content. That was my problem. I did some reorganization and moved some articles to new URLs, but Google kept the old URLs as supplemental, and a search showed them listed for the same keywords as if they were still online. The result made it look like I had two copies of each article.

What I would like to know is whether this could cause a duplicate content penalty. Does anyone know, or have theories on this?

Angonasec

1:49 am on Dec 27, 2005 (gmt 0)



The answer to this problem has already been given in posts 3 and 10.

Waggle the wheels of your wagon to get out of the rut your minds are stuck in.

It's like watching a puppy chase its tail. :)

Funny, cute, but a terrible waste of energy.

2by4

2:01 am on Dec 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Funny, cute, but a terrible waste of energy."

angon, puppies are cute; I think Google makes the supplemental pages just for them. I just ignore supplementals, after of course putting in proper 301s, and haven't really ever seen that issue on any site I've done, including full site rewrites, all new URLs, etc. Of course, creating those 301s without missing anything is very difficult; my guess is that at least 90% of attempts fail to completely update each and every page, thus creating dupe pages in the process.

Since most, not all, IIS installations do not support rewriting natively, most, not all, IIS sites that undergo such site navigation/structure changes will have significant supplemental issues, of course.

The 301s, if put in place correctly from day one of the change, seem to solve any issues in that area. While I believe everyone who sees and has these problems, I can't duplicate them myself, no matter how big or small the site is that gets rewritten.

Of course, that's ignoring a certain class of supplementals: URLs blocked in robots.txt but still added to the site URL index. Those exist as supplementals but don't matter at all in terms of ranking; my personal theory on those is that Google just shows them to bug WebmasterWorld Google forum posters, since nobody else will really see them in most cases - most, not all...

The main trick, from what I've seen, is to have the rewrites constructed before the changes go up, then to put the update and the new rewrites live at the exact same time; that seems to create a very smooth transition, no matter how big the change is.
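The "have the rewrites constructed before the changes go up" step can be sanity-checked mechanically before launch; the URLs below are invented for the sketch:

```python
# Sketch: before launching a restructure, verify that every old URL
# has a 301 target, so none are missed and left to go supplemental.
# All paths here are made-up examples; a real list might come from
# the old sitemap or server logs.
old_urls = ["/widgets.html", "/about.html", "/contact.html"]

redirect_map = {                      # old path -> new path
    "/widgets.html": "/products/widgets/",
    "/about.html": "/about/",
}

missing = [u for u in old_urls if u not in redirect_map]
print("No 301 rule for:", missing)  # No 301 rule for: ['/contact.html']
```

Running a check like this against the full old URL list is a cheap way to catch the "90% of attempts miss something" failure mode before the new structure goes live.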

stu2

10:20 am on Dec 27, 2005 (gmt 0)

10+ Year Member



And here's the link where Google explains how to make a DMCA complaint against them: www.google.com/dmca.html