
Google SEO News and Discussion Forum

How to remove (some) Supplemental Listings
sort of... maybe
steveb




msg:746148
 12:20 am on Oct 17, 2005 (gmt 0)

Google's ill-advised Supplemental index is polluting their search results in many ways, but the most obviously stupid one is in refusing to EVER forget a page that has been long deleted from a domain. There are other types of Supplementals in existence, but this post deals specifically with Supplemental listings for pages that have not existed for quite some time.

The current situation:
Google refuses to recognize a 301 of a Supplemental listing.
Google refuses to delete a Supplemental listing that is now a nonexistent 404 (not a custom 404 page, a literal nothing there) no matter if it is linked to from dozens of pages.
In both the above situations, even if Google crawls through links every day for six months, it will not remove the Supplemental listing or obey a 301.
Google refuses to obey its own URL removal tool for Supplementals. It only "hides" the supplementals for six months, and then returns them to the index.

As of the past couple of days, I have succeeded (using the tactics below) in getting some Supplementals removed from about 15% of the datacenters. On the other 85%, however, they have returned to being Supplemental.

Some folks have hundreds or thousands of this type of Supplemental, which would make this strategy nearly impossible, but if you have less than twenty or so...

1) Place a new, nearly blank page on old/supplemental URL.

2) Put no actual words on it that it could ever rank for in the future. Only put "PageHasMoved" text plus link text like "MySiteMap" or "GoToNewPage" pointing to appropriate pages on your site, for any human who stumbles onto this page.

3) If you have twenty supplementals, put links on all of them to all twenty of these new pages. In other words, interlink all the new pages so they all have quite a few links to them.

4) Create a new master "Removed" page which will serve as a permanent sitemap for your problem/supplemental URLs. Link to this page from your main page. (In a month or so you can get rid of the front page link, but continue to link to this Removed page from your site map or other pages, so Google will continually crawl it and be continually reminded that the Supplementals are gone.)

5) Also link from your main page (and others if you want) to some of the other Supplementals, so these new pages and the links on them get crawled daily (or as often as you get crawled).

6) If you are crawled daily, wait ten days.

7) After ten days the old Supplemental pages should show their new "PageHasMoved" caches. If you search for that text restricted to your domain, those pages will show in the results, BUT they will still ALSO continue to show for searches for the text on the ancient Supplemental caches.

8) Now put 301s on all the Supplemental URLs. Redirect them to either the page with the content that used to be on the Supplemental, or to some page you don't care about ranking, like an "About Us" page.

9) Link to some or all of the 301ed Supplementals from your main page, your Removed page and perhaps a few others. In other words, make very sure Google sees these new 301s every day.

10) Wait about ten more days, longer if you aren't crawled much. At that point the 15% datacenters should first show no cache for the 301ed pages, and then hours later the listings will be removed. The 85% datacenters will however simply revert to showing the old Supplemental caches and old Supplemental listings, as if nothing happened.

11) Acting on faith that the 15% datacenters will be what Google chooses in the long run, now use the URL removal tool to remove/hide the Supplementals from the 85% datacenters.

Will the above accomplish anything? Probably not. The 85% datacenters may just be reflecting the fact that Google will never under any circumstances allow a Supplemental to be permanently removed. However, the 15% do offer hope that Google might actually obey a 301 if brute-forced.

Then, from now on, whenever you remove a page be sure to 301 the old URL to another one, even if just to an "About Us" page. Then add the old URL to your "Removed" page where it will regularly be seen and crawled. An extra safe step could be to first make the old page a "PageHasMoved" page before you redirect it, so if it ever does come back as a Supplemental, at least it will come back with no searchable keywords on the page.
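
If you want to sanity-check that the 301s from steps 8 and 9 are actually being served, here is a minimal Python sketch. The URLs and script name are made up; substitute your own Supplemental URLs. It requests each old URL without following redirects and prints the status code and Location header:

# check_301s.py -- sanity check that old Supplemental URLs now return a 301 (all URLs below are hypothetical)
import urllib.request
import urllib.error

OLD_URLS = [
    "http://www.example.com/old-widget-page.html",
    "http://www.example.com/another-old-page.html",
]

class NoRedirect(urllib.request.HTTPRedirectHandler):
    # Returning None stops urllib from following the redirect, so we see the raw status code
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

opener = urllib.request.build_opener(NoRedirect)

for url in OLD_URLS:
    try:
        resp = opener.open(url)
        print(url, "->", resp.status, "(no redirect served)")
    except urllib.error.HTTPError as err:
        # With redirects suppressed, a 301/302 surfaces here as an HTTPError
        print(url, "->", err.code, "Location:", err.headers.get("Location"))

Run before step 8 it should show 200s for the "PageHasMoved" pages; after step 8 each URL should report a 301 with the Location you chose.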

Examples of 15% datacenter: 216.239.59.104 216.239.57.99 64.233.183.99
Examples of 85% datacenter: 216.239.39.104 64.233.161.99 64.233.161.105

 

joeduck




msg:746208
 11:06 pm on Nov 12, 2005 (gmt 0)

I've done a lot of URL removing in an effort to solve ranking problems and it has not seemed to have any effect.

However I suspect that it could matter in many cases and I wonder if part of Google's current problem of killing good pages stems from conflicts between the gigantic and unmanageable supplemental index and the regular index.

annej




msg:746209
 11:30 pm on Nov 25, 2005 (gmt 0)

I think I might have the secret of how to get newly deleted pages out of Google rather than have them go supplemental.

I've tried everything on one set of pages. I had redone them all on new pages in order to update my site. I could never get my old deleted pages out of supplemental and truly gone from Google, so I decided to move the new pages to another domain. I don't want to risk any penalty on my big domain.

When I deleted these pages I left the links to them (or to what used to be them) on my site. They seem to be completely gone from Google now. But the older ones remain. It seems that if you can get Google to see the deleted pages right away, before they are declared supplemental, you can get rid of them.

The older deleted pages are still a problem. I'm tired of having a link to each "now deleted URL" on my homepage. It looks tacky. Should I just give up? That homepage gets a new date in the serps every 2 or 3 days so I KNOW Google must have found them. <sigh>

steveb




msg:746210
 12:17 am on Nov 26, 2005 (gmt 0)

While the new ones may look gone, that won't work. They will come back as supplementals someday (unless Google changes in the coming few months).

The only thing to do now is put a 301 on the recently deleted pages. You don't have to 301 to the new location if you don't want to; just pick any page to 301 to. Then remove those links from your front page in two or three weeks. The newly deleted pages should not go supplemental now. Even better, make sure one or more links to the newly deleted/301ed URLs continue to exist (it doesn't need to be a front page link) so Google is forced to see the 301 regularly.
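
To make sure Google keeps being forced to see those 301s, you could periodically confirm that your "Removed"/sitemap page still links to every old URL. A rough Python sketch, assuming a hypothetical removed.html page and made-up paths:

# check_removed_links.py -- confirm the "Removed" page still links to each old URL (names below are hypothetical)
import urllib.request

REMOVED_PAGE = "http://www.example.com/removed.html"
OLD_PATHS = ["/old-widget-page.html", "/another-old-page.html"]

html = urllib.request.urlopen(REMOVED_PAGE).read().decode("utf-8", "replace")
for path in OLD_PATHS:
    if path in html:
        print(path, "-> still linked")
    else:
        print(path, "-> MISSING, Google may stop re-crawling the 301")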

annej




msg:746211
 3:29 am on Nov 26, 2005 (gmt 0)

It occurred to me they may not really be gone. I guess as long as my site doesn't get penalized for duplicate content it doesn't matter.

I just don't understand why Google hangs on to all these deleted pages. I even found one in supplemental that has been gone for 4 or 5 years.

reseller




msg:746212
 7:21 am on Nov 26, 2005 (gmt 0)

annej

>>I just don't understand why Google hangs on to all these deleted pages. I even found one in supplemental that has been gone for 4 or 5 years.<<

As far as supplementals are concerned, it seems that Google still neither forgives nor forgets ;-)

Let's hope the next update deals with the supplemental and canonical issues as well.

annej




msg:746213
 2:18 pm on Nov 26, 2005 (gmt 0)

Let's hope the next update deals with the supplemental and canonical issues as well.

From what GG and MC said early on, that was the plan for this update. Something must not have worked out, so it was back to the drawing board.

As long as they don't count the sups as duplicate content and penalize the site, it doesn't matter. Is there any evidence this has happened?

g1smd




msg:746214
 9:10 pm on Dec 25, 2005 (gmt 0)

Maybe some things are starting to happen... [webmasterworld.com...]

claus




msg:746215
 10:23 pm on Dec 25, 2005 (gmt 0)

Just saw this. Anybody tried

<meta name="robots" content="noindex,follow">

- on the gone pages?

I.e., for each supplemental URL, insert a blank HTML page with that code in the head section and get it spidered. Use "follow" if you have a link to the new page, and "nofollow" if you don't.
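
If you have more than a handful of these, a small script can stamp out the blank noindex pages. A minimal Python sketch (the paths and sitemap link are hypothetical; upload the output files to the old URLs):

# make_noindex_pages.py -- write a blank noindex,follow page for each old Supplemental path (paths are hypothetical)
import os

OLD_PATHS = ["old-widget-page.html", "widgets/another-old-page.html"]

TEMPLATE = """<html>
<head>
<meta name="robots" content="noindex,follow">
<title>PageHasMoved</title>
</head>
<body>PageHasMoved <a href="/sitemap.html">MySiteMap</a></body>
</html>
"""

for path in OLD_PATHS:
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)  # recreate subfolders if the old URL had them
    with open(path, "w") as f:
        f.write(TEMPLATE)
    print("wrote", path)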

annej




msg:746216
 10:36 pm on Dec 25, 2005 (gmt 0)

Anybody tried
<meta name="robots" content="noindex,follow">
- on the gone pages?

Oh yes, been there done that. Once the pages turn supplemental it doesn't do any good.

Not that you shouldn't try. It used to work; maybe it will again.

Now that I think about it, if you set it up that way and leave links to the 'gone' pages, maybe it would work. Just do it before they go supplemental.

I am hoping Google will do something to solve this problem. Makes more sense for them to do it from their end.

g1smd




msg:746217
 12:39 am on Dec 26, 2005 (gmt 0)

They are starting to solve it for a small number of pages on the test [64.233.179.104] DC, as of three days ago.

steveb




msg:746218
 3:17 am on Dec 26, 2005 (gmt 0)

I'd like to believe that, but it seems far more likely that some previously visible pages are just not showing. Since there have always been two different pools of supplementals, it isn't a big jump to now have three.

twebdonny




msg:746219
 11:57 am on Dec 26, 2005 (gmt 0)

Our new "plan". Banned Googlebot completely, will wait until site fades out, then reinstate pages slowly until
we can determine what is causing filters to trip/ penalties to be applied. Since we get zero feedback
from Google at all, this appears to be our only option
as we have tried all the "techniques" listed to no avail
over the last year. Hope we have better luck in 06.

g1smd




msg:746220
 12:37 pm on Dec 26, 2005 (gmt 0)

steveb: You might be right about yet another version of the supplemental index on the test DC, but for me I see one class of supplemental problem being fixed (after 2 years of waiting).

Pages that went supplemental and then had their content changed still continued to show up as supplemental results for keywords no longer on the page, and correctly showed as a normal result for any current content that was searched for. The snippet would reflect the search that was made, but in both cases the cache was always a bang-up-to-date copy from ~2 to ~10 days ago. For the supplemental result, words would show in the snippet that are no longer on the real page and no longer in the copy of the cache that Google was showing to the public. This appears to be fixed for some of the pages that I have tracked for the last 2 years or more.

For pages where the supplemental result represents a page that is now replaced with a 301 redirect to another version of the page, nothing has been fixed at all; neither is there any change for a URL that is really a long-term 404.

sobole




msg:746221
 4:13 pm on Dec 26, 2005 (gmt 0)

I've got a great idea. Why doesn't Google just say "hey, that's a 404! That page isn't there anymore! Why don't we delete it from our index?" That would work fine... Is this a multi-million-dollar company, and they can't do this?

g1smd




msg:746222
 4:27 pm on Dec 26, 2005 (gmt 0)

I think that if they allowed that, then any spammer that was found out could simply delete the page, wait a few days for Google to unindex the page, then put the page back up and start the cat and mouse again. Google wouldn't want to make it that easy for spammers to make them forget a rogue page or site.

Google attempted to combat certain types of spam, and the usual "domain-hopping when found out" scenarios by inserting some latency into the system, latency that was useful in detecting duplicate content when spammers abandon a domain and start again on a new one, for example. Maybe that is where the supplemental results come in, a record of the previous content of a site that can be used to penalise a spammer migrating it to a fresh domain?

Whilst a good idea on the surface, maybe Google naively assumed that non-spammers would usually have "perfect" sites (regarding URLs and redirects), but in reality it has been found that even normal sites have duplicate content across www and non-www, mixed links pointing to both www and non-www within a site, and multiple URLs that can reach the same content, along with 302 redirects that go to error pages instead of serving a 404 for "page not found".

It must be very difficult to write a universal algorithm to cover all eventualities, but it has taken Google a very long time to see that their database has a lot of "rogue" data within it. The test DC appears to have a fix for one problem, but I see no movement yet on several other classes of screw-up.
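
As a quick way to spot the www/non-www duplication mentioned above, you can check whether the bare hostname 301s to the www version (or vice versa). A rough Python sketch, using example.com as a stand-in domain:

# check_www_canonical.py -- does the non-www hostname redirect to the www version? (domain is a placeholder)
import urllib.request
import urllib.error

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # surface the raw status instead of silently following it

opener = urllib.request.build_opener(NoRedirect)

try:
    resp = opener.open("http://example.com/")
    print("non-www answered", resp.status, "- both hostnames serve content, a duplicate-content risk")
except urllib.error.HTTPError as err:
    if err.code in (301, 302):
        print("non-www redirects with", err.code, "to", err.headers.get("Location"))
    else:
        print("non-www returned", err.code)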

GeeWhizzler




msg:746223
 11:58 am on Jan 16, 2006 (gmt 0)

[google.com...]

"Please also be assured that the index in which a site is included doesn’t affect its PageRank."

macdave




msg:746224
 2:14 pm on Jan 16, 2006 (gmt 0)

Heh heh. Sometimes it's the words that aren't said that speak the loudest. By specifically mentioning only PR, Google is confirming what we already know: supplemental index pages are handicapped in ranking, crawl frequency, ability to redirect or remove from the index, etc.

And I wonder: could they also be handicapped when it comes to passing PR?
