
Google SEO News and Discussion Forum

    
Remove Unwanted Index Entries (with 301 re-direct)
Just How Darn Long Does It Take?
jbgilbert




msg:806746
 5:12 pm on Aug 4, 2005 (gmt 0)

To make a long story short and not get too detailed...
About 60 days ago, the hosting company for one of my clients made some mistakes, screwed up some re-direction, etc., leaving my client with two domains (A & B), each having the exact same content (only about 50 pages).

Well, Google and others crawled and caught many of the dups before we even knew it!

Since then (30 days ago) a permanent redirect was put in the .htaccess file of A to re-direct each page from A to the same page on B.

BUT,
- Google still shows all the A and B pages (although the B pages are indexed with title and description as they should be).

- Yahoo's index corrected itself fastest (14 days) for the internal pages, but it still shows the home page of A in its index.

- MSN's index corrected itself quickly also (20 days) for the internal pages, but it also still shows the home page of A in its index.

QUESTIONS:
1) How long will it take for Google to honor the permanent re-direct and remove the site A pages from its index?

2) Why do Yahoo & MSN remove all site A internal pages, but continue to leave (index) the site A index page?

Just so you have it, here is the .htaccess entry on site A:
Options +FollowSymLinks
RewriteEngine on
Redirect permanent / [site_B.com...]

 

g1smd




msg:806747
 10:19 pm on Aug 4, 2005 (gmt 0)

You have only redirected requests for the root index page on Site A to the root index page of Site B, not any of the sub pages. So the answer as to how long it will take Google to delist site A is... "never". Well, never from anything that you have done, but one or the other, A or B, will be delisted as duplicate content at some time (and it will probably be B, as I expect that A has more PageRank).

When you use the correct rewrite rule to redirect a whole site, it usually takes several months for everything to work right. The first effects are usually seen within days.

jbgilbert




msg:806748
 11:21 pm on Aug 4, 2005 (gmt 0)

g1smd,
(1)
I think you are mistaken about the re-direct (shown below) only redirecting the index page of A to the index page of B.

For example, I can tell you that this page
[site_A.com...] is indexed in Google and if you click on it you WILL get directed to
[site_B.com...]

(2) Thanks for the several months estimate. I'll try to be patient.

(3) Now, if, say, the re-direct is only redirecting the index page of A to the index page of B, why is it that Yahoo & MSN have already removed all pages BUT the index page?

Just so you have it, here is the .htaccess entry on site A:
Options +FollowSymLinks
RewriteEngine on
Redirect permanent / [site_B.com...]

g1smd




msg:806749
 12:41 am on Aug 5, 2005 (gmt 0)

Surely you need this on the old site:

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^theoldsite.com [NC]
RewriteRule ^(.*)$ http://www.thenewsite.com/$1 [L,R=301]
RewriteCond %{HTTP_HOST} ^www.theoldsite.com [NC]
RewriteRule ^(.*)$ http://www.thenewsite.com/$1 [L,R=301]

and this on the new site:

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^thenewsite.com [NC]
RewriteRule ^(.*)$ http://www.thenewsite.com/$1 [L,R=301]

jbgilbert




msg:806750
 1:04 am on Aug 5, 2005 (gmt 0)

g1smd... man I don't know.

I learned the method I used for the permanent re-direct somewhere in this forum and thought I had it right.

Now you have me confused and wondering. I'm not sure what to do. Thanks for trying to help though.

Is there a book out that covers this stuff? I find stuff on the forum and the web, but the solutions are always VERY different...

jdMorgan




msg:806751
 2:01 am on Aug 5, 2005 (gmt 0)

This code is a mixture of directives from two different modules (Redirect from mod_alias and RewriteEngine on from mod_rewrite), plus the Options directive from Apache core.

Options +FollowSymLinks
RewriteEngine on
Redirect permanent / http://www.site_B.com/

As such, it probably isn't doing what you expect, but it may be doing what you want. The single line

Redirect permanent / http://www.site_B.com/

is entirely sufficient, however, since Redirect is handled by mod_alias and does not use mod_rewrite's rewrite engine. Redirect uses prefix-matching, so in the form you have employed (with a URL-path of /) it will redirect every page on site A to the same path on site B.
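
To picture the prefix matching, here is a rough Python model of what Redirect does with an incoming URL-path. It is only a sketch under simplifying assumptions, not Apache's actual code: the function name redirect_prefix and the path /widgets/page.html are made up for illustration.

```python
from typing import Optional


def redirect_prefix(path: str, prefix: str, target: str) -> Optional[str]:
    """Simplified model of 'Redirect permanent <prefix> <target>'.

    Returns the Location the client would be sent to, or None when the
    directive does not apply to this path.
    """
    if not path.startswith(prefix):
        return None
    remainder = path[len(prefix):]
    # The matched prefix has to end on a path-segment boundary, so a
    # prefix of /foo applies to /foo/bar but not to /foobar.
    if remainder and not prefix.endswith("/") and not remainder.startswith("/"):
        return None
    # Whatever follows the matched prefix is appended to the target.
    return target + remainder


# Hypothetical requests against site A:
print(redirect_prefix("/", "/", "http://www.site_B.com/"))
# http://www.site_B.com/
print(redirect_prefix("/widgets/page.html", "/", "http://www.site_B.com/"))
# http://www.site_B.com/widgets/page.html
```

With a URL-path of /, every request matches, which is why the internal pages are redirected along with the home page.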

If this is working for you, and if you have no trouble accessing site B, then we can assume that the two domains are hosted on separate (possibly virtual) servers or that one domain is hosted in a subdirectory of the other domain. If that's not the case --that is, if the two domains resolve to the same directory on the same server-- then a modified (trimmed-down) version of g1smd's code would be necessary:

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?theoldsite\.com [NC]
RewriteRule (.*) http://www.thenewsite.com/$1 [R=301,L]
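
For anyone wondering how those two lines fit together, the following Python sketch mimics the logic: the RewriteCond tests the requested hostname, and the RewriteRule captures the path and re-uses it on the new domain. It is a simplified model, not mod_rewrite itself. The function name rewrite is hypothetical, and it assumes the rule lives in the .htaccess of the document root (where the leading slash is stripped before matching) and that the query string is passed through unchanged, which is the default when the substitution has no query string of its own.

```python
import re
from typing import Optional

# RewriteCond %{HTTP_HOST} ^(www\.)?theoldsite\.com [NC]
HOST_RE = re.compile(r"^(www\.)?theoldsite\.com", re.IGNORECASE)
# RewriteRule (.*) http://www.thenewsite.com/$1 [R=301,L]
PATH_RE = re.compile(r"(.*)")


def rewrite(host: str, path: str, query: str = "") -> Optional[str]:
    """Return the 301 target for a request, or None if the rule is skipped."""
    if not HOST_RE.search(host):
        return None  # condition failed: request was not for the old domain
    # In .htaccess context the pattern sees the path without its leading slash.
    local_path = path[1:] if path.startswith("/") else path
    match = PATH_RE.match(local_path)
    location = "http://www.thenewsite.com/" + match.group(1)
    if query:
        location += "?" + query
    return location


print(rewrite("theoldsite.com", "/page.html"))
# http://www.thenewsite.com/page.html
print(rewrite("www.thenewsite.com", "/page.html"))
# None
```

Because requests for the new domain fail the condition, they are served normally instead of being redirected again.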

As to how long it takes the search engines to figure out which domain is the 'real' one, I'd say one or two index updates. Since you've already seen some changes, I'd guess your domains should be all straightened out in another 30 days or so. In the meantime, a stern letter to the hosting service would seem to be in order; they should either notify you ahead of time when making such changes, or keep their hands off your server configuration.

Jim

walkman




msg:806752
 4:02 am on Aug 5, 2005 (gmt 0)

Google loves to keep 404 and forwarded pages for ages. I know I'm losing money because the deleted page shows on the indent, and sometimes people pick that one...only to be surprised by the 404.html page. Average Joe is not going to know what "supplemental" means.

Staffa




msg:806753
 5:49 am on Aug 5, 2005 (gmt 0)

Average Joe is not going to know what "supplemental" means

Nor does Joe know that, when he lands on a 404 page, deleting everything in the URL beyond mydomain.com will get him to the index page ;o)

steveb




msg:806754
 9:44 am on Aug 5, 2005 (gmt 0)

Google has been picking up my 301's of normal pages within a couple weeks.

Google has not yet picked up a 301 for a Supplemental page (though I have only been trying with two, for about six weeks).

promis




msg:806755
 11:58 am on Aug 5, 2005 (gmt 0)

jdMorgan, I did my non-www to www redirect (both residing on the same server) through the virtual host configuration as "Redirect permanent / [mysite.com...]".
It works fine, redirecting all non-www pages to www, and all server headers return a 301 code.
Isn't this enough? Do I have to use the .htaccess with RewriteEngine on as well? Thanks.

jbgilbert




msg:806756
 2:23 pm on Aug 5, 2005 (gmt 0)

JD

Just to make sure... a final check... using what you said:

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?theoldsite\.com [NC]
RewriteRule (.*) [thenewsite.com...] [R=301,L]

Will a visit to each of the following "oldsite" URLs get properly re-directed to the same page on the "new site"?

1) [theoldsite.com?...]
2) [theoldsite.com?...]
3) [theoldsite.com...] ?
4) [theoldsite.com...]

webdude




msg:806757
 4:36 pm on Aug 5, 2005 (gmt 0)

Google loves to keep 404 and forwarded pages for ages. I know I'm losing money because the deleted page shows on the indent, and sometimes people pick that one...only to be surprised by the 404.html page. Average Joe is not going to know what "supplemental" means.

I ended up recreating the 404 pages and redirecting them to similar pages to avoid this. A pain if you have lots of pages, but at least users get something when they click on the link.

jdMorgan




msg:806758
 3:52 am on Aug 6, 2005 (gmt 0)

promis,

The code I posted is intended for use in .htaccess for the case where both domains resolve to the same DocumentRoot directory. If you were able to use the Redirect directive, which makes no provision for conditional execution based on domain (HTTP_HOST) name, then this implies that your domains resolve to different DocumentRoots. Otherwise, you would have created an 'infinite' redirection loop.

So, since Redirect worked for you, your domains must resolve separately.
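
The loop is easy to see in a toy model. The Python sketch below is only an illustration: the function name follow_redirects, the example.com hostnames and the hop limit are all assumptions. It follows an unconditional Redirect until a host is reached whose configuration does not contain the directive.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urlsplit


def follow_redirects(
    url: str,
    rule_applies: Callable[[str], bool],
    target: str = "http://www.example.com",
    max_hops: int = 10,
) -> Tuple[Optional[str], int]:
    """Follow 'Redirect permanent / <target>/' until it stops applying.

    rule_applies(host) says whether the configuration that serves this
    host contains the Redirect directive. Returns (final_url, hops), or
    (None, max_hops) when the client would give up on a redirect loop.
    """
    hops = 0
    while hops < max_hops:
        parts = urlsplit(url)
        if not rule_applies(parts.hostname or ""):
            return url, hops  # served normally, no further redirect
        url = target + parts.path  # unconditional prefix redirect
        hops += 1
    return None, hops


# Separate configurations: only the non-www host carries the Redirect.
print(follow_redirects("http://example.com/a.html", lambda h: h == "example.com"))
# ('http://www.example.com/a.html', 1)

# Shared DocumentRoot and .htaccess: the directive applies to both hosts.
print(follow_redirects("http://example.com/a.html", lambda h: True))
# (None, 10)
```

In the shared case the www host is redirected to itself on every request, which is the loop that a RewriteCond on HTTP_HOST avoids.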

jbgilbert,

Rather than waiting around for me (or someone) to answer, it might be faster to test the code, or analyze how it works by using the Apache documentation. It will do exactly what you describe, but of course, comes with no warranty.
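
One quick way to test is to request a page from the old domain and look at the raw response without following the redirect. The Python sketch below is a minimal example using the standard http.client module; the function name check_redirect is made up, and the hostname and path in the usage note are placeholders to be replaced with your own.

```python
import http.client
from typing import Optional, Tuple


def check_redirect(host: str, path: str = "/", port: int = 80) -> Tuple[int, Optional[str]]:
    """Send one HEAD request and return (status code, Location header).

    http.client never follows redirects on its own, so the values are
    those of the first response the server sends.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling check_redirect("theoldsite.com", "/page.html") should report a status of 301 and a Location on the new domain if the rule is working; a 302 or 200 would mean something else is handling the request.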

Jim

promis




msg:806759
 4:21 pm on Aug 6, 2005 (gmt 0)

Thanks jdMorgan

jbgilbert




msg:806760
 9:03 pm on Aug 6, 2005 (gmt 0)

Just an update, guys:

All of the pages from site A have now been removed from the indexes at Google, Yahoo and MSN, leaving only the site B pages indexed, as wanted ---- EXCEPT THE DEFAULT PAGE!

Redirect permanent / [site_B.com...]
(in the .htaccess file definitely works.)

But, for the life of me I can't figure out why all 3 SEs didn't remove the default (www.site_A.com) from their indexes.

Yahoo took about 15 days, MSN about 20 and Google about 45.

Appreciate all the help guys, but will continue to question (until it's removed) why the SEs removed all the internal pages, but left the default?

g1smd




msg:806761
 10:30 pm on Aug 6, 2005 (gmt 0)


For a site that has had a 301 redirect in place for a year, the "olddomain" is still listed in Google as a URL-only result. I guess that their spider tests the URL every so often to see if it still redirects or has gone 404 or has now got its own content. Seems reasonable to me.

jbgilbert




msg:806762
 4:26 pm on Aug 7, 2005 (gmt 0)

For a site that has had a 301 redirect in place for a year, the "olddomain" is still listed in Google as a URL-only result. I guess that their spider tests the URL every so often to see if it still redirects or has gone 404 or has now got its own content. Seems reasonable to me.

Seems reasonable to me. Well, seems reasonable he would check, but not reasonable he would continue to index the "URL-only result".

The "URL-only result" you refer to is simply a URL with an associated page name (like index.html, etc.) which makes it really NO different than an internal page.

The SEs (and I hope they read this) should not (under a proper 301 re-direct) treat the default URL page any differently than they would the internal pages when it comes to showing them in the index.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved