multiple site consolidation


smallcompany

8:19 am on Jul 20, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There was a need to consolidate two websites into a new one, plus a subdomain from one of the two. It was like this:

oldexample1.com > newexample.com
oldexample2.com > newexample.com
sub.oldexample2.com > sub.newexample.com

I ended up with a CP account where I created the redirects (within the CP), as per the outline above. The new site is in a different account.

The .htaccess files for the three redirections look like this:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^oldexample1\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.oldexample1\.com$
RewriteRule ^(.*)$ "http\:\/\/www\.newexample\.com\/$1" [R=301,L]


The same principle applies to oldexample2 and the subdomain.

Then, I have redirects in the .htaccess of the new site, in this fashion:

RewriteRule ^something\.html http://www.newexample.com/something-or-something-new.html [R=301,L,NC]


I would also have lines like this:


RewriteRule ^some-sub/ http://www.newexample.com/some-or-new-sub/ [R=301,L,NC]


Basically, my idea was to redirect everything from the original sites to the new one as is (via CP redirects), and then, if needed, redirect specific requests to new/consolidated pages (.htaccess of the new site).

After some time, in G WMT for the new site, I noticed G showing links from the old, non-existing sites to the new one. I thought that some time was needed for that to get sorted out on its own, but the number of links would only go down when I made certain "corrections" in the .htaccess of the new site.

Here is more about that:

[webmasterworld.com...]

The fellow WebmasterWorld member's suggestion has brought me here.

So, I figured, and please correct me if I'm wrong, that internal links from old pages got treated by G as real existing links. This would especially apply to, I'll call it, "opened" redirects like a subfolder without $ at the end.

Would anyone be able to explain what the best approach to these redirects would be, please?

Besides having a website that functions correctly and without infinite loops (already achieved), my goal is to have all of those "links" from the old sites shown in WMT simply go away, as the only live site is the new one.

Thanks

TheMadScientist

9:44 pm on Jul 22, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I noticed G showing links from the old, non-existing sites to the new one.

I don't understand this part -- If the site really doesn't exist and hasn't existed, then it's a glitch on Google's part, otherwise, there was probably a link there at some point in time and Google's forgotten to forget about it, which isn't necessarily a glitch since "Google never forgets a page or link and randomly checks them both" seems to be their M/O.

...if needed, redirect specific requests to new/consolidated pages

This creates "chained" or "stacked" redirects.

Example:
example.com/page-1 > newexample.com/page-1 > newexample.com/consolidated-page-1

There is a limit to "how far search engines follow" and, although this is only one extra "hop" on your site, it's a best practice to move the redirect(s) to the old sites.

One thing you could do that can help in the management of the redirects is to point the old domains to the new site's hosting account and the same root directory so the old sites run off the same .htaccess file as the new site. Then any/all redirects can be handled in the new site's .htaccess file and you don't have to maintain a separate hosting account and 3 .htaccess files.

It may take a bit of thought to set them all up in one file initially. For example, to avoid redirecting the new site's /contact page, you'd likely have to put a condition such as RewriteCond %{HTTP_HOST} !^(www\.)?newexample\.com$ on the rule. But once you get them all in there and working, maintaining them is relatively easy.
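As a rough sketch of that single shared .htaccess, assuming the old hostnames and www.newexample.com all point at the same document root (page-1 and consolidated-page-1 are the placeholder names from the example above, not real URLs from the sites in question):

```apache
RewriteEngine On

# Specific old URL: send it straight to its final page in one hop.
RewriteCond %{HTTP_HOST} ^(www\.)?oldexample1\.com$ [NC]
RewriteRule ^page-1$ http://www.newexample.com/consolidated-page-1 [R=301,L]

# Old subdomain: keep the path, move to the new subdomain.
RewriteCond %{HTTP_HOST} ^sub\.oldexample2\.com$ [NC]
RewriteRule ^(.*)$ http://sub.newexample.com/$1 [R=301,L]

# Any other request that did not arrive on a new hostname:
# keep the path and move to the new site.
RewriteCond %{HTTP_HOST} !^(www\.|sub\.)?newexample\.com$ [NC]
RewriteRule ^(.*)$ http://www.newexample.com/$1 [R=301,L]
```

The specific rules come first so that the catch-all at the bottom never sees a URL that needs a different final target. The host condition on the catch-all is what keeps requests for the new site itself from being redirected.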

Also:
RewriteRule ^(.*)$ "http\:\/\/www\.newexample\.com\/$1" [R=301,L]


You need to "escape" in the pattern on the left side of a rule and in the pattern on the right side of a condition, but not in the target on the right side of a rule (with a couple of exceptions).

FYC (Fixed Your Code) Below:
RewriteRule ^(.*)$ http://www.newexample.com/$1 [R=301,L]

smallcompany

8:07 am on Jul 23, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks for the reply!

I don't understand this part

Sorry, the two "old" sites existed, but do not exist now. They both redirect to the new site.

This is what G says for those links that I want to go away:

Via this intermediate link:

and this is how it looks as I drill down within WMT Links tab:

Overview » All domains » oldsite.com
>click> onto oldsite.com lists pages from the new site that, per G, oldsite.com links to.
>click> onto one of the pages shows this:

http://www.oldsite.com/subfolder/
Via this intermediate link: http://www.oldsite.com/other-subfolder/


In the case of this old site, G shows 25 pages that are being linked to (from the old site to the new one). All 25 of these links have the same URL, oldsite.com/subfolder/, and then the intermediate link is a different one each time.

Now, the intermediate link is always a link I have in .htaccess of the new site, like this:

RewriteRule ^other-subfolder/?$ http://newsite.com/new-subfolder/ [R=301,L,NC]

Please note that I also have a redirect for oldsite.com/subfolder/ which is opened (no dollar sign at the end).

In the past I had more links like this. It looked like there was always a subfolder with an opened redirect involved in creating those multiple links, which were actually always redirects.
As I limited the subfolder redirects with the dollar sign (like this: ^subfolder/$), the number of links G showed went down.
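To make the difference concrete, here are the two forms side by side as alternatives (new-subfolder is a placeholder target, and only one of the two rules would be used at a time):

```apache
# "Opened" pattern: matches subfolder/ and every URL beneath it,
# so subfolder/anything.html is also sent to the one target.
RewriteRule ^subfolder/ http://www.newexample.com/new-subfolder/ [R=301,L,NC]

# Anchored pattern: matches only the folder URL itself.
# Requests for anything beneath subfolder/ fall through to later rules.
RewriteRule ^subfolder/$ http://www.newexample.com/new-subfolder/ [R=301,L,NC]
```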

So, it sounds like I'm answering my own question about what to try for fixing it, but I really wanted to understand what causes G to treat those redirects like links.
My redirects currently take two steps.
First, G visits the old site and gets 301ed to the new one.
Then, when the new site receives the request, it either serves a page (if the name is the same), returns a 404, or issues a single redirect to the final page.

What is causing this "Via this intermediate link"? All of these pages are simply gone.

For further reference, in the case of the second old site, the links shown are also subfolders, not the same one, but always subfolders.

Thanks

TheMadScientist

8:28 am on Jul 23, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It sounds to me like they're calling an intermediate redirect a link -- If not, it would probably help to see the mod_rewrite from both, but my best guess is they're following a link to oldsite.com/subfolder/, getting redirected, hitting an intermediate redirect either on oldsite.com or newsite.com, calling that a "link" rather than a redirect, and just plain confusing the situation, because that type of "confusion creation" is consistent with how they have worded and displayed things in WMT for years.

smallcompany

8:50 am on Jul 23, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thank you.

I'll continue with the fix on the subfolders in question, as I'm curious whether that will fully stop it.
The other way is what you've suggested, to maintain all redirects on original sites.

lucy24

6:06 pm on Jul 23, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Via this intermediate link:

I once had a thread called "via this intermediate figment of the imagination" because there was no other possible explanation. Like so many things in WMT, "ignore it and it will go away" really is the simplest and best option.

To compress everything in the present thread: Do everything you can to minimize multiple redirects. If olddomain/someURL is now newdomain/someotherURL, then redirect in one step, starting from the original request. Don't say
olddomain/someURL
>>
newdomain/someURL
>>
newdomain/someotherURL
... especially, of course, if "newdomain/someURL" has never actually existed.
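A minimal sketch of the one-step version, assuming it sits in the old domain's own .htaccess (someURL, someotherURL and www.newdomain.example are placeholders in the same spirit as the names above):

```apache
RewriteEngine On

# Renamed page: go directly to its final URL on the new domain.
RewriteRule ^someURL$ http://www.newdomain.example/someotherURL [R=301,L]

# Everything else keeps its path on the new domain.
RewriteRule ^(.*)$ http://www.newdomain.example/$1 [R=301,L]
```

Because the renamed page is matched before the catch-all, the request never passes through newdomain/someURL at all.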

Conditions like this
RewriteCond %{HTTP_HOST} ^oldexample1\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.oldexample1\.com$
can always be simplified to
RewriteCond %{HTTP_HOST} oldexample1\.com
with no anchors. The only time you can't do this is in the very rare case where you have two domains living on the same server and one name is contained within the other, like "example.com" and "old-example.com", or "example.co" and "example.com"
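Put together with the rule from earlier in the thread, the simplified form would look roughly like this. The [NC] flag is an addition here, on the assumption that the host name may arrive in mixed case. The second, commented-out condition is one possible anchored form for the rare overlapping-names case.

```apache
RewriteEngine On

# Unanchored: matches oldexample1.com and www.oldexample1.com alike.
RewriteCond %{HTTP_HOST} oldexample1\.com [NC]
RewriteRule ^(.*)$ http://www.newexample.com/$1 [R=301,L]

# If one hosted name is contained within another, anchor instead, for example:
# RewriteCond %{HTTP_HOST} ^(www\.)?oldexample1\.com$ [NC]
```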

:: wandering off to satisfy idle curiosity about what countries are represented by .ne, .go and .or ::