...
A domain name change from example.com to example.tld
(A non-SEO decision; say the site is aiming for a more regional feel.)
- example.com had canonical redirects internally, from non-www to www, etc.
- a lot of links pointed to the non-www version (optimization came in late)
- the new domain will also have canonical redirects, of course
- site and URL structure all stay the same as the original
Which of the two practices below is the correct one?
...
Scenario 1. - 2 steps
Redirect all traffic to the new domain, canonicalize there
Set up a single 301 that effectively replaces the .com with the .tld in the request, redirecting all traffic to the new domain, and sort everything out there.
Incorrect URLs redirect in 2 steps
e.g.:
example.com --> example.tld --> www.example.tld
www.example.com/index.html -->
www.example.tld/index.html -->
www.example.tld/
example.com/page.html -->
example.tld/page.html -->
www.example.tld/page.html
...correct URLs redirect in 1 step:
www.example.com/page.html -->
www.example.tld/page.html
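
For reference, Scenario 1 could be sketched in mod_rewrite roughly as below. This is only a sketch under assumptions not stated above: the rules live in .htaccess, plain http is used, and index.html is the DirectoryIndex. The first rule keeps whatever host prefix was requested, which is exactly what produces the second hop.

```apache
RewriteEngine On

# Step 1 (old domain): swap .com for .tld, keeping the requested host prefix.
# %1 is "www." if the Host header had it, otherwise empty.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteRule ^(.*)$ http://%1example.tld/$1 [R=301,L]

# Step 2 (new domain): the usual canonical rules.
# Redirect index.html only when the client literally asked for it,
# not when DirectoryIndex mapped "/" to it internally.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html
RewriteRule ^index\.html$ http://www.example.tld/ [R=301,L]

# Non-www to www on the new domain.
RewriteCond %{HTTP_HOST} ^example\.tld$ [NC]
RewriteRule ^(.*)$ http://www.example.tld/$1 [R=301,L]
```

With these rules example.com/page.html goes to example.tld/page.html first and only then to www.example.tld/page.html, matching the two-step chains listed above.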
...
Scenario 2. - 1 step
Change the .com canonicalization rules so that corrected URLs redirect straight to the new .tld
Modify the rules one by one, i.e. when a request comes in that has to be corrected, it gets redirected to the single canonical URL version (as it was on the .com), only now on the .tld. A single *new* rule is added: if the request didn't trigger any of the canonical rules, it gets redirected in the end to the corresponding page at example.tld.
All URLs redirect in 1 step.
e.g.
example.com --> www.example.tld
www.example.com/index.html -->
www.example.tld/
example.com/page.html -->
www.example.tld/page.html
www.example.com/page.html -->
www.example.tld/page.html
...
...
Don't know why, but I almost said 'the first one', because it might be easier for bots to figure out. But do bots even care? If there's no such thing as a signal for 'moving a domain', there's really no reason NOT to choose the 2nd.
...
Have you ever tested these aspects against one another?
Which scenario is easier to sort out for users, bots, etc?
I got myself confused now and have no idea anymore which is better *grin*
Thinking of it, it just seems a *really* basic issue.
Please tell me if there's a correct answer to this... *hehe*
However, since a redirect requires that you explicitly state the destination domain, it would take extra code to *not* canonicalize the subdomain (www) at the same time as the domain name and tld.
As you implied in your post, do the most-specific redirects first, and then finally the subdomain-domain-tld redirect as the catch-all at the end.
example.com/index.html --301--> www.example.tld/
www.example.com/index.html --301--> www.example.tld/
example.com/page.html --301--> www.example.tld/page.html
www.example.com/page.html --301--> www.example.tld/page.html
example.com --301--> www.example.tld
www.example.com --301--> www.example.tld
If you are using mod_rewrite, and the new tld is hosted on a separate server (whether actual or virtual), you can do this using only two rules. In example.com/.htaccess:
RewriteRule ^index\.html$ http://www.example.tld/ [R=301,L]
RewriteRule (.*) http://www.example.tld/$1 [R=301,L]
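# If instead both domains are served from the same directory, add conditions to avoid redirect loops.
# Redirect index.html only when the client explicitly requested it (not on DirectoryIndex's internal lookup):
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html
RewriteRule ^index\.html$ http://www.example.tld/ [R=301,L]
#
# Redirect any hostname other than www.example.tld; a blank Host header (HTTP/1.0) is skipped, as it would loop:
RewriteCond %{HTTP_HOST} !^(www\.example\.tld)?$
RewriteRule (.*) http://www.example.tld/$1 [R=301,L]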
( or rather... once I stop thinking sci-fi and get down to Earth )
If you are using mod_rewrite, and the new tld is hosted on a separate server (whether actual or virtual), you can do this using only two rules.
It's mod_rewrite all right, but some other details aren't yet known (to me at least), so your advice for both possible cases is most welcome... thanks again.
That aside, what to do with 404s on the old domain... I keep hearing people take extra precautions nowadays *smirk*
e.g.
It's the exact same directory (I mean the exact same files). Check whether the requested file is available; if not, and the request was made to the old domain, serve a 404 response from there before the request would reach the 301s...
If the request was for the new domain, the 404 will be sorted out anyway.
Always thought of this as overdoing it though. In the end a 301->404 is still a 404, which - in small quantities - never caused a problem at Google before. On the other hand this site might have thousands(+) of inbound links to hundreds(+) of unavailable URLs, making Googlebot want to come back again and again.
...yet I feel like leaving this rule out for good...
Anyone using such a precaution (with both 'domains' being the *same* directory on the server...), where requests made to the 'old' domain for URLs unavailable on the 'new' one are served a 404 up front?
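
To make the precaution concrete, a minimal sketch of what such an up-front 404 might look like, assuming both hostnames share one directory and the pages are plain files on disk. If the pages are generated by a script (user profiles usually are), the file test says nothing useful and the application itself would have to decide. The block would need to sit above the 301 rules.

```apache
RewriteEngine On

# Hypothetical precaution, not a recommendation:
# a request on the old domain for something that does not exist on disk
# gets its 404 here and never reaches the 301 rules further down.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ - [R=404,L]
```

Requests for the new hostname skip this block and fall through to the normal 404 handling.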
Often they are obvious typos for URLs that do still exist. A common example is where a forum or CMS has auto-link generation from typed URLs. User types a URL at the end of a sentence, immediately followed by a full stop, and the dumb auto-link software includes the full stop as if it were a part of the URL.
Your site now has an incoming link to www.yoursite.com/somepage.html. (with the trailing full stop as part of the URL), so in that case I would set up a redirect to catch all such errors. There are a number of other such examples; your logs will show you what they are. It's up to you which you choose to rescue and which you choose to ignore.
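
A rule for the trailing full stop case might look something like this. It is a sketch only: it assumes the real pages end in .html, and it uses www.example.tld from earlier in the thread as the target host. It would go above any catch-all redirect.

```apache
RewriteEngine On

# Strip one or more trailing full stops after ".html"
# and send the visitor to the page that was actually meant.
RewriteRule ^(.+\.html)\.+$ http://www.example.tld/$1 [R=301,L]
```

A request for /somepage.html. would then be answered with a single 301 to /somepage.html rather than a 404.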
however...
Most of these pages are either of the following:
- deleted user profile pages and their different views, subpages
- errors in incoming links to such pages ( unicode URL encoding related )
...and with some 40,000 404s (!) it's probably impossible to root them out at this point.
Question is whether they'd want to redirect all this lot from the old domain to the new one and serve the 404 *there* ... or since both domains are served from the same dir... they could check if the URI would be available and if not, serve the 404 up front from the old domain before it'd even redirect to the new one.
...
If there weren't 40k reported 404s I'd ignore the problem, but this seems crazy. (Mind you, the site is huge, with a lot of users and a lot of moderation going on, hence 404s aren't that rare.)
What would you do?
At least try to get a handle on the most "popular" URL formats as far as either numbers or traffic goes and craft a redirect rule for everything having that format, and get the traffic to somewhere on the new site that is going to be at least vaguely useful to the visitor.
I would also redirect at least some of the links that come from big hitter websites as far as PageRank goes. Last month, I worked on a site that had four typoed incoming links that it served a 404 error for - and those links were incoming from PR 7 and 8 pages, and one was from a .edu site. Madness for a small site to throw that effect away in a 404.