I have been having some issues with a rewrite for a few days. The issue is that we have sites all over the world. The dev team built the site one way and discovered that it was not working very well for the Google crawl. They replaced it with a whole new site, and now I have to figure out how to fix this.
This is what we had before
[thesite.com...]
The /us/ was being added to the URL via the app (string).
In the US it will change to /en-us/. This is an easy fix: you just add the following and bam, done.
RewriteRule ^/us/?(.*) /$1 [R=301,L]
However this is where I have the problem. We also have all these domains.
[thesite.com...] = /de-de/
[thesite.com...] = /en-us/
[thesite.com...] = /fr-fr/
[thesite.com...] = /es-es/
[thesite.com...] = /ru-ru/
As you can tell, I would end up in a loop: it will remove the second and then add -**/ at the end.
I have tried to put in an exclude, but it is not catching it. I am at an impasse. Any ideas how to do this?
2.) It is always advised to use the full URL when performing an external redirect.
3.) In the httpd.conf file you will need the leading / on the left side of the rule, but in the .htaccess you will not.
# .htaccess ruleset:
RewriteRule ^us/(.*) http://www.example.com/en-us/$1 [R=301,L]
RewriteRule ^([a-z]{2})/(.*) http://www.example.com/$1-$1/$2 [R=301,L]
# httpd.conf ruleset:
RewriteRule ^/us/(.*) http://www.example.com/en-us/$1 [R=301,L]
RewriteRule ^/([a-z]{2})/(.*) http://www.example.com/$1-$1/$2 [R=301,L]
http://www.example.com/en/ = /en-us/
RewriteRule ^(us¦en)/(.*) http://www.example.com/en-us/$2 [R=301,L]
The ¦ means OR and should be a solid bar, not broken, so make sure you edit it if copying and pasting.
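To put the pieces above together, here is a minimal sketch of a combined .htaccess ruleset. It assumes mod_rewrite is enabled, the file sits in the document root, and www.example.com stands in for the real host. The RewriteEngine line and the traced request paths in the comments are illustrative additions, not part of the posts above.

```apache
RewriteEngine On

# /us/... and /en/... both go to /en-us/..., keeping the rest of the path.
RewriteRule ^(us|en)/(.*)$ http://www.example.com/en-us/$2 [R=301,L]

# Any other two-letter code is doubled: de/ -> de-de/, fr/ -> fr-fr/, and so on.
RewriteRule ^([a-z]{2})/(.*)$ http://www.example.com/$1-$1/$2 [R=301,L]

# Traces (path as seen in .htaccess, without the leading slash):
#   us/products/     -> first rule  -> /en-us/products/
#   de/products/     -> second rule -> /de-de/products/
#   de-de/products/  -> neither rule matches, because a "/" must follow the
#                       two letters, so the redirected URL does not loop.
```

One caveat: the second rule redirects every two-letter top-level directory, so a folder such as a hypothetical /js/ would be sent to /js-js/. Listing the codes explicitly, e.g. (de|fr|es|ru), avoids that.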
I'm also not sure why a condition is suggested, since the redirects can be accomplished in two rules. A rule's pattern must match before any of its conditions are tested, so it seems to me that adding a condition would just add processing and decrease efficiency. But maybe I'm missing something, and someone could provide a more efficient example than mine.
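The evaluation order described above can be sketched like this. It is only an illustration, assuming an .htaccess file in the document root; the de rule is a cut-down stand-in for the rules under discussion.

```apache
# mod_rewrite tests the RewriteRule pattern first (step 1) and only then
# evaluates the RewriteCond lines that precede it (step 2).

# Step 2: reached only when step 1 has matched.
RewriteCond %{REQUEST_URI} !^/de-de [NC]
# Step 1: the pattern requires "de" followed directly by "/".
RewriteRule ^de/ http://www.example.com/de-de/ [R=301,L]

# A path that passes step 1 starts with "de/", so it cannot start with
# "de-de". The condition is therefore always true here and can be dropped:
# RewriteRule ^de/ http://www.example.com/de-de/ [R=301,L]
```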
RewriteCond %{REQUEST_URI} !^/en-gb [NC]
RewriteCond %{REQUEST_URI} !^/en-us [NC]
RewriteRule ^/en/?(.*) http://www.example.com/en-gb/ [R=301,L]
RewriteCond %{REQUEST_URI} !^/de-de [NC]
RewriteRule ^/de/?(.*) http://www.example.com/de-de/ [R=301,L]
RewriteCond %{REQUEST_URI} !^/fr-fr [NC]
RewriteRule ^/fr/?(.*) http://www.example.com/fr-fr/ [R=301,L]
RewriteCond %{REQUEST_URI} !^/es-es [NC]
RewriteRule ^/es/?(.*) http://www.example.com/es-es/ [R=301,L]
RewriteCond %{REQUEST_URI} !^/ru-ru [NC]
RewriteRule ^/ru/?(.*) http://www.example.com/ru-ru/ [R=301,L]
RewriteCond %{REQUEST_URI} !^/en-gb [NC]
RewriteCond %{REQUEST_URI} !^/en-us [NC]
RewriteRule ^/en/? http://www.example.com/en-gb/ [R=301,L]
#
RewriteCond %{REQUEST_URI} !^/$1-$1 [NC]
RewriteCond %{REQUEST_URI} !^/en- [NC]
RewriteRule ^/([a-z]{2})/? http://www.example.com/$1-$1/ [R=301,L]
OR
RewriteCond %{REQUEST_URI} !^/en-gb [NC]
RewriteCond %{REQUEST_URI} !^/en-us [NC]
RewriteRule ^/en/? http://www.example.com/en-gb/ [R=301,L]
#
RewriteCond %{REQUEST_URI} !^/$1-$1 [NC]
RewriteCond %{REQUEST_URI} !^/en- [NC]
RewriteRule ^/(de¦fr¦es¦ru)/? http://www.example.com/$1-$1/ [R=301,L]
There's also no need for the (.*) backreference as you don't re-use that data anywhere.
Unfortunately, this isn't likely to work because you can't use variables on the right side of a RewriteCond. It's one of those things that I really wish *did* work, as there is no way to do variable-to-variable compares inside mod_rewrite except to use atomic back-references (if supported by your regex library) and take advantage of commutativity (e.g. if A+A = A+B, then A=B). In this case, doing that would be more complex than the brute-force string compare, so brute-force is both simpler and more portable from server to server (and/or from version to version).
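For what it's worth, the back-reference compare mentioned above might look something like the sketch below. It assumes a PCRE-based Apache 2.x in server (httpd.conf) context, and the "<>" separator is an arbitrary choice, not anything special to mod_rewrite. The rule itself is the (de|fr|es|ru) one quoted earlier, left unchanged apart from solid bars.

```apache
# The TestString (left side) may contain $1 and server variables, so both
# values are joined into one string, e.g. "de<>/de-de/" or "de<>/de/".
# The pattern (right side) then compares the two halves using the regex's
# own back-reference \1, which refers to the group inside this pattern.
RewriteCond $1<>%{REQUEST_URI} !^([a-z]{2})<>/\1-\1 [NC]
RewriteRule ^/(de|fr|es|ru)/? http://www.example.com/$1-$1/ [R=301,L]

#   /de/     -> "de<>/de/"     -> pattern fails,   negated cond true  -> redirect
#   /de-de/  -> "de<>/de-de/"  -> pattern matches, negated cond false -> no redirect
```

This is harder to read than one literal condition per language, so treat it as a demonstration of the technique rather than a recommendation. Note also that the rule pattern, kept as posted, is not anchored after the code, so it would also match a path such as /default.html.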
It seems to me that the rules could still be optimized in a different way to eliminate most or all RewriteConds with a better RewriteRule pattern, but the 'requirements' for the rules have apparently changed since the first post, and I can't tell what they really are -- for example, the path following the language-codes has apparently now been dropped, whereas it was preserved by the earlier code.
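One way the conditions might be eliminated, as a sketch only. It assumes server (httpd.conf) context, that everything under an old language code should land on the new language home page (as in the rules quoted above), and that /en goes to /en-gb/.

```apache
# (/.*)?$ means the two-letter code must be the whole first path segment:
# either the path ends there, or a "/" follows.
RewriteRule ^/en(/.*)?$ http://www.example.com/en-gb/ [R=301,L]
RewriteRule ^/(de|fr|es|ru)(/.*)?$ http://www.example.com/$1-$1/ [R=301,L]

#   /de, /de/, /de/old/page  -> /de-de/
#   /de-de/, /en-us/         -> no match ("-" follows the code), so no loop
#   /default.html            -> no match
```

In .htaccess the leading slash after ^ would be dropped, as noted earlier in the thread.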
Jim
You are right, the requirement has changed. With the full rewrite of the site, they decided they didn't want to retain about 50% of it, and they wanted most requests to just end up at the main (index) page of the site. So I dropped the part that retained the rest of the URL that was entered. I want people to get the site, not a 404. So far this has been working great for us.
A 404 error page should briefly (and somewhat apologetically) acknowledge and describe the problem, and then offer text links to your home page, site map, major category pages, and site search facility as applicable. Keep the list of links short to avoid confusion -- no more than seven.
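On the Apache side, a custom 404 page of that kind is usually wired up with ErrorDocument. A minimal sketch, where /errors/404.html is a hypothetical path:

```apache
# Local path: Apache serves the page and keeps the 404 status code.
ErrorDocument 404 /errors/404.html

# Avoid a full URL here. With http://... Apache sends the client a redirect
# instead, so the original 404 status is lost.
# ErrorDocument 404 http://www.example.com/errors/404.html
```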
A long-delay on-page meta-refresh from the 404 error page to your home page is acceptable. Do not try to make it fast, as it will be handled as a 302 redirect if you do -- with disastrous results in the SERPs. Instead, follow the suggestion above, and allow plenty of time for the user to read the information presented and to make an informed decision (eight to 15 seconds minimum). If you do use a meta-refresh, the page should say so ('warn' the user that he/she'll be redirected after xx seconds).
Jim