Here is what I am trying to achieve: my site will have artificial language directories, i.e. en and fr.
For example:
www.example.com/en/request
or
www.example.com/fr/request
What I am trying to do is redirect to my default language if anything else is requested, i.e.:
www.example.com/de/request
should redirect to
www.example.com/en/request
After this has been done, my rewrite rule comes into action:
RewriteRule ^(en|fr)/(.*)$ $2?lang=$1
I am having big problems putting in place a condition that tests for two-letter, first-level language-code directory requests which are not en or fr.
My attempts resulted in either no effect or an internal server error, which is why I was reluctant to post them here.
Having searched through the forum and read the documents indicated in the charter, I am still no further forward.
Can anyone help here?
Here is what I am trying:
# validate that any language subdirectory request is valid
RewriteCond %{REQUEST_URI} ^!(en|fr)
RewriteRule ^([^/]+)/(.+)$ /en/$2 [L]
Which allows all requests with en or fr to go through ok.
However, if I use another language code (e.g. de), I get a 404 Not Found response.
# Internally rewrite subdirectory requests other than /en or /fr to force /en
RewriteCond %{REQUEST_URI} !^/(en|fr)/
RewriteRule ^[^/]+/(.+)$ /en/$1 [L]
# Internally rewrite subdirectory requests other than /en or /fr to force /en
RewriteCond $1 !^(en|fr)$
RewriteRule ^([^/]+)/(.+)$ /en/$2 [L]
You might want to consider, however, that this code now "outlaws" the use of any subdirectories except for /en and /fr. This will likely be a problem if your site is successful and grows, since you'll have to keep all content in either /en, /fr, or root. What about 'shared' resources, such as images, scripts, and CSS stylesheets -- do you really want to have to keep all of this in the root, or to have to duplicate it into the language subdirectories? There are also some "well-known-location" subdirectories such as the "/w3c" subdirectory for privacy-policy file storage that will break with this rule.
So you might consider being more specific in the rule, and rewriting *only* two-letter, all-alphabetic subdirectories which are not /en or /fr, leaving anything else alone:
# Internally rewrite two-letter (language) subdirectory requests other than /en or /fr to force /en
RewriteCond $1 !^(en|fr)$
RewriteRule ^([a-z]{2})/(.+)$ /en/$2 [L]
Finally, you should consider whether you really want an internal rewrite, or should externally redirect any "wrong language-code" URL requests to /en instead. If you don't do this, you could end up with many, many duplicates of your English pages in search results, since mis-typed links (e.g. on other sites or in forums, blogs, etc.) to /<anything-but-en-or-fr>/<anything> will otherwise directly return an English page. These duplicates would 'dilute' the ranking of the corresponding English pages. I would recommend using an external redirect instead of an internal rewrite for this reason.
# Externally redirect two-letter (language) subdirectory requests other than /en or /fr to force /en
RewriteCond $1 !^(en|fr)$
RewriteRule ^([a-z]{2})/(.+)$ http://www.example.com/en/$2 [R=301,L]
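For what it's worth, here is a rough sketch of how the redirect above might sit alongside the original lang-parameter rewrite in a single .htaccess file. It assumes the rules live in the .htaccess of the document root; the RewriteEngine line and the [QSA,L] flags on the second rule are additions not shown earlier in this thread, so treat it as untested.

```
RewriteEngine On

# Externally redirect two-letter subdirectory requests other than /en or /fr to /en
RewriteCond $1 !^(en|fr)$
RewriteRule ^([a-z]{2})/(.+)$ http://www.example.com/en/$2 [R=301,L]

# Internally rewrite valid language requests, passing the language code as a query parameter
# (QSA is an assumption here: it keeps any query string already present on the request)
RewriteRule ^(en|fr)/(.*)$ $2?lang=$1 [QSA,L]
```

Putting the external redirect first means a wrong language code is corrected in the client's address bar before any internal rewriting takes place.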
Thank you for this clarification. Good to be able to talk with a real expert!
One further question, however, regarding your last recommendation. As I develop on my local system before uploading to the host, I use a local virtual host entry. This means that the URL used would be different from the public URL.
Would it be equally efficient (is it even possible) to use the host header server variable %{HTTP_HOST} in place of the domain name?
Simon
RewriteCond %{HTTP_HOST} !=""
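As a sketch only, this is one way that condition might be combined with the redirect rule from earlier in the thread, so that the same code works on both a local virtual host and the live server. The combination below is an assumption and has not been tested here; mod_rewrite does accept server variables such as %{HTTP_HOST} in the substitution string.

```
# Externally redirect two-letter subdirectory requests other than /en or /fr to /en on the requested host
RewriteCond $1 !^(en|fr)$
# Skip the redirect if the client sent no Host header (e.g. a plain HTTP/1.0 request)
RewriteCond %{HTTP_HOST} !=""
RewriteRule ^([a-z]{2})/(.+)$ http://%{HTTP_HOST}/en/$2 [R=301,L]
```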
Jim