Can't seem to figure this out...
My site, for example, www.site.com is configured to do multi-language with gettext and PoEdit (.po, .mo) files. So all multi-language links are something like:
www.site.com/?locale=de_DE
www.site.com/products/index.php?locale=de_DE
www.site.com/?locale=zh_CN
www.site.com/products/index.php?locale=zh_CN
Now, what I want to do is make the above URLs LOOK LIKE this (while actually still running the links above):
www.site.com/de_DE/
www.site.com/de_DE/products/
www.site.com/zh_CN/
www.site.com/zh_CN/products/
I emphasize LOOK LIKE because I want images with relative paths in the HTML to work. For example, <img src="images/title.png" /> should find the image without de_DE added to the path.
I tried in .htaccess:
<FilesMatch "^(de_DE|zh_CN)">
ForceType application/x-httpd-php
</FilesMatch>
and made a de_DE and a zh_CN file to include the target php files. It nearly worked, but it's actually running the de_DE (zh_CN) file, so the de_DE (zh_CN) is added to the path of the images with relative paths, causing it to not find the images (and the solution was messy too).
How should I do this?
Any help would be appreciated, thanx in advance.
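A small sketch of why the locale segment ends up in the image paths: the browser resolves a relative src against the URL it requested, not against the file the server ends up running. This uses Python's standard urljoin as a stand-in for the browser's resolution; the page URLs are the examples from the post, with http:// assumed.

```python
from urllib.parse import urljoin

# A relative reference is resolved against the URL of the page that contains it.
plain_page = "http://www.site.com/products/index.php?locale=de_DE"
friendly_page = "http://www.site.com/de_DE/products/"

plain_image = urljoin(plain_page, "images/title.png")
friendly_image = urljoin(friendly_page, "images/title.png")

print(plain_image)     # image URL under the old query-string page
print(friendly_image)  # image URL under the friendly page
```

With the friendly URL, the image request itself carries the de_DE prefix, so either the server has to map that request back to the real file, or the pages need root-relative paths such as /images/title.png.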
Instead of making the URL LOOK LIKE what I mentioned above, I should say that I would like
'www.site.com/zh_TW/products/' to actually run 'www.site.com/products/?locale=zh_TW'
'www.site.com/zh_TW/products/index.php' to actually run 'www.site.com/products/index.php?locale=zh_TW'
'www.site.com/zh_TW/' to actually run 'www.site.com/?locale=zh_TW'
'?locale=zh_TW' can still work, no need to do redirect or anything.
Once a URL in a link is clicked and is requested from your server, you can configure the server (in httpd.conf, conf.d, or .htaccess, etc.) to deliver the content *associated* with that URL from any place in the server's filesystem that you choose.
So links on pages define URLs, and the structure and names you choose define the files and directories inside the server. Once a URL is defined on a page you cannot "change it" and you cannot make it "look like" something else.
To further clarify, URLs exist on the Web, and are defined by Web pages. Files and filenames exist inside the server. The server's primary and fundamental purpose is to translate Web requests for URLs into server operating system requests for files (including scripts). So the server sits at the boundary between the Web-world and the operating system world, and translates between the two. And its primary function is to allow a "universal" and uniform resource-location (U.R.L.) system to exist that is independent of the server software, operating system, and filesystem architecture inside the server. This allows URLs to "work" regardless of whether the server is Apache, IIS, or something else, and whether the operating system is Unix, Linux, HP-UX, Solaris, Windows Server, or anything else.
As a result, changing your URL involves changing the source code of the page that defines that URL (or changing the code of the script that produces the page that defines that URL). Once the URL 'looks like' what you want it to look like, then you configure your server to locate the proper file or script to serve the proper content in response to requests for that URL.
Jim
'?locale=zh_TW' can still work, no need to do redirect or anything.
Actually, there *is* a need to redirect that -- You should redirect such old URLs to the new 'friendly' ones. Otherwise, you will have a duplicate-content problem; two URLs returning the same content, and essentially competing with each other for ranking.
This thread [webmasterworld.com] from our forum library describes all three steps of switching to 'friendly' URLs: The link URL changes, the URL-to-filename rewrites, and the old-URL-to-new URL fixup redirects to speed search engine listing updates and prevent duplicate content.
Jim
Thank you for replying. I now have a better idea of the direction I should be heading. I suck at mod_rewrite though; here is my feeble failed attempt...
The line below seems to make 'www.example.com/zh_CN/index.php' and 'www.example.com/zh_CN/products/test/' work:
RewriteRule ^(tw|cn|de)/(.*) $2?locale=$1 [L]
However, I failed to redirect all '...?locale=zh_CN' to '/zh_CN/...'
I tried it with:
RewriteRule ^(.*)\?locale=(.*) http://www.example.com/$2/$1/index\.php? [R=301,L]
but it failed...
What did I do wrong?
Again, any help from anyone would be great. Thanx!
[edited by: jdMorgan at 1:54 pm (utc) on Jan. 21, 2009]
[edit reason] Please use example.com [/edit]
Also it's not clear whether your first rule worked; I would guess it did not, because the pattern does not match the requested URL (the "_CN" is missing, for example).
I suggest that you get the first rule working completely before trying to add or debug the second rule.
Jim
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?product=([^&]+)&color=([^&]+)&size=([^&]+)&texture=([^&]+)&maker=([^\ ]+)\ HTTP/
'RewriteCond %{THE_REQUEST}' I understood
'/index\.php\?product=([^&]+)&color=([^&]+)&size=([^&]+)&texture=([^&]+)&maker=([^\ ]+)' I understand too
These two are what I don't get:
'^[A-Z]{3,9}' - truly no idea what this is doing here… ^-start of line, [A-Z]-one char between capital A-Z, {3,9}- no idea…
'\ HTTP/' - is this to limit only from http request?
Tried to google some guidance, but the examples I saw were all similar to 'RewriteCond %{THE_REQUEST} /index.php\?' only. Only one part after the REQUEST.
GET /products/?locale=zh_TW HTTP/1.1
So the [A-Z]{3,9} matches the HTTP method (GET, HEAD, POST, etc.), then we have the URL-path and any appended query string, and finally the HTTP version.
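As a rough illustration of that anatomy, here is a minimal sketch that splits a request line the same way. Python's re module is only a stand-in for Apache's regex engine, and split_request_line is a hypothetical helper name. {3,9} means "three to nine of the preceding item", and the trailing "\ HTTP/" simply anchors the pattern to the protocol part at the end of the line.

```python
import re

# Same shape as the RewriteCond pattern: method, space, URL-path plus query, space, "HTTP/".
REQUEST_LINE = re.compile(r"^([A-Z]{3,9})\ (/[^\ ]*)\ HTTP/(\d\.\d)$")


def split_request_line(line):
    """Return (method, target, version), or None if the line has a different shape."""
    m = REQUEST_LINE.match(line)
    return m.groups() if m else None


print(split_request_line("GET /products/?locale=zh_TW HTTP/1.1"))
print(split_request_line("get /products/ HTTP/1.1"))  # lowercase method: no match
```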
So in your case, to match that request and create the %3 back-reference (the third parenthesized group, which holds the locale value) for use in the RewriteRule substitution URL, you might use
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /products/(index\.php)?\?([^&]*&)*locale=([^&\ ]+)[^\ ]*\ HTTP/
RewriteRule ^products/(index\.php)?$ http://www.example.com/%3/products/? [R=301,L]
This rule will work with appended query strings which contain "locale=". If additional name/value pairs precede or follow "locale=" the rule will still be invoked, but those name/value pairs will be dropped. This may be helpful if someone links to you with intentionally-bogus extra name-value pairs in the query string.
You may want to add additional rules to handle the case where "/products/index.php" or "/products/" is requested without the "locale=" name/value in the appended query string, or with a blank value for locale. Search engines and unfriendly competitors may try those invalid URLs, so it would be wise to handle them in some known way.
Note that in both the RewriteCond and RewriteRule above, the "index.php" is optional, so that both versions of the old URLs are detected and redirected. This will further reduce your duplicate-content problem.
Jim
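To check which parenthesized group actually holds the locale value, the RewriteCond pattern above can be tried against a few request lines. This is only a sketch: Python's re stands in for Apache's regex engine, locale_from_request is a hypothetical helper, and the sample request lines are made up for illustration.

```python
import re

# The RewriteCond pattern, copied as posted.
COND = re.compile(
    r"^[A-Z]{3,9}\ /products/(index\.php)?\?([^&]*&)*locale=([^&\ ]+)[^\ ]*\ HTTP/"
)


def locale_from_request(line):
    """Return the locale captured from an old-style request line, or None."""
    m = COND.match(line)
    # Group 1 is the optional index.php, group 2 the leading name/value pairs,
    # group 3 the locale value.
    return m.group(3) if m else None


print(locale_from_request("GET /products/?locale=zh_TW HTTP/1.1"))
print(locale_from_request("GET /products/index.php?a=1&locale=de_DE&b=2 HTTP/1.1"))
print(locale_from_request("GET /products/test/index.php?locale=zh_CN HTTP/1.1"))
```

The last line returns None because the pattern only allows an optional index.php directly under /products/, so deeper paths are not matched by this particular rule.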
So we're parsing against 'GET /products/?locale=zh_TW HTTP/1.1'. That makes a lot more sense now.
It's still not working though…
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /products/(index\.php)?\?([^&]*&)*locale=([^&\ ]+)[^\ ]*\ HTTP/
RewriteRule ^products/(index\.php)?$ http://www.example.com/%3/products/? [R=301,L]
I tried www.site.com/products/test/index.php?locale=zh_CN
and no redirection is happening
Also, I am trying to redirect basically all pages (all pages have translations), not just the ones under /products
so basically:
www.site.com/...........?locale=<language> becomes www.site.com/<language>/...........
so I tried this first with index.php:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(index\.php)?\?([^&]*&)*locale=([^&\ ]+)[^\ ]*\ HTTP/
RewriteRule ^/(index\.php)?$ [site.com...] [R=301,L]
um… not working… :(
And you're right, I shouldn't use the index.php. Assuming the previous lines worked, it should have been:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ \?([^&]*&)*locale=([^&\ ]+)[^\ ]*\ HTTP/
RewriteRule ^?$ [site.com...] [R=301,L]
Not sure…
Also, I don't quite get the RewriteRule part. I thought it works like this RewriteRule <a> <b> where <a> is replaced by <b>. From the example you gave me, it seems to be different when using with RewriteCond.
Steve
The patterns must exactly match the requested URL-path, or the rule won't run.
Please nail down exactly what URLs you want to rewrite and redirect, and then I'll be glad to further assist.
Jim
Sorry, I didn't make myself clear. Basically, the site consists of regular PHP files, and every single page has translations through gettext(). All pages use the English version as a template, and with ?locale=<language> (always at the end), PHP just substitutes all the text with another language.
Instead of showing ?locale=<language> at the end, I want the URL to carry <language> right after www.site.com/, followed by whatever the rest of the URL is.
Like: www.site.com/<something>?locale=<language> => www.site.com/<language>/<something>
For example:
www.site.com/?locale=<language> => www.site.com/<language>/
www.site.com/about.php?locale=<language> => www.site.com/<language>/about.php
www.site.com/products/?locale=<language> => www.site.com/<language>/products/
www.site.com/about/manufacture.php?locale=<language> => www.site.com/<language>/about/manufacture.php
www.site.com/about/?locale=<language> => www.site.com/<language>/about/
www.site.com/products/index.php?locale=<language> => www.site.com/<language>/products/index.php
www.site.com/products/spec.php?locale=<language> => www.site.com/<language>/products/spec.php
www.site.com/products/test/index.php?locale=<language> => www.site.com/<language>/products/test/index.php
In other words, take the whole URL, remove ‘?locale=<language>’, and insert ‘<language>/’ right after www.site.com/.
Thanx so much for your patience.
Steve
OK, perhaps something like this:
# Internally rewrite new friendly URLs to script with locale as query string
RewriteRule ^(zh_TW|zh_CN|de_DE)/(.*) $2?locale=$1 [L]
#
# Externally redirect direct client request for old query string URLs to new friendly URLs
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?]*\?locale=(zh_TW|zh_CN|de_DE)\ HTTP/
RewriteRule (.*) http://www.example.com/%1/$1? [R=301,L]
Jim
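The two rules above can be mimicked in a few lines to see what each one does to a request. This is a sketch under stated assumptions, not how Apache is implemented: Python's re stands in for the regex engine, and internal_rewrite and external_redirect are hypothetical helper names. The path argument has no leading slash, which is how a RewriteRule in .htaccess sees it.

```python
import re

LOCALES = "zh_TW|zh_CN|de_DE"

# Rule 1 pattern: friendly URL-path, matched without its leading slash.
FRIENDLY = re.compile(r"^(" + LOCALES + r")/(.*)")

# RewriteCond pattern for rule 2, tested against the original request line.
OLD_REQUEST = re.compile(r"^[A-Z]{3,9}\ /[^?]*\?locale=(" + LOCALES + r")\ HTTP/")


def internal_rewrite(path):
    """Map 'zh_CN/products/' to 'products/?locale=zh_CN'; other paths pass through."""
    m = FRIENDLY.match(path)
    if m is None:
        return path
    return m.group(2) + "?locale=" + m.group(1)


def external_redirect(request_line, path, host="www.example.com"):
    """Return the 301 target for an old query-string request, or None."""
    m = OLD_REQUEST.match(request_line)
    if m is None:
        return None
    return "http://" + host + "/" + m.group(1) + "/" + path


print(internal_rewrite("zh_CN/products/"))
print(external_redirect("GET /products/index.php?locale=de_DE HTTP/1.1", "products/index.php"))
print(external_redirect("GET /de_DE/products/ HTTP/1.1", "products/"))
```

The last call returns None: the condition looks at the request line the client originally sent, so a friendly URL that has been rewritten internally to the ?locale= form does not trigger the redirect again.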
Thanx for the help! It's working.
If you don't mind, I would like to ask a few more questions:
RewriteRule (.*) http://www.example.com/%1/$1? [R=301,L]
Can we just remove http://www.example.com entirely?
Why the www.example.com?
Also, what reading would you suggest on RewriteCond and RewriteRule? I read the links you sent me. They lay out the big picture nicely. Do you have any more links that go into the nitty-gritty parts? The Apache ones are too cryptic for me. I'm looking for something with lots of examples.
Thanx again.
Steve
I strongly recommend that you DO NOT remove it. If you do remove it, then Apache will "guess" about the correct domain name, and may choose the wrong one. It's not actually a guess, but Apache will use the canonical name of the server as defined by the server configuration file, instead of your preferred domain name. So, you may end up redirecting to the non-www, for example, when your links all point to "www".
So, I suggest putting the domain name that you use for the majority of your links in the RewriteRule substitution URL.
You did not say *why* you feel a need to remove it, so I can't answer in more detail.
Mod_rewrite *is* cryptic and complicated, so the documentation reflects this. There are several books about it available from places like Amazon.com, and of course, we have loads of threads here in the forum, and some tutorials in our Library.
Jim
1. dev.site.com on my MacBook Pro for development (through httpd.conf and the hosts file, that address works only on my computer)
2. ###.###.###.###:9000 on a test server at work for colleagues to test (sync with MacBook Pro through svn)
3. and the real www.site.com
Instead of remembering to modify .htaccess every time I sync or upload, I'm trying to find a setting that would work for all three computers.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?]*\?locale=(tw|cn|de)\ HTTP/
RewriteRule (.*) %{HTTP_HOST}/%1/$1? [R=301,L]
(Oh, I simplified the language extension to just 2 letters)
Would that work? I tried it. Had a hiccup after I saved the .htaccess (language switching stopped working for a few clicks until I switched to the root dev.site.com/tw/), but seems to work now (on my dev MacBook Pro, no access to machine 2 until the Chinese New Year is over, and really don't want to try on the real site yet). Is there something bad with this I haven't seen yet? That initial hiccup really got me nervous.
I further modified the command to:
RewriteRule ^([a-zA-Z]{2})/(.*) $2?locale=$1 [L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?]*\?locale=([a-zA-Z]{2})/\ HTTP/
RewriteRule (.*) %{HTTP_HOST}/%1/$1? [R=301,L]
This would also eliminate the need to add language extension every time. Again, tested on my MacBook Pro only, seems to work. What do you think? Looks good?
Thank you so much for your help. Through the conversation with you, I've learned so much. Reading related articles seems much easier now than before.
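One way to sanity-check the tightened condition before trusting it is to run the patterns against a sample request line. This is a sketch only: Python's re stands in for Apache's regex engine, the sample line is made up, and the third pattern is a hypothetical variant rather than something posted in the thread.

```python
import re

# Conditions from the post: the explicit list, then the two-letter version.
LISTED = re.compile(r"^[A-Z]{3,9}\ /[^?]*\?locale=(tw|cn|de)\ HTTP/")
TWO_LETTER = re.compile(r"^[A-Z]{3,9}\ /[^?]*\?locale=([a-zA-Z]{2})/\ HTTP/")
# Hypothetical variant without the slash in front of "\ HTTP/".
TWO_LETTER_NO_SLASH = re.compile(r"^[A-Z]{3,9}\ /[^?]*\?locale=([a-zA-Z]{2})\ HTTP/")

line = "GET /products/?locale=tw HTTP/1.1"
for name, pattern in [
    ("listed", LISTED),
    ("two-letter", TWO_LETTER),
    ("two-letter, no slash", TWO_LETTER_NO_SLASH),
]:
    print(name, bool(pattern.match(line)))
```

The two-letter pattern as posted only matches when the query string ends in a slash (for example ?locale=tw/), so a plain ?locale=tw request would not be redirected by it. Two further points worth checking on a test machine: a substitution that starts with %{HTTP_HOST} and has no scheme such as http:// in front of it is normally treated as a path rather than an absolute URL, and ^([a-zA-Z]{2})/ will also match any real two-letter directory on the site.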