I'm at my wits' end with this one. I've never been strong with .htaccess and this one has me baffled.
I'm trying to build a site that uses geoip to direct users to country pages, e.g. domain.com > domain.com/us/ or domain.com/eu/
This works fine to achieve that
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(DE|FR|GB)$
RewriteCond %{REQUEST_URI} !^/eu/
RewriteRule ^(.*)$ /eu/$1 [R,L]
Good.
Now I want to use one set of pages in the root that each virtual directory pulls and then the content will be dynamically generated.
This code works fine for that bit
RewriteRule ^eu/(.*)[/]?$ /$1 [NC,L]
What's the problem then? Well I can't get them working together because it goes into a loop. I understand why it goes into the loop but I can't figure out how to get round it. Can anyone point me in the right direction?
Total code in .htaccess in root
RewriteEngine On
RewriteBase /
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(DE|FR|GB)$
RewriteCond %{REQUEST_URI} !^/eu/
RewriteRule ^(.*)$ /eu/$1 [R,L]
RewriteRule ^eu/(.*)[/]?$ /$1 [NC,L]
Hi and all that too :)
TIA
Adam
RewriteEngine On
RewriteBase /

RewriteCond %{REQUEST_URI} !^/eu/
RewriteCond %{REQUEST_URI} !^/pages/
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(DE|FR|GB)$
RewriteRule ^(.*)$ /eu/$1 [R,L]

RewriteCond %{REQUEST_URI} !^/us/
RewriteCond %{REQUEST_URI} !^/pages/
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^US$
RewriteRule ^(.*)$ /us/$1 [R,L]

RewriteCond %{REQUEST_URI} !^/pages/
RewriteRule ^eu(.*)[/]?$ pages/$1 [NC,L]
RewriteRule ^us/(.*)[/]?$ pages/$1 [NC,L]

RewriteCond $1 !^pages/
RewriteRule ^(.*)$ pages/$1 [L]
RewriteCond %{QUERY_STRING} !&?region=[^&]+&?
RewriteCond %{ENV:GEOIP_COUNTRY_CODE}>eu ^(DE|FR|GB)>(.+)$ [OR]
RewriteCond %{ENV:GEOIP_COUNTRY_CODE}>us ^(US)>(.+)$
RewriteRule ^(.*)$ /$1?region=%2 [QSA,L]
If you want to have all of these 'region' files in root, then you've got to either give them different names or append a 'region-identifier' query-string to the filepath, or do something to make each region-page's path different (as you did with the pseudo-path "/pages/"). Also, look at AcceptPathInfo, Content-Negotiation, etc. as well.
The bottom line is that if the 'output' path of Rule B matches the 'input' pattern of Rule A, and the 'output' path of Rule A matches the 'input' pattern of Rule B, then of course you get an 'infinite' rewriting loop in .htaccess.
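A minimal sketch of one way to break that cycle, assuming the real files live under /pages/ as in the code above. The GeoIP condition is left out to keep the example short, and the 302 status is only an illustrative choice. The point is that the output of the second rule ("pages/...") is not accepted as input by either rule:

```apache
RewriteEngine On

# Rule A: externally redirect region-less requests into /eu/.
# Requests already under /eu or /pages are excluded, so the output
# of Rule B can never re-trigger this rule.
RewriteCond %{REQUEST_URI} !^/(eu|pages)(/|$)
RewriteRule ^(.*)$ /eu/$1 [R=302,L]

# Rule B: internally rewrite /eu/<path> to /pages/<path>.
# The result starts with "pages/", which neither rule pattern accepts.
RewriteRule ^eu/(.*)$ pages/$1 [L]
```

On the second pass through .htaccess the path is /pages/<path>, Rule A's condition fails, Rule B's pattern does not match, and processing stops.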
Note: Replace all broken pipe "¦" characters in code you see posted here with solid pipes before use; Posting on this forum modifies the pipe characters.
Jim
The last code posted works fine. Rather than have a query string I want the url to look like below as that will look nicer in address bar.
domain.com/uk/
domain.com/us/
domain.com/eu/
These directories are all virtual only. The content for these pages all comes from the same files in real directory /pages/
Won't the example you give just give me domain.com/region=eu ?
That URL, when requested from an EU IP address, will resolve to the DirectoryIndex-defined *file* referred to as "/" with a query string of "region=eu", but when that URL is requested from a US IP address, it will resolve to the DirectoryIndex-defined *file* referred to as "/" with a query string of "region=us".
You could do the same thing pointing to /eu and /us as you did before. Or you could use "eu" and "us" subdomains (hint here).
My main point is that there is no reason to 'expose' the inner subdirectories or the inner mechanism of your site to the client with an external redirect. And if you feel you need an external redirect, then you really ought not to be linking to the domain, but rather to the 'region directories' anyway.
This thread has two aspects: First the usability and SEO design aspects of multi-region (multi-language?) sites, and second, implementation of those SEO and usability factors. I'm actually pushing you back a bit from implementation to design to be sure you've thought this through.
I'll also apologize for giving you 'scattered' responses -- replying to different aspects in the same thread without making too much of a fuss about changing the immediate subject... But here's another:
If you want to redirect a URL, but you don't want to redirect a previously-rewritten server-filepath, then you can check THE_REQUEST to be sure that the client asked for the path that you're testing with your rule pattern, and that it didn't arise as the result of a previously-invoked internal rewrite.
Be sure you're quite clear on those terms, too: External redirect vs. internal rewrite and URL vs. filepath. None are the same thing (or even similar, really).
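A minimal sketch of the THE_REQUEST check, assuming the aim is to redirect clients who type a /pages/ URL directly while leaving internally rewritten requests alone. The redirect target and the 301 status are assumptions made for illustration:

```apache
# THE_REQUEST holds the original request line sent by the client,
# e.g. "GET /pages/foo.html HTTP/1.1", and is not changed by internal rewrites.
# The condition therefore matches only when the client itself asked for /pages/<path>.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /pages/
RewriteRule ^pages/(.*)$ /$1 [R=301,L]
```

Because the variable starts with the HTTP method rather than the path, a pattern anchored as ^/pages/ would never match it.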
Jim
Use the internal rewrite syntax instead of the external redirect syntax. An external redirect, by definition, sends a response to the client (e.g. browser) that says, "The resource you requested has moved. Please ask for it again at this new URL." So the client (usually) updates its address bar, and issues a new HTTP request for what it wanted, but using the new URL provided in the redirect response.
An internal rewrite, in contrast, simply tells the server, "Instead of using the default URL-to-filepath translation of adding DocumentRoot to the requested URL-path, use DocumentRoot and this filepath instead." The client is not informed of any change in the URL-to-filename translation, and happily receives and displays whatever content the server sends back from that new filepath.
As I said above, rewrites, redirects, URLs, and filepaths... all are different, and this must be clear in order to understand this stuff. BTW, the primary purpose of an HTTP server is to translate URLs used on the Web into whatever filepaths are used by the operating system of the server, so that Web clients don't need to know anything about the filesystem inside the server. Mod_rewrite sits right at the 'boundary' between the URL-based Web and the server's filesystem, and can modify this URL-to-filename translation.
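To make the difference concrete, here is a small side-by-side sketch. The "old/" prefix is a hypothetical path used only for illustration; /eu/ and /pages/ are the directories discussed in this thread:

```apache
# External redirect: the R flag sends a redirect response, so the client
# requests /eu/<path> again and its address bar changes.
RewriteRule ^old/(.*)$ /eu/$1 [R=302,L]

# Internal rewrite: no R flag and no scheme/hostname in the substitution.
# The client keeps seeing /eu/<path>; the server reads the file from /pages/<path>.
RewriteRule ^eu/(.*)$ pages/$1 [L]
```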
Jim
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/pages/
RewriteCond %{REQUEST_URI} !^/eu/
RewriteCond %{REQUEST_URI} !^/us/
RewriteCond %{REQUEST_URI} !^/eu
RewriteCond %{ENV:GEOIP_COUNTRY_CODE}>eu ^(DE|FR|GB)>(.+)$ [OR]
RewriteCond %{ENV:GEOIP_COUNTRY_CODE}>us ^(US)>(.+)$
RewriteRule ^(.*)$ /%2/$1 [R,L]
RewriteCond %{THE_REQUEST} !^/pages/
RewriteRule ^eu(.*)[/]?$ pages/$1 [NC,L]
RewriteRule ^us/(.*)[/]?$ pages/$1 [NC,L]
RewriteCond $1 !^pages/
RewriteRule ^(.*)$ pages/$1 [L]
The first block uses the code you supplied, but I have changed it to use a country subdirectory. As I've put the R flag on it, have I now made it external? If so, I don't know any other way of updating the URL in the address bar (which is what I want to happen).
Also is there a cleaner way to write all those RewriteCond for the pages?
The second block takes the info to be displayed from the pages subdir where all pages will be stored. Only one copy of a page for each page as content will be dynamically loaded (and as such may well put back the refer variable).
Last block is to make sure root reads the pages from pages subdir too.
Last thing is to update url if someone stumbles across pages so that it goes to root but I think I'm stuck on loops again. We'll see.
Thanks for your patience!
Adam
RewriteEngine on
#
# Externally redirect to remove trailing slashes from /eu/<path>/ and /us/<path>/ URL-paths
RewriteRule ^(eu|us)/([^/]*)/$ http://www.example.com/$1/$2 [R=301,L]
#
# Externally redirect /<path> to /eu/<path> or /us/<path> URLs based on geoip lookup
RewriteCond $1 !^(pages|eu|us)/
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^([A-Z]{2})$
RewriteCond %1>eu ^(DE|FR|GB)>(.+)$ [OR]
RewriteCond %1>us ^(US)>(.+)$
RewriteRule ^(.*)$ http://www.example.com/%2/$1 [R=301,L]
#
# Internally rewrite /eu/<path> and /us/<path> URLs to /pages/<path>
RewriteRule ^(eu|us)/(.*)$ pages/$2 [NC,L]
#
# Internally rewrite any remaining requests to /pages/<path>
RewriteCond $1 !^pages/
RewriteRule ^(.*)$ pages/$1 [L]
Replace all broken pipe "¦" characters with solid pipes before use; Posting on this forum modifies the pipe characters.
Jim
[edit] Corrections as noted below. [/edit]
[edited by: jdMorgan at 3:57 pm (utc) on July 22, 2009]
RewriteEngine on
#
# Rewrite URL based on cookie value
RewriteCond $1 !^(pages|eu|us)
RewriteCond %{HTTP_COOKIE} location=([^;]+) [NC]
RewriteRule ^(.*)$ /%1/$1 [R,L]
#
# Externally redirect to add trailing slashes to /eu/<path> and /us/<path> URL-paths
RewriteRule ^(eu|us)(.*[^/])?$ http://domain.com/$1$2/ [R=301,L]
RewriteRule ^(.*)(eu|us)/$ - [co=location:$2:.domain.com:2592000:/]
#
# Externally redirect /<path> to /eu/<path> or /us/<path> URLs based on geoip lookup
RewriteCond $1 !^(pages|eu|us)
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^([A-Z]{2})$
RewriteCond %1>eu ^(DE|FR|GB)>(.+)$ [OR]
RewriteCond %1>us ^(US)>(.+)$
RewriteRule ^(.*)$ /%2/$1 [R,L]
#
# Internally rewrite /eu/<path> and /us/<path> URLs to /pages/<path>
RewriteRule ^(eu|us)(.*)[/]?$ pages/$2/ [NC,L]
#
# Internally rewrite any remaining requests to /pages/<path>
RewriteCond $1 !^pages/
RewriteRule ^(.*)$ pages/$1 [L]
Good stuff :)
It's very likely that you could add the "cookie-setting flag" to your now-fourth rule, and eliminate the third rule.
I'd also recommend validating the cookie value in your first rule using ";?location=(eu|us);?", because cookies can be spoofed on the client side.
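A rough sketch of how those two suggestions might look. It rests on two assumptions: that the "now-fourth rule" is the internal rewrite to /pages/ (which matches the same region URLs as the cookie-only rule), and that the cookie pattern may be anchored a little more strictly than the one quoted above. The cookie name, domain and lifetime are copied from the code posted earlier:

```apache
# First rule: honour the cookie only if its value is a known region.
# %2 is the region, because the first group matches the cookie separator.
RewriteCond $1 !^(pages|eu|us)(/|$)
RewriteCond %{HTTP_COOKIE} (^|;\ *)location=(eu|us)(;|$)
RewriteRule ^(.*)$ /%2/$1 [R,L]

# Internal rewrite to /pages/<path> that also sets the cookie,
# so the separate cookie-only rule is no longer needed.
RewriteRule ^(eu|us)/(.*)$ pages/$2 [CO=location:$1:.domain.com:2592000:/,L]
```

The NC flag is dropped here on purpose, so the value stored in the cookie is always one of the lowercase region names that the first rule accepts.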
Jim
Make sure that the page that is being requested when you intend to set the cookie or to check "$_COOKIE" has been marked with proper cache-control headers, so that the browser *must* send a request to the server after the cookie has been set. Using "Cache-Control: no-cache, must-revalidate" might fix your problem.
Note that the "no-cache" attribute doesn't mean exactly what it sounds like; due to errors, misinterpretations, and liberties taken on the early Web, all this header will do is to make the client actually send a request to your server if the visitor's browser re-loads the page for any reason. In most cases, this will be a request with an "If-Modified-Since" header, and if you are currently returning Last-Modified headers, then your server will not send back the page, it will only send back a "304-Not Modified." But, the cookie header will also be sent in that response.
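One possible way to send that header from .htaccess, assuming mod_headers is available on the server. As written it applies to every response served from the directory, which may be broader than you want:

```apache
<IfModule mod_headers.c>
    # Make the browser check back with the server before re-using a cached copy.
    Header set Cache-Control "no-cache, must-revalidate"
</IfModule>
```

If the pages are generated by a script, the same header can be sent from the script instead.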
As you can see, this gets complicated... :)
Jim