

modifying an ENV var in htaccess + optimizing

doing rewrites based on some vars (if/else construction + modifying needed)

         

Salami1_1

1:14 pm on Oct 20, 2009 (gmt 0)

10+ Year Member



Hi,

I'm new to env vars; I only discovered them while searching for solutions for my website. My website is going to be on 2 (cloud) servers, 1 in the US and 1 in the UK. They are exactly the same. I'm using .htaccess to redirect visitors to the correct version (based on an IP-to-country database, via a module) so that they are served by the fastest server.

Now to select the best server for the visitor I have in my .htaccess:


#server select
RewriteCond %{ENV:IP2LOCATION_COUNTRY_SHORT} ^(UK|NL|BE|FR|DE|ES|AT|CH|PT|IE|AD|MC|DK|IT|NO|SE|PL|CZ|SI|HU|HR|UA|RO|EE|LV|LT|RU|GR|BG|TR|IQ|IR|SA|EG|DZ|MA|TN|NE|FI|IS|PK|AF|IN)$
RewriteCond %{HTTP_HOST} !^(www\.)?mywebsite\.eu [nc] #prevent looping
RewriteRule ^(.*)$ http://www.mywebsite.eu$1 [NC]

RewriteCond %{ENV:IP2LOCATION_COUNTRY_SHORT} !^(UK|NL|BE|FR|DE|ES|AT|CH|PT|IE|AD|MC|DK|IT|NO|SE|PL|CZ|SI|HU|HR|UA|RO|EE|LV|LT|RU|GR|BG|TR|IQ|IR|SA|EG|DZ|MA|TN|NE|FI|IS|PK|AF|IN)$
RewriteCond %{HTTP_HOST} !^(www\.)?mywebsite\.com [nc] #prevent looping
RewriteRule ^(.*)$ http://www.mywebsite.com$1 [NC]


However, this means every page requested on either server triggers a lookup, which is kinda redundant. Does anybody have any tips on how to do this 'server select' only once per session, for example? Perhaps by setting a cookie or something?

My 2nd question is as follows:
I have some redirects to get visitors to the right language version. I have 3 languages at the moment, and all 3 are available on both servers (there are people speaking Spanish in the US as well as in the EU). I'm using subdomains for the language versions, and I redirect to them in case a visitor arrives using the wrong URL:


RewriteRule ^en/(.*)$ http://%{HTTP_HOST}/$1 [R=301,L]
RewriteRule ^nl$ http://nl.%{HTTP_HOST}/ [R=301,L]
RewriteRule ^es$ http://es.%{HTTP_HOST}/ [R=301,L]
RewriteRule ^nl/(.*)$ http://nl.%{HTTP_HOST}/$1 [R=301,L]
RewriteRule ^es/(.*)$ http://es.%{HTTP_HOST}/$1 [R=301,L]

The problem is above code will NOT work as HTTP_HOST will include the 'www' part and [es.domain.com...] is not gonna work of course. So I actually need to make a Env var which strips the 'www' part out of HTTP_HOST but I have no clue on how to do this. I've seen how to set your own variable in .htaccess but the stripping part is the problem.

Thank you very much for your advice.

Best regards,

Salami1_1

1:31 pm on Oct 20, 2009 (gmt 0)

10+ Year Member



hmm

was just thinking if I just do this:


RewriteCond %{HTTP_HOST} !^(www\.)?mywebsite\.eu [nc] #prevent looping
RewriteCond %{ENV:IP2LOCATION_COUNTRY_SHORT} ^(UK|NL|BE|FR|DE|ES|AT|CH|PT|IE|AD|MC|DK|IT|NO|SE|PL|CZ|SI|HU|HR|UA|RO|EE|LV|LT|RU|GR|BG|TR|IQ|IR|SA|EG|DZ|MA|TN|NE|FI|IS|PK|AF|IN)$
RewriteRule ^(.*)$ http://www.mywebsite.eu$1 [NC]

RewriteCond %{HTTP_HOST} !^(www\.)?mywebsite\.com [nc] #prevent looping
RewriteCond %{ENV:IP2LOCATION_COUNTRY_SHORT} !^(UK|NL|BE|FR|DE|ES|AT|CH|PT|IE|AD|MC|DK|IT|NO|SE|PL|CZ|SI|HU|HR|UA|RO|EE|LV|LT|RU|GR|BG|TR|IQ|IR|SA|EG|DZ|MA|TN|NE|FI|IS|PK|AF|IN)$
RewriteRule ^(.*)$ http://www.mywebsite.com$1 [NC]


(i.e. move the 'prevent looping' line so that it comes before the country lookup)

it would already prevent the unnecessary look up right? If the first RewriteCond is 'false' does it still execute the 2nd RewriteCond?

Thanks

jdMorgan

3:19 pm on Oct 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> If the first RewriteCond is 'false' does it still execute the 2nd RewriteCond

Be careful with the logic terminology here. Your first RewriteCond in the second rule will evaluate as True if the host is NOT mywebsite.com or www.mywebsite.com.

You've actually got an example there that is close to answering your question about how to get the domain name without the "www". If you want to 'extract' and back-reference the domain and tld, something like:


RewriteCond %{HTTP_HOST} ^(www\.)?(example\.com)
RewriteRule ^en/(.*)$ http://en.%2/$1 [R=301,L]

would do that. Note that this code is not something you need to use, it is just an example of grabbing the domain+tld in local variable %2, and using it in a rule.
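If you do want the stripped hostname in an environment variable, as originally asked, here is a minimal sketch. It assumes mod_rewrite's [E=] flag is available to you; BARE_HOST is a made-up variable name for illustration, not something Apache defines.

```apache
RewriteEngine On

# Capture the hostname without an optional leading "www." (and without any :port)
# into %2, then store it in the hypothetical env var BARE_HOST.
RewriteCond %{HTTP_HOST} ^(www\.)?([^:]+) [NC]
RewriteRule ^ - [E=BARE_HOST:%2]

# Later rules in the same pass can read the variable back.
RewriteRule ^nl/(.*)$ http://nl.%{ENV:BARE_HOST}/$1 [R=301,L]
RewriteRule ^es/(.*)$ http://es.%{ENV:BARE_HOST}/$1 [R=301,L]
```

Be aware that this catch-all host pattern also accepts a request that is already on a language subdomain (nl.mywebsite.com would yield nl.nl.mywebsite.com), so matching your known domains explicitly is probably the safer choice.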

For your earlier question, note that mod_rewrite can check HTTP_COOKIE before doing any of the redirects, and if the cookie contains a location and/or language preference, use that info instead of ip2location to present the correct localized content.

On Apache 2.x, mod_rewrite can also set cookies using the [CO=] flag in a rewriterule, although there are other methods which may be more efficient.
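As a rough sketch of the cookie idea, and only under some assumptions: the cookie name servselect, its value, and the one-day lifetime are invented for illustration, the country list is shortened, and only the .eu direction is shown. A cookie can only be set for the host that is answering the request, so the redirect itself does not set one; the destination host marks the visitor as "checked" on arrival.

```apache
RewriteEngine On

# No cookie yet, not on the .eu host, and the country belongs on .eu: send the visitor there.
RewriteCond %{HTTP_COOKIE} !(^|;\s*)servselect=1
RewriteCond %{HTTP_HOST} !^(www\.)?mywebsite\.eu [NC]
RewriteCond %{ENV:IP2LOCATION_COUNTRY_SHORT} ^(UK|NL|BE)$
RewriteRule ^(.*)$ http://www.mywebsite.eu/$1 [R=302,L]

# No cookie yet and already on the .eu host: set the cookie (lifetime in minutes)
# so that later requests fail the first RewriteCond above and skip the country test.
RewriteCond %{HTTP_COOKIE} !(^|;\s*)servselect=1
RewriteCond %{HTTP_HOST} ^(www\.)?mywebsite\.eu [NC]
RewriteRule ^ - [CO=servselect:1:.mywebsite.eu:1440:/]
```

The mirror-image pair of rules for the .com host would follow the same shape. Whether the country lookup itself is skipped depends on how the ip2location module populates its variable, which is not something mod_rewrite controls.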

Jim

Salami1_1

6:27 pm on Oct 20, 2009 (gmt 0)

10+ Year Member



Hi,

Thanks for the answer!
This is now working smoothly:


RewriteCond %{HTTP_HOST} ^(www\.)?(mywebsite\.(com|eu))
RewriteRule ^nl/(.*)$ http://nl.%2/$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^(www\.)?(mywebsite\.(com|eu))
RewriteRule ^nl$ http://nl.%2/ [R=301,L]

But I found out that my 404 handling, using:


ErrorDocument 404 http://%{HTTP_HOST}/404.php

is redirecting all 404 requests to [%{HTTP_HOST}...] (literally, as the actual URL). Does HTTP_HOST not work for 404 or something?

As for the 'look up optimization';
I'm a bit confused by your remark about terminology :) but I wrote out a small scenario and I'm fairly sure what I wrote down is correct as far as it working goes :)

-------
mywebsite.com (ip from NL)
true
true
redirect -> .eu

mywebsite.eu
false
-? (true)
- (no redirect)

true
false
- (no redirect)
----------

But I'm still sitting with the question;
does the second line from the 1st rule (the lookup; RewriteCond %{ENV:IP2LOCATION_COUNTRY_SHORT} ^(UK|NL|BE|FR|DE|ES|AT|CH|PT|IE|AD|MC|DK|IT|NO|SE|PL|CZ|SI|HU|HR|UA|RO|EE|LV|LT|RU|GR|BG|TR|IQ|IR|SA|EG|DZ|MA|TN|NE|FI|IS|PK|AF|IN)$ ) actually get executed when the first condition (RewriteCond %{HTTP_HOST} !^(www\.)?mywebsite\.eu [nc]) is false?

If yes how can you 'skip' a line being executed?

(I know this question is not valid for the 2nd rule, 2nd condition as the first condition will be true; Maybe that is what you mean with the terminology remark?)

It is actually quite interesting stuff. I've also been looking in to the cookie option and can do something like RewriteCond %{HTTP_COOKIE} servselect=1 (and just set cookie using php on the site)

But first I want to solve unnecessary lookup in the first rule (if I haven't already done that) then I'll have a look if I can solve the 2nd rule possible unnecessary look ups.

Thanks!

jdMorgan

12:09 am on Oct 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If the RewriteRule pattern does not match, then no RewriteConds are processed.

If a RewriteCond evaluates as false (taking into account the negation operator "!" if present), then no further RewriteConds will be processed and the RewriteRule will not be invoked, unless that RewriteCond has an [OR] flag on it, and is followed by a RewriteCond which does evaluate as true.
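Applied to a rule like yours, the order of evaluation can be annotated as follows. This is only an illustrative sketch with a shortened country list; the numbered comments restate the two paragraphs above.

```apache
# (2) Tested second, and only if the pattern in (1) matched.
#     If the host already is mywebsite.eu, this condition is false and processing
#     of this rule stops here: the condition in (3) is never evaluated.
RewriteCond %{HTTP_HOST} !^(www\.)?mywebsite\.eu [NC]

# (3) Tested third, and only if (2) was true.
RewriteCond %{ENV:IP2LOCATION_COUNTRY_SHORT} ^(UK|NL|BE)$

# (1) The RewriteRule pattern is tested first, before any RewriteCond.
# (4) The substitution is applied only if (1), (2) and (3) all succeeded.
RewriteRule ^(.*)$ http://www.mywebsite.eu/$1 [R=302,L]
```

So putting the cheap host test first does keep mod_rewrite from evaluating the country condition on requests that are already on the right host.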

ErrorDocument is processed by the Apache core, and its syntax is completely different from that of mod_rewrite. The only valid ErrorDocument targets are fixed local filepaths (e.g. "/error-pages/404-page.html") or URLs (e.g. "http://example.com/error-pages/404-error.php"). This function does not support variables.

Understand that in most cases, Apache modules were written and contributed by separate authors (or teams of authors) and each uses the simplest, most-efficient syntax appropriate to the job it is intended to do. "Apache" is not a flexible scripting language, and there are no 'standards' among modules.

Note that if a URL is used as an errordocument, the server will return 302-Redirect status to the client, and not a 404-Not Found response. This can completely trash your search engine rankings and although I don't like to be bossy, I will just say "Do not use a URL in your ErrorDocument directives" until I have time to write a book describing all of the problems it can cause....
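For contrast, the two forms look like this. The /404.php path is taken from your own directive; the hostname in the commented-out line is just your example domain.

```apache
# Local URL-path: Apache serves the page internally and the client still receives a 404 status.
ErrorDocument 404 /404.php

# Full URL (avoid): Apache redirects the client to this address,
# so the original 404 status is lost.
# ErrorDocument 404 http://www.mywebsite.com/404.php
```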

Please find, read, and be sure you understand the Apache documentation for each directive you wish to use; If you don't do this today, then you will very likely regret it tomorrow (or more likely, next year), because these are server configuration directives, and even tiny errors in server configuration can make huge problems in search engine rankings and site usability.

Jim

Salami1_1

12:45 am on Oct 21, 2009 (gmt 0)

10+ Year Member



Hi Jim,

Thanks a lot for all that info; I did not know that at all. I think I started using full URLs because sometimes, when someone triggered a 404 in a subdirectory, it did not even find the 404.php document. But I will definitely stop doing that. I will also have a closer look at the Apache documentation; it will probably be very handy for both current and future purposes.

Maybe you should consider making this thread a sticky or something. I think your answers to my questions are quite useful for a lot of webmasters who have to deal with this stuff and aren't yet fully up to speed.

Anyway thanks a lot again!

Best regards,

jdMorgan

1:41 am on Oct 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To fix the Not Found error on the 404 Error page itself, simply start the error page's URL-path with a slash to make it relative to the server root:

ErrorDocument 404 /path-to-404-page.html

All objects included on that page (e.g. images) should also use server-relative or canonical URLs.

Jim