subdomains with htaccess

mod_rewrite htaccess subdomain

         

keuluu

10:58 am on Apr 14, 2009 (gmt 0)

10+ Year Member



Hi,

I'm facing a problem with URL rewriting subdomain techniques.
My goal is to access two parts of an application (public and admin) located in two subdirectories (pages/public and pages/admin/) with www.domain.com heading to pages/public and webadmin.domain.com heading to pages/admin/.

My hosting service pointed me towards:


RewriteEngine On
RewriteCond %{REQUEST_URI} !/pages/public
RewriteRule (.*) /pages/public/$1 [QSA,L]

This sends the public part to pages/public/ and works fine, but it seems to interfere with every other rule.

I tried to add rules for the admin part but could not get them to work. My latest attempts ended in either an infinite loop or an unreachable destination:


RewriteEngine On
RewriteCond %{REQUEST_URI} !/pages/public
RewriteCond %{REQUEST_URI} !/pages/admin
RewriteCond %{HTTP_HOST} !^webadmin\.domain\.com
RewriteCond $1 !^www/
RewriteRule (.*) /pages/public/$1 [QSA,L]

RewriteCond %{HTTP_HOST} ^webadmin\.domain\.com
RewriteCond $1 !^webadmin/
RewriteRule (.*) http://www.domain.com/pages/admin/$1 [QSA,S,L]

Thanks for any help.

jdMorgan

2:37 pm on Apr 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



[Rewrite requests for] www.example.com/ to /pages/public/ and [rewrite requests for] webadmin.example.com/ to /pages/admin/


RewriteEngine on
#
# Rewrite requests for www.example.com/xyz to /pages/public/xyz
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^(.*)$ /pages/public/$1 [L]
#
# Rewrite requests for webadmin.example.com/xyz to /pages/admin/xyz
RewriteCond %{HTTP_HOST} ^webadmin\.example\.com
RewriteCond $1 !^pages/admin/
RewriteRule ^(.*)$ /pages/admin/$1 [L]

In both rules, the second RewriteCond is a "loop stopper" which prevents requests from being repeatedly rewritten, adding "pages/public/" or "pages/admin/" to the requested path recursively.
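To see why the loop stopper is needed: in .htaccess context, a successful rewrite makes Apache run the rules again on the rewritten path. Here is a rough trace of one request through the www rule (xyz.html is just a made-up example path):

```apache
# Request: http://www.example.com/xyz.html   (xyz.html is a hypothetical file)
#
# Pass 1: $1 = "xyz.html"              -> does not start with pages/public/
#         rewritten to /pages/public/xyz.html, rule processing starts over
# Pass 2: $1 = "pages/public/xyz.html" -> the loop-stopper condition fails,
#         the rule is skipped and /pages/public/xyz.html is served
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^(.*)$ /pages/public/$1 [L]
```

Without the second RewriteCond, pass 2 would rewrite again to /pages/public/pages/public/xyz.html, and so on until the server gives up with an error.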

Once you get this working, you should add rules so that direct access to /pages/admin and /pages/public is not possible. These requests should be redirected back to the root of the correct subdomain. Don't worry about that until you get the two rules above working.

Jim

keuluu

3:11 pm on Apr 14, 2009 (gmt 0)

10+ Year Member



Thanks a lot,

I had this working in the meantime :


RewriteCond %{HTTP_HOST} ^webadmin.example.com$
RewriteCond %{REQUEST_URI} !^/pages/admin
RewriteRule ^(.*)$ pages/admin/$1 [L]

RewriteCond %{HTTP_HOST} ^www.example.com$
RewriteCond %{REQUEST_URI} !^/pages/public
RewriteRule ^(.*)$ pages/public/$1 [L]

which seems to do the same job.
Is it correct?

I also could force www.example.com instead of example.com with :


RewriteCond %{HTTP_HOST} ^example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

Is this the correct approach ?
Is R=301 necessary/recommended ?
And what would be the solution to restrict access to public and admin subdirectories ?

thanks again.

jdMorgan

5:09 pm on Apr 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



1) Escape the literal periods in regex patterns as shown in my example. Otherwise, a period in a pattern means "match any single character."

2) Do not end-anchor your positive-match hostnames with "$", or your rules will be defeated by valid but "non-standard" requests like FQDN requests or requests with a port number appended, e.g. www.example.com./page or www.example.com:80/page or even www.example.com.:80/page.

3) For the same reason, I suggest improving your hostname canonicalization by using two rules:


# Canonicalize variant webadmin hostname requests (e.g. www.webadmin.www.example.com.:80)
RewriteCond %{HTTP_HOST} ^([^.]*\.)*webadmin\.([^.]*\.)*example\.com
RewriteCond %{HTTP_HOST} !^webadmin\.example\.com$
RewriteRule ^(.*)$ http://webadmin.example.com/$1 [R=301,L]
#
# Canonicalize all other variant hostname requests (non-www or bogus subdomains)
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond %{HTTP_HOST} !^webadmin\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

These two rules redirect to the canonical hostname unless the requested hostname is *exactly* webadmin.example.com or www.example.com respectively. Rule order is important here.

4) Put the domain canonicalization redirect(s) before the internal rewrites, to avoid having an external redirect expose the internal filepath to the clients (browsers, robots, etc.)
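As an illustration of points 1 and 2, here is how the hostname condition behaves in its different forms. The hostnames in the comments are only examples, and the commented-out conditions are the variants to avoid:

```apache
# Unescaped dots: "." matches any character, so this pattern would also
# accept a Host header such as "wwwXexample.com"
#   RewriteCond %{HTTP_HOST} ^www.example.com$
#
# End-anchored with "$": fails for "www.example.com." and "www.example.com:80"
#   RewriteCond %{HTTP_HOST} ^www\.example\.com$
#
# Escaped dots, start-anchored only: matches www.example.com,
# www.example.com., www.example.com:80 and www.example.com.:80
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^(.*)$ /pages/public/$1 [L]
```

The end anchor is still appropriate in the negative ("!") conditions of the canonicalization rules, because there the goal is to catch everything that is not exactly the canonical hostname.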

A comment: In regular expressions and mod_rewrite, every single character matters. Anything less than absolutely-perfectly-correct can and will likely have unintended side-effects. The code is typically small -- but it's very powerful. One single typo can (if you are lucky) take down your server immediately. If you are unlucky, a little typo or omission can slowly erode your search engine rankings over time -- or open up your site to malicious exploitation. For this reason, every character I typed above was intentional and justified.

Jim

[edited by: jdMorgan at 5:21 pm (utc) on April 14, 2009]

jdMorgan

5:19 pm on Apr 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Forgot to answer your last question:

# Canonicalize direct client requests for the /pages/admin subdirectory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /pages/admin(/[^\ ]*)?\ HTTP/
RewriteRule ^pages/admin(/.*)?$ http://webadmin.example.com$1 [R=301,L]
#
# Canonicalize direct client requests for the /pages/public subdirectory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /pages/public(/[^\ ]*)?\ HTTP/
RewriteRule ^pages/public(/.*)?$ http://www.example.com$1 [R=301,L]

Put these two rules ahead of the hostname canonicalization rules.

Checking THE_REQUEST (the HTTP request as received from the client) is mandatory in order to avoid interaction with your existing subdirectory rewrites. Without this check, the two sets of rules would interact, causing an 'infinite' rewrite/redirect loop.
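THE_REQUEST holds the original request line and is not changed by internal rewrites, which is what lets a rule tell a client request apart from a rewritten one. Here is a rough sketch, assuming the subdirectories are /pages/admin and /pages/public as described in the first post (login.php is a made-up file name):

```apache
# Client requests http://webadmin.example.com/login.php
#   THE_REQUEST = "GET /login.php HTTP/1.1"
#   Internal rewrite -> /pages/admin/login.php; the rules run again, but
#   THE_REQUEST is still "GET /login.php HTTP/1.1", so no redirect happens.
#
# Client requests http://www.example.com/pages/admin/login.php directly
#   THE_REQUEST = "GET /pages/admin/login.php HTTP/1.1"
#   The condition matches and the client is redirected to
#   http://webadmin.example.com/login.php
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /pages/admin(/[^\ ]*)?\ HTTP/
RewriteRule ^pages/admin(/.*)?$ http://webadmin.example.com$1 [R=301,L]
```

Note that the captured group already starts with a slash, so the substitution has no slash between the hostname and $1; adding one would produce a double slash in the redirect target.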

Jim

[edited by: jdMorgan at 5:22 pm (utc) on April 14, 2009]

keuluu

5:54 pm on Apr 14, 2009 (gmt 0)

10+ Year Member



Thanks again for your explanations and time...

So my final set of rules looks like this:


# Canonicalize direct client requests for admin subdirectory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /admin(/[^\ ]*)?\ HTTP/
RewriteRule ^admin(/.*)?$ http://webadmin.example.com/$1 [R=301,L]

# Canonicalize direct client requests for public subdirectory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /public(/[^\ ]*)?\ HTTP/
RewriteRule ^public(/.*)?$ http://www.example.com/$1 [R=301,L]

# Force example.com to www.example.com
RewriteCond %{HTTP_HOST} ^example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Rewrite requests for webadmin.example.com/xyz to /pages/admin/xyz
RewriteCond %{HTTP_HOST} ^webadmin\.example\.com
RewriteCond $1 !^pages/admin/
RewriteRule ^(.*)$ /pages/admin/$1 [L]

# Canonicalize variant webadmin hostname requests (e.g. www.webadmin.www.example.com.:80)
RewriteCond %{HTTP_HOST} ^([^.]*\.)*webadmin\.([^.]*\.)*example\.com
RewriteCond %{HTTP_HOST} !^webadmin\.example\.com$
RewriteRule ^(.*)$ http://webadmin.example.com/$1 [R=301,L]

# Canonicalize all other variant hostname requests (non-www or bogus subdomains)
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond %{HTTP_HOST} !^webadmin\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

but I can still call www.example.com/pages/public/ and webadmin.example.com/pages/admin/ with no change...
And wrong.example.com isn't forced to www.example.com either (that's what rule nbr 6 is supposed to handle isn't it ?)
maybe I'm wrong with rule priorities...

jdMorgan

6:14 pm on Apr 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Delete rule #3. It is no longer needed, since the final two rules (in your post) replace it.

Move the final two (external redirect) rules ahead of the /pages/admin (internal rewrite) rule.

What happened to the /pages/public internal rewrite rule? It's gone...

Remember, all external redirects (with a protocol and domain name in the RewriteRule substitution and/or a [R=301] flag on the rule) go first, followed by all internal rewrites. Within those two groups put your rules in order from most-specific conditions and regex patterns (fewest URL-paths affected) to least specific pattern (most URL-paths affected). This will prevent unexpected results and "stacked" or "chained" multiple redirects.

Jim

keuluu

8:53 pm on Apr 14, 2009 (gmt 0)

10+ Year Member



OK so it finally looks like :

# Canonicalize variant webadmin hostname requests (e.g. www.webadmin.www.example.com.:80)
RewriteCond %{HTTP_HOST} ^([^.]*\.)*webadmin\.([^.]*\.)*example\.com
RewriteCond %{HTTP_HOST} !^webadmin\.example\.com$
RewriteRule ^(.*)$ http://webadmin.example.com/$1 [R=301,L]

# Canonicalize all other variant hostname requests (non-www or bogus subdomains)
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond %{HTTP_HOST} !^webadmin\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Canonicalize direct client requests for admin subdirectory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /admin(/[^\ ]*)?\ HTTP/
RewriteRule ^admin(/.*)?$ http://webadmin.example.com/$1 [R=301,L]

# Canonicalize direct client requests for public subdirectory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /public(/[^\ ]*)?\ HTTP/
RewriteRule ^public(/.*)?$ http://www.example.com/$1 [R=301,L]

# Rewrite requests for webadmin.example.com/xyz to /pages/admin/xyz
RewriteCond %{HTTP_HOST} ^webadmin\.example\.com
RewriteCond $1 !^pages/admin/
RewriteRule ^(.*)$ /pages/admin/$1 [L]

# Rewrite requests for www.example.com/xyz to /pages/public/xyz
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^(.*)$ /pages/public/$1 [L]

but I can't get either wrong.example.com to be transformed into www.example.com nor pages/public and pages/admin/ to be restricted....

Anyway, my last question: is it possible to rewrite clean URLs to a PHP page with GET variables, such as:


RewriteRule ^/projects/([0-9]+)/?$ index.php?v=projects&type=$1
RewriteRule ^/projects/?$ index.php?v=projects [L,QSA]

(type being a number)

These last rules look right to me, but they don't work.
Could the earlier rules be preventing them from working?
According to your tips, I should put them at the bottom, after all the other rules. Is that correct?

Thanks again Jim

jdMorgan

1:59 am on Apr 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> but I can't get either wrong.example.com to be transformed into www.example.com
wrong.example.com must be defined in your DNS zone file -- either explicitly or by wild-card subdomain. If the domain isn't defined, then the requests won't even be sent to your server.

> nor pages/public and pages/admin/ to be restricted....
Be sure to completely flush your browser cache before testing any changes to any server-side code.

Your new rules won't work for two reasons: First, URL-path patterns in RewriteRules in .htaccess files will never start with a slash. Second, the rules above your new rules will rewrite the URLs before your last two rules are processed. Your new rules will need to do everything that the two previous internal RewriteRules do, plus the new URL-to-query-string rewrite. In other words, rewrite projects/<numbers>/ directly to /pages/public/index.php?v=projects&type=<numbers> all at once. Then put these two new rules ahead of the preceding two internal rewrite rules.
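Here is a minimal sketch of what that could look like for the two rules posted above. It is untested against this particular setup, and it assumes index.php lives in /pages/public/:

```apache
# No leading slash in the pattern (.htaccess context), and the target already
# includes /pages/public/, so the generic www rewrite has nothing left to do.
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteRule ^projects/([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1 [QSA,L]
#
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteRule ^projects/?$ /pages/public/index.php?v=projects [QSA,L]
```

When the rules run again on the rewritten path pages/public/index.php, neither pattern matches any more, and the loop stopper on the generic www rule leaves the path alone.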

Jim

keuluu

8:03 am on Apr 15, 2009 (gmt 0)

10+ Year Member



something like :

RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^projects/([0-9]+)/?([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1&id=$2 [L,QSA]
RewriteRule ^projects/([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1 [L,QSA]
RewriteRule ^projects/?$ /pages/public/index.php?v=projects [L,QSA]
RewriteRule ^(.*)$ /pages/public/$1 [L,QSA]

or do I have to repeat the rewriteCond statement before each rewriteRule ?
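For reference, RewriteCond directives apply only to the single RewriteRule that immediately follows them, so in the block above only the first rule is restricted to www.example.com. Here is a sketch with the conditions repeated, trimmed to two rules for brevity:

```apache
# This condition guards ONLY the rule directly below it. The loop stopper
# is not needed here: the pattern can only match paths starting with projects/
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteRule ^projects/([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1 [L,QSA]
#
# Without its own conditions, this catch-all rule would also run for
# webadmin.example.com requests and would loop on its own output
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^(.*)$ /pages/public/$1 [L,QSA]
```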

keuluu

8:20 am on Apr 15, 2009 (gmt 0)

10+ Year Member



It works !
The code finally looks like :

# Canonicalize variant webadmin hostname requests (e.g. www.webadmin.www.example.com.:80)
RewriteCond %{HTTP_HOST} ^([^.]*\.)*webadmin\.([^.]*\.)*example\.com
RewriteCond %{HTTP_HOST} !^webadmin\.example\.com$
RewriteRule ^(.*)$ http://webadmin.example.com/$1 [R=301,L]

# Canonicalize all other variant hostname requests (non-www or bogus subdomains)
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond %{HTTP_HOST} !^webadmin\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Canonicalize direct client requests for admin subdirectory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /admin(/[^\ ]*)?\ HTTP/
RewriteRule ^admin(/.*)?$ http://webadmin.example.com/$1 [R=301,L]

# Canonicalize direct client requests for public subdirectory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /public(/[^\ ]*)?\ HTTP/
RewriteRule ^public(/.*)?$ http://www.example.com/$1 [R=301,L]

# Rewrite requests for webadmin.example.com/xyz to /pages/admin/xyz
RewriteCond %{HTTP_HOST} ^webadmin\.example\.com
RewriteCond $1 !^pages/admin/
RewriteRule ^(.*)$ /pages/admin/$1 [L]

# Rewrite requests for www.example.com/xyz to /pages/public/xyz and transform query string to /a/b/c/
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^projects/([0-9]+)/?([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1&id=$2 [L,QSA]

RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^projects/([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1 [L,QSA]

RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^projects/?$ /pages/public/index.php?v=projects [L,QSA]

RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^(.*)$ /pages/public/$1 [L,QSA]

I can still access pages/public/ and pages/admin/, even with the cache disabled, but the main thing works well.

Thanks again a lot for your help Jim.

keuluu

8:24 am on Apr 15, 2009 (gmt 0)

10+ Year Member



I need to understand...
> wrong.example.com must be defined in your DNS zone file -- either explicitly or by wild-card subdomain. If the domain isn't defined, then the requests won't even be sent to your server.

OK, but what is the purpose of rule nbr 2 then ? (non-www or bogus subdomains). Isn't it supposed to handle such a case anyway ?

jdMorgan

1:18 pm on Apr 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The DNS zone file must define any and all domains and subdomains associated with your server. If a domain is not defined in the DNS zone file, then the browser will not be able to look up the IP address of your server, and will fail before sending the HTTP request. Therefore, you can say that "that domain does not exist" if there is no A record or CNAME matching that domain in DNS.

Also, this comment is backward and inaccurate. While it may not seem important, it may confuse others who look at your code in the future:

 # Rewrite requests for www.example.com/xyz to /pages/public/xyz and transform query string to /a/b/c/ 

This should read:

# Rewrite requests for www.example.com/projects/a/b/ to /pages/public/index.php?v=projects&type=a&id=b, and
# rewrite requests for www.example.com/projects/123/ to /pages/public/index.php?v=projects&type=12&id=3
# (a and b are numbers; a three-segment path such as /projects/a/b/c/ is not matched by this rule)

since that is what the code actually does. I did not include the variations with no trailing slash, but at least this makes the behaviour clear, and points out the possibly-unexpected behaviour in the second case.

Jim

keuluu

9:25 pm on Apr 15, 2009 (gmt 0)

10+ Year Member



I'll work my way through all this...
thanks Jim

Luc

jdMorgan

9:51 pm on Apr 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Based on the fact that you have a rule for single-parameter URLs immediately following this rule, I'd suggest making the first slash non-optional, and changing the comment to reflect the change:

# Rewrite requests for www.example.com/projects/a/b/ to /pages/public/index.php?v=projects&type=a&id=b
# (a and b are numbers; the slash between them is now required)
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteCond $1 !^pages/public/
RewriteRule ^projects/([0-9]+)/([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1&id=$2 [L,QSA]

Any requested URL that has fewer than two "parameters" in it will then fall through to the next rule, and single-parameter URLs like www.example.com/projects/abc/ will then be rewritten to /pages/public/index.php?v=projects&type=abc (without an "id=" parameter)
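Here are the two rules side by side, as a sketch of how requests are then divided between them (the numbers in the comments are arbitrary examples):

```apache
# Two numeric segments: /projects/12/34/ -> type=12&id=34
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteRule ^projects/([0-9]+)/([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1&id=$2 [L,QSA]
#
# One numeric segment: /projects/123/ cannot match the rule above, because the
# slash between the two groups is required, so it reaches this rule -> type=123
RewriteCond %{HTTP_HOST} ^www\.example\.com
RewriteRule ^projects/([0-9]+)/?$ /pages/public/index.php?v=projects&type=$1 [L,QSA]
```

With the optional slash of the earlier version, /projects/123/ would have matched the first rule instead and been split into type=12 and id=3.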

Jim

keuluu

10:24 pm on Apr 15, 2009 (gmt 0)

10+ Year Member



Right... but what would be the benefit ?
faster processing of the regexp ?

Luc

jdMorgan

11:01 pm on Apr 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The benefit is correct operation. As you have it coded now, the rule that follows it will only run for single-digit numbers; anything longer is caught by the first rule and split into type and id. If that does not bother you, then why did you add the second rule?

Jim

keuluu

8:47 am on Apr 16, 2009 (gmt 0)

10+ Year Member



Right as well...

luc