Subdomain Redirects work, but not with wildcard SSL

Wild Card SSL and mod_rewrite redirects

         

anawaz

10:44 pm on Apr 1, 2010 (gmt 0)

10+ Year Member



OK Folks, I've got a bit of a hairy problem, so I'm going to lay it out. Basically, I'm trying to accomplish what 37signals does with their SaaS applications, be it Basecamp or whatever.

Here is my scenario, followed by what I have in my .htaccess file.

On my server, we've got ONE domain only: EXAMPLE.COM. There are 3 subdomains configured on example.com; let's call them subdomain1, subdomain2, and subdomain3. These reside in example.com/subdomain1, example.com/subdomain2 and example.com/subdomain3. I also have wildcard DNS set up, so anything that is not subdomain1, subdomain2, or subdomain3, but is passed as subx.example.com, simply gets passed to example.com/handler/handle.php?sd=subx, which then executes a script, etc. based on the variable passed as the subdomain. Here is the .htaccess code I have for this and it works flawlessly:


# Extract the subdomain part of example.com
RewriteCond %{HTTP_HOST} ^([^\.]+)\.example\.com$ [NC]

# Check that the subdomain part is not www or one of the three pre-configured subdomains
RewriteCond %1 !^(www|subdomain1|subdomain2|subdomain3)$ [NC]

# Proxy all remaining requests to a PHP script, passing the subdomain as an argument
RewriteRule ^.*$ http://www.example.com/handler/handle.php?sd=%1 [P,L]


I was absolutely delighted that I got this to work, which meant that every time we added a new user, they could simply access their account using their 'subdomain'. But it got a little complicated after I installed a wildcard SSL certificate: now if I go to [subdomain1.example.com,...] [subdomain2.example.com...] or [subdomain3.example.com,...] I land on example.com, but I end up in the right place if I go to [subx.example.com....]

Now I reckon I can convert all the subdomains to https too if I added the following code to what I have above:

# Extract the subdomain part of example.com on SSL and otherwise
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{HTTP_HOST} ^([^\.]+)\.example\.com$ [NC]


I don't think that will work, but if I go at it for an hour or 2 I'm sure I can come up with something that will. But what I need is a bit more complex and beyond my mod_rewrite skill set.

Basically, if I end up in the folders payment or account in ANY of the subdomains, be it subdomain1.example.com/payment or subx.example.com/account, I want the URL to automatically change to https. Now I would typically put a .htaccess file in each of the folders where I want the URL to change to https and add the following code:
RewriteCond %{SERVER_PORT} 80
RewriteCond %{REQUEST_URI} payment
RewriteRule ^(.*)$ https://www.example.com/payment/$1 [R,L]


Now I can use this in the case of subdomain1, subdomain2 and subdomain3, but how on earth would I get this to work with all those subdomains I am passing dynamically!?

Does anyone have any ideas? I know the idea here is to help correct the code and not write it, but the problem is I'm not sure where to start, because as it stands right now, my subdomain1.subdomain1.example.com resolves to (not redirects, but resolves) [example.com...] and I haven't quite worked that out yet either.

Thank you!

[edited by: jdMorgan at 12:05 am (utc) on Apr 2, 2010]
[edit reason] example.com [/edit]

g1smd

11:31 pm on Apr 1, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In all of this how does a request for example.com/robots.txt or for any searchengine verification files get serviced? Does your script return the file contents?

It's important to fully define the requirements before starting any coding.

jdMorgan

12:21 am on Apr 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The "perform proxy through-put to script" is basically a mistake, since it invokes an additional HTTP request *from* this server to... itself? That's rather wasteful of server resources.

If this script is local --in this same filespace-- I would suggest changing the RewriteRule syntax from an external reverse-proxy-through-put to an internal rewrite by removing the [P] flag and specifying the local script filepath (relative to DocumentRoot) rather than a URL.
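
A minimal sketch of that change, assuming handle.php really does live at /handler/handle.php under this same DocumentRoot (an assumption based on the URL in the original rule). The extra REQUEST_URI condition is not in the original code; it is there because an internal rewrite in .htaccess is re-processed, and without it the rewritten request would match the rule again:

```apache
# Do not rewrite requests that already target the handler script (prevents a rewrite loop)
RewriteCond %{REQUEST_URI} !^/handler/handle\.php$
# Extract the subdomain part of example.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
# Skip www and the three pre-configured subdomains
RewriteCond %1 !^(www|subdomain1|subdomain2|subdomain3)$ [NC]
# Internal rewrite to the local script: no [P] flag, no scheme or hostname
RewriteRule ^ /handler/handle.php?sd=%1 [L]
```

Because no second HTTP request is made, the script runs under the originally requested hostname, so any code in handle.php that inspects the Host header may need checking.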

I would suspect that your subdomain1, subdomain2, and subdomain3 are being 'handled' specially somehow, but I don't see the code. The fact that you needed to exclude them from one of your rules is a further indication that this code exists... but it's a no-show here. I doubt you'll get an answer to that question until that issue is cleared up.

The SSL/non-SSL issue should really wait until your primary "subdomain" problem is cleared up. It's a simple matter of redirecting "account" and "payment" requests to HTTPS if they're made using HTTP, and redirecting requests for all other 'page' URLs to HTTP if they're made using HTTPS. Note that images, CSS, and JS files --objects which may be 'shared' between SSL and non-SSL pages-- should be exempted from these redirects, and that further the linking from non-SSL pages to SSL pages and vice-versa should consistently specify the correct protocol for the target pages: Do not rely on redirection to 'correct' your on-page linking or your search listings and rankings will suffer, and your 'user experience' will be "non-optimal" (a somewhat excessively-polite term for what I'm thinking...)

Let's take one issue at a time here, so as not to go around and around in a confused state. How are the 'pre-defined' subdomains resolved to file paths in your current set-up? Do you have additional rules to do this, or did you 'configure them' with a Control Panel, or what?

Jim

anawaz

8:32 am on Apr 2, 2010 (gmt 0)

10+ Year Member



Hi Jim, g1smd, thanks for your responses.

I haven't even arrived at the robots.txt issue yet, I was just trying to get things to work.

subdomain1, subdomain2 and subdomain3 are actually setup using cPanel. So there are entries in my httpd.conf that specify what needs to be done with those subdomains before there is a wildcard entry and an entry for SSL in httpd.conf.

Jim, I'm with you that the linking between SSL and non-SSL pages should not be left to redirects. The images too, I can deal with inside my php code since I know which pages need to serve https and which need to serve http. So for all practical purposes, whenever somebody has to go into the payment folder, any script that takes them there will always serve https; we would just need to control this via http access in case somebody adventurous wants to see what happens if they remove the s.

It may help to know that I'm running a dedicated box with cPanel simply because cPanel makes PCI compliance and managing updates, etc. on the server much simpler. Besides, my command line skills are, well, negligible, and I didn't want to go back to our data server admins with configuration changes all the time.

The fact is, Jim, that ever since I did implement this code, the page load time has tripled! I suspect that is because of wasteful use of resources, so thanks for pointing that out too!

I hope this helps and will look for some enlightenment!

anawaz

8:40 am on Apr 2, 2010 (gmt 0)

10+ Year Member



There is perhaps another workaround we could use (since we are still developing) if there are too many search engine ramifications for generating subdomains for each customer. We want each customer's account to get listed in search engines, and if we were to have a discussion solely on that, would people's accounts be better ranked if they were sitting on domain.com/customer or customer.domain.com? Or is that irrelevant?

jdMorgan

1:56 pm on Apr 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The position of the customer name in the URL is practically irrelevant with regard to search ranking, since other factors such as on-page titles and descriptions and the relevance of phrases appearing in on-page text and as link-text in inbound links *to* these pages are far more important. The subdomain approach seems more attractive to me based solely on 'branding' considerations. Also, it reveals far less about the file structure that you use to implement these sites, and that appeals more to me for several reasons, both aesthetic and technical.

Any subdomains declared with the typical cPanel set-up will be 'invisible' to this mod_rewrite code. Because of the way that cPanel 'creates' subfolders for "add-on" (sub)domains, requests for those (sub)domains are rewritten by the cPanel-created code directly to their corresponding subfolders, and by-pass any code in your "main domain's" root .htaccess file.

Therefore, any access controls, canonicalization redirects, and other 'special handling' that you want to apply to these cPanel subdomains will need to be duplicated in these subdomains' own .htaccess files. In addition, you will likely have problems in trying to 'share' server-side scripts and other objects between the cPanel subdomains and main domain and other subdomains -- As far as cPanel subdomains are concerned, they exist in their own 'private' filespace, and cannot reference anything above their own 'home directory' unless you declare Aliases and ScriptAliases in the server configuration to support this. Otherwise, these scripts and other resources will also have to be duplicated within each cPanel-created subdomain's filespace.

In general, I find it far easier to ditch cPanel for multiple domain/subdomain servers, and instead go with an IP-address-based virtual server using domain-and-subdomain-to-subfolder rewriting code of my own design. Many hosts provide this service level for about one dollar more per month over 'standard' name-based shared hosting, and call it a "unique IP address" or "private IP address" or some-such thing.

The key is that it is an IP-address-based virtual server as opposed to a name-based virtual server, and therefore, you can point absolutely any domain name or subdomain to the server using DNS -- The server configuration (e.g. cPanel) does not need to 'know' any of the hostnames, and *all* hostname requests arriving at the server's IP address will 'land' in the top-level .htaccess file, allowing you to do anything you like with them. Also, the issues of sharing scripts and other objects between the domains/subdomains goes away; the URL-to-filespace mapping is entirely within your control, and all in one spot.

Once you've "crossed the line" from trivial cPanel configurations into domain/subdomain rewriting (as you have already done here), this IP-based virtual server setup is *much* simpler to configure, use, and maintain (admittedly just my opinion).

Back on track, the general form for HTTPS<->HTTP redirection is something like this:

# Redirect HTTP requests for secure pages to HTTPS
RewriteCond %{SERVER_PORT} !=443
RewriteCond %{HTTP_HOST} .
RewriteRule ^(secure-URL-path1|secure-URL-path2|secure-URL-path3)$ https://%{HTTP_HOST}/$1 [R=301,L]
#
# Redirect HTTPS requests for non-secure pages to HTTP
RewriteCond %{SERVER_PORT} =443
RewriteCond %{HTTP_HOST} .
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js|etc-etc)$
RewriteCond $1 !^(secure-URL-path1|secure-URL-path2|secure-URL-path3)$
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1 [R=301,L]

Note the check for "HTTP_HOST not blank" -- This is not required on name-based virtual servers, but is critical on IP-based virtual servers. If there's a possibility that you may eventually move from the former to the latter, then include it as 'inexpensive future-proofing'. It prevents infinite redirection loops and/or invalid redirects if a request arrives at an IP-based server which does not contain an HTTP Host header, leaving HTTP_HOST blank.

Also note that 'shared objects' such as images, css files, and JavaScript files are excluded from the HTTPS-to-HTTP redirect, so that they *can* be 'shared' between secure and non-secure pages without throwing 'mixed secure/insecure content' warnings in browsers. The list shown in the RewriteCond pattern is just an example, which you will likely need to adjust and/or expand. The idea is implicit in the use of the word "pages" in each rule's comment line.

I've shown the "secure path" patterns as fully-anchored; These may also need to be adjusted by removing the end anchors or by adding wild-cards to the ends, depending on your exact needs.
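
As a sketch only, here is how that general form might look for the "account" and "payment" folders discussed in this thread, with the secure-path patterns opened up to match anything below those folders. The folder names come from the original question; the extension list is a placeholder to adjust. Because the redirect target is built from %{HTTP_HOST}, the same rules apply to whichever subdomain was requested, including dynamically-created ones:

```apache
# Redirect HTTP requests for /account and /payment (and anything below them) to HTTPS
RewriteCond %{SERVER_PORT} !=443
RewriteCond %{HTTP_HOST} .
RewriteRule ^((account|payment)(/.*)?)$ https://%{HTTP_HOST}/$1 [R=301,L]
#
# Redirect HTTPS requests for all other pages back to HTTP
RewriteCond %{SERVER_PORT} =443
RewriteCond %{HTTP_HOST} .
# Leave shared objects (images, CSS, JS) alone
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$
# Leave the secure folders alone
RewriteCond $1 !^(account|payment)(/|$)
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1 [R=301,L]
```

External redirects like these would normally be placed ahead of any internal rewrite or proxy rules, so that the client is redirected before the request is handed to a script.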

Jim

anawaz

6:42 pm on Apr 2, 2010 (gmt 0)

10+ Year Member



Thanks Jim, I'm with you (for the most part, I think).

Coming back to g1smd's question, though, if I set up subdomains the way I have, I really have no way for search engines to access the robots.txt file, for instance.

If accounts are added dynamically and subdomains then become accessible, I could pass the robots information directly into the handler.php file, but that doesn't seem like it would be the best way to do it, assuming we stick with cPanel for now?

g1smd

6:54 pm on Apr 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You could rewrite URL requests for /robots.txt to some other known location on the server and place the files there.
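
One possible shape for that rewrite, as a sketch: the path /robots/customer-robots.txt is hypothetical (any known location would do), and the host conditions are copied from the subdomain rule earlier in the thread so that only the dynamic subdomains are affected.

```apache
# Serve one shared robots.txt for every dynamic subdomain
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteCond %1 !^(www|subdomain1|subdomain2|subdomain3)$ [NC]
# /robots/customer-robots.txt is a hypothetical location; use whatever path holds the file
RewriteRule ^robots\.txt$ /robots/customer-robots.txt [L]
# Note: the catch-all subdomain rule must also skip this path, for example with
#   RewriteCond %{REQUEST_URI} !^/robots/
# otherwise the rewritten request is handed to the handler script on the next pass
```

This rule would need to sit above the catch-all subdomain rule in the .htaccess file.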

anawaz

7:02 pm on Apr 2, 2010 (gmt 0)

10+ Year Member



Agreed, but I was hoping for a classier solution. lol. Thanks, that's what I'll do.

anawaz

5:15 pm on Jun 4, 2010 (gmt 0)

10+ Year Member



OK, I'm going to open this up again, this time with a slightly more devious problem. Ignore the fact that I'm using (.*) for now - I will deal with that later.

For now, I've got subdomains that are working nicely.

This is my htaccess


# Extract the subdomain part of example.com
RewriteCond %{HTTP_HOST} ^([^\.]+)\.example\.com$ [NC]

# Check that the subdomain part is not www, ftp, mail or webmail
RewriteCond %1 !^(www|ftp|mail|webmail)$ [NC]

# Proxy all requests to a PHP script, passing the subdomain as an argument
RewriteRule ^(.*)/(.*)$ http://example.com/user/$1.php?subdomain=%1&id=$2 [P,L]


The above comment by jdMorgan about redirecting using the domain is duly noted, and I have the code to deal with that too. Now this code works great, whether I use ONE variable or 2 variables passed after the subdomain.

So, if I enter test.example.com/contact/ I get redirected to example.com/user/contact.php?subdomain=test.

and if I enter test.example.com/items/21 I get redirected to example.com/user/items.php?subdomain=test&id=21 which is also working fine.

However, if I just type in test.example.com or test.example.com/ I end up on example.com when I want to go to example.com/user/index.php?subdomain=test.

I've tried this for an empty variable to add after my last line in htaccess:

RewriteRule ^$ http://example.com/user/index.php?subdomain=%1 [P,L]


But I've had no luck. How would I tell it to do that if no variables are declared in the URL?

Thanks again for all the help!

jdMorgan

5:51 pm on Jun 4, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You need three rules, one for each case: no variables in the URL, one variable, and two variables.

For simplicity, you may just wish to duplicate the rule you have two more times, and adjust it, since you'll need to duplicate the rule's RewriteConds as well.
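
A sketch of what the three rules might look like, keeping the [P] flag and URLs from the code posted above. RewriteConds apply only to the single RewriteRule that follows them, which is why a bare "RewriteRule ^$" added after the existing rule has no %1 to work with. The tighter [^/]+ patterns in place of (.*) are an assumption about the intended URL shapes, and the one-variable rule assumes the scripts cope with a missing id parameter:

```apache
# Case 1: no variables (test.example.com or test.example.com/)
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteCond %1 !^(www|ftp|mail|webmail)$ [NC]
RewriteRule ^$ http://example.com/user/index.php?subdomain=%1 [P,L]

# Case 2: one variable (test.example.com/contact or test.example.com/contact/)
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteCond %1 !^(www|ftp|mail|webmail)$ [NC]
RewriteRule ^([^/]+)/?$ http://example.com/user/$1.php?subdomain=%1 [P,L]

# Case 3: two variables (test.example.com/items/21)
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteCond %1 !^(www|ftp|mail|webmail)$ [NC]
RewriteRule ^([^/]+)/([^/]+)/?$ http://example.com/user/$1.php?subdomain=%1&id=$2 [P,L]
```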

Jim