I am trying to set up multiple domains on my host, each domain's files being in a subfolder. But I don't want direct access to those subfolders to work. For now, the htaccess file at the root works, but I'm still able to get to the files at the subfolder, and that's what I cannot get to work without running into the "too many redirects" problem.
The Setup
www.site1.com has its files located at www.site1.com/site1/. URLs pointing to www.site1.com/page.php are successfully rewritten to point to www.site1.com/site1/page.php with the URL NOT showing the site1 folder - this is correct.
Problem
Files located at www.site1.com/site1/page.php can still be accessed like that. I would like www.site1.com/site1/page.php to be redirected (the URL to change) to www.site1.com/page.php, but the code I've tried draws me to an infinite loop of redirects.
The code I tried was the one Jim posted a few posts below, with the 'example.com' site. It works in my root to point things to the subfolder, but I don't know how to point things away from the subfolder.
I am a relative n00b at htaccess/rewrite - I've read a LOT about it but I don't fully understand it yet. Thanks for your help, everybody!
It also makes fixing your current problem easy, in that any request for this common name can be detected and externally redirected back to the correct domain root.
So the whole "package" is a domain to sites-subdirectory-path internal rewrite, and a client-request-for-sites-subdirectory-URL-path back to site-root-domain-URL-path external redirect.
The only real trick is preventing an infinite rewrite-redirect loop. This can be done in several ways, but one method is to examine the server variable %{THE_REQUEST}. Only in the case where this variable indicates that the request for /sites/x is coming direct from the client, and is not the result of previously executing your internal rewrite, do you want to redirect.
You will probably also want to remove FQDN-format trailing periods and port numbers. Otherwise, the requests will be rewritten to non-existent subdirectories. I also strongly suggest that you standardize on www or non-www hostnames, unless you want to support two different subdirectories per domain, one for www and another for non-www. If you do standardize, then enforce that standardization by 301-redirecting non-canonical hostname requests to the canonical domains.
The following code maps arbitrary hostnames to same-named subdirectories of the /sites/ subdirectory, and redirects direct client requests for that subdirectory back to the appropriate domain.
# Redirect direct client requests for /sites/ subdirectories back to domain
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /sites/[^/\ ]+/[^\ ]*\ HTTP/
RewriteRule ^sites/([^/]+/.*)$ http://$1 [R=301,L]
#
# Redirect to remove trailing period from FQDN-format hostnames and remove port numbers if present
RewriteCond %{HTTP_HOST} ^([^.:]+(\.[^.:]+)+)(\.|\.?:[0-9]+)$
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
#
# Rewrite hostname requests to appropriate /sites/ subdirectory
RewriteCond $1 !^sites/
RewriteCond %{HTTP_HOST} ^([^.:]+(\.[^.:]+)+)$
RewriteRule ^(.*)$ /sites/%1/$1 [L]
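For the www/non-www standardization suggested above, here is a minimal sketch. It assumes the www form is canonical for every hosted domain, which is a choice made only for illustration and not something settled in this thread. It would sit before the internal rewrite, so the redirect happens first and the subdirectories are named www.example.com and so on.

```apache
# Assumption: www.<domain> is the canonical hostname for every hosted domain.
# Redirect any non-www hostname to its www form, keeping the requested path.
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{HTTP_HOST} ^([^.:]+(\.[^.:]+)+)$
RewriteRule ^(.*)$ http://www.%1/$1 [R=301,L]
```

If non-www is preferred instead, the same idea applies in reverse: match a leading www\. and redirect to the remainder of the hostname.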
Jim
[edited by: jdMorgan at 1:52 am (utc) on Aug. 29, 2009]
Can you help me understand what the rules do? I know a little bit about regular expressions but these are a little confusing.
1. If THE_REQUEST is an actual http request from the client (and not part of the loop), prefixed by anything alpha plus sites plus (things I assume are part of the full request path), then rewrite it to the sites/[domain name] and treat the domain name as the redirect.
2. I understand the concept of this one, I don't get the execution. If the HTTP_HOST contains dots and port numbers.. then do what? Why %1 and then $1?
3. If (what is $1 here?) does not start with sites and if the host has periods and ports.... then... send the request to sites/[domain name without the ports]/fileinput
Once I set that up in the root, do I need anything in the subfolders to block access? May I set up other rules in the subfolders, say, when needing to redirect a certain old filename to new filename, and set error documents?
Because the RewriteRule pattern must match before any RewriteConds are executed, $1 in the third rule above is the localized URL-path examined by the RewriteRule.
THE_REQUEST is the entire request line as received from the client, and is exactly what you see logged in your raw server access log, e.g.
GET /sites/site1/page.html HTTP/1.1
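To make the THE_REQUEST condition easier to read, here is the same pattern from the first rule broken into pieces against that request line. The comments are an informal reading of the regular expression, offered as a sketch rather than a definitive description.

```apache
# Request line as sent by the client:
#   GET /sites/site1/page.html HTTP/1.1
#
#   ^[A-Z]+     the method, e.g. GET or POST
#   \           a literal space (escaped so it is not read as an argument separator)
#   /sites/     the subdirectory prefix, as typed by the client
#   [^/\ ]+/    one path segment (the per-site folder) and its trailing slash
#   [^\ ]*      the rest of the URL-path, up to the next space
#   \ HTTP/     a space, then the start of the protocol version
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /sites/[^/\ ]+/[^\ ]*\ HTTP/
RewriteRule ^sites/([^/]+/.*)$ http://$1 [R=301,L]
```

After an internal rewrite the URL-path seen by the RewriteRule changes, but THE_REQUEST still holds what the client originally sent, which is why it can tell the two cases apart.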
Any period or port number appended to the HTTP_HOST gets dropped, although there was a bug in that rule (now fixed to prevent propagation of bad code).
.htaccess is a per-directory config file, and you may use as many as you like.
Parentheses in regular-expression patterns can be nested. To determine back-reference numbers (i.e. $1-$9 or %1-%9), count left parentheses.
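As an illustration of counting left parentheses, here is the hostname rule annotated with its group numbers. The sample hostname www.example.com:8080 is made up for the example.

```apache
# Counting left parentheses in the RewriteCond pattern:
#   ^([^.:]+(\.[^.:]+)+)(\.|\.?:[0-9]+)$
#    1      2           3
# For the sample host "www.example.com:8080":
#   %1 = "www.example.com"  (hostname without trailing period or port)
#   %2 = ".com"             (last repetition of the inner group)
#   %3 = ":8080"            (the part being dropped)
RewriteCond %{HTTP_HOST} ^([^.:]+(\.[^.:]+)+)(\.|\.?:[0-9]+)$
# %1 comes from the last matched RewriteCond, $1 from the RewriteRule pattern.
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```

So the substitution http://%1/$1 rebuilds the URL from the cleaned hostname (%1) plus the originally requested path ($1).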
Jim
I am still confused. :/
I don't particularly want to do the "sites/" directory right now. At least, not until I'm comfortable with it. I don't mind, for now, writing two sets of these rules while I learn it. But I don't know what to change.
I have two problems.
1. /site1/ access still works
I can still get to the site while typing in site1.com/site1/(*). I want it to redirect any user-generated requests back up to site1.com/$1, but when I do that, I get an infinite loop.
2. local site redirects don't redirect properly
If I go to site1.com/page.html, it redirects me to site1.com/site1/page.php. Obviously, I do not want this to happen. :/ A similar thing is happening with my 404 page - it points to site1.com/site1/missing.php instead of just site1.com/missing.php
This is what I have for the htaccess file in ROOT.
Options +FollowSymLinks
RewriteEngine on
RewriteBase /
##try with something from webmasterworld forums##
# Externally redirect direct client requests (only) for URL
# <any-domain.com>/example/<anything> to URL www.example.com/<anything>
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /site1/[^\ ]+\ HTTP/
RewriteRule ^site1/(.*)$ http://www.site1.org/$1 [R=301,L]
#
# Externally redirect any requested hostname which contains "example.com" but is
# not *exactly* "www.example.com/<anything>" to URL www.example.com/<anything>
RewriteCond %{HTTP_HOST} site1\.org [NC]
RewriteCond %{HTTP_HOST} !^www\.site1\.org$
RewriteRule ^(.*)$ http://www.site1.org/$1 [R=301,L]
#
# Internally rewrite add-on domain requests to subdirectories
RewriteCond %{HTTP_HOST} ^www\.site1.org$
RewriteCond %{REQUEST_URI} !^/site1/
RewriteRule ^(.*)$ /site1/$1 [L]
This is what I have for the htaccess file in SITE1 subdirectory.
Options +FollowSymLinks
RewriteEngine on
RewriteBase /site1/
RewriteRule ^duedates\.html$ calendar.php [R=301,L]
RewriteRule (.*)\.html $1.php [R=301,L]
ErrorDocument 404 /missing.php
[edited by: jdMorgan at 4:20 pm (utc) on Sep. 8, 2009]
[edit reason] de-linked domains [/edit]
# Internally rewrite requests for site1 subdomain to /site1 subdirectory unless already done
RewriteCond %{HTTP_HOST} ^www\.site1\.org$
RewriteCond $1 !^site1/
RewriteRule ^(.*)$ /site1/$1 [L]
Jim
RewriteCond %{HTTP_HOST} ^(www\.)?site1.org$
....
So that problem is done. Now I want to conquer the direct access problem of site1.org/site1/(.*) - I had this working at one point and I forget what code I used.
The question I also have is - to prevent direct access, do I put it in the ROOT htaccess file or in the "site1" folder?
The RewriteCond host site1.org RewriteRule /site1/ works fine. The code to force www works fine.
I added on a second domain. In THAT subfolder, I have the following code.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www.)?site1.org$ [NC]
RewriteCond %{REQUEST_URI} ^/site2/(.*)$
RewriteRule (.*) / [R=301,L]
This sends anything of site1.org/site2/ back to site1.org, no questions asked. :)
However, if I put the SAME CODE in the /site1/ folder, it puts me in an infinite loop and crashes.
This is what crashes:
RewriteCond %{HTTP_HOST} ^(www\.)?site1.org$ [NC]
RewriteCond %{REQUEST_URI} ^/site1/(.*)$
RewriteRule ^(.*)$ http://www.site1.org/$1 [R=301,L]
I think it crashes because, in that local site1/htaccess file, I set the RewriteBase to be /site1/. This is so the local redirects (/missing.php, *.html to *.php, etc) will work. But that might be why the rewrite gets into an infinite loop - even though I send it a complete, external redirect with the last rule? [R=301,L]
When using [R=301,L], I strongly suggest that you provide a full URL.
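As a rough sketch of how that could look in the /site1/ .htaccess file, assuming www.site1.org is the canonical hostname as in the posts above (untested here). The direct-access redirect is keyed on THE_REQUEST rather than REQUEST_URI, because REQUEST_URI already starts with /site1/ after the root's internal rewrite, which is the likely cause of the loop.

```apache
RewriteEngine on
RewriteBase /site1/
#
# Redirect only when the client itself asked for /site1/<anything>,
# not when the root .htaccess has internally rewritten the request
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /site1/
RewriteRule ^(.*)$ http://www.site1.org/$1 [R=301,L]
#
# Local redirects with full URLs, so the /site1/ folder is not exposed
RewriteRule ^duedates\.html$ http://www.site1.org/calendar.php [R=301,L]
RewriteRule ^(.*)\.html$ http://www.site1.org/$1.php [R=301,L]
```

With a relative target such as calendar.php, the RewriteBase of /site1/ gets prepended to the redirect, which would explain why the folder name showed up in the address bar.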
Jim