Forum Moderators: phranque
I couldn't find a way to indent the hierarchy using HTML, so I used different-width arrows, sorry...
I recently switched from my home-based web server to one of the top-ranked hosting companies, and from my viewpoint the file structure they provide is bizarre.
-------------------------
Regardless of what the domain names happen to be, a workable hierarchy might look something like this:
/some_path_from_root_or_from_home
----> /some_vh_container
--------> /virtual_host_001 (x_airplanes.com)
--------> /virtual_host_002 (x_boats.com)
--------> /virtual_host_003 (x_cars.com)
--------> /virtual_host_004 (x_motorcycles.com)
--------> /virtual_host_005 (x_shoes.com)
The hierarchy makes it simple to maintain a separate directory for each domain on your workstation, and FTP becomes a one-to-one proposition from your workstation to the ISP.
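To illustrate the one-to-one idea, here is a minimal shell sketch that mirrors such a hierarchy on the workstation (the local path and container name are illustrative, not from the actual host):

```shell
# Mirror the hosting hierarchy locally so every FTP transfer
# maps one-to-one from workstation directory to server directory.
vh_root="./some_vh_container"   # illustrative local container path
for host in x_airplanes x_boats x_cars x_motorcycles x_shoes; do
    mkdir -p "$vh_root/$host"   # one directory per virtual host
done
```

With the same names on both ends, uploading a site is just a straight copy of its one directory.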
-------------------------
As you can see below, it seems really bizarre to me to nest all the domains added after the first one inside the first one's document root (which makes FTP painful). A much better solution would have been to add one additional folder, which would simplify things considerably.
You can see below how messy things get if x_airplanes (the first virtual host) contains a lot of folders and files: the FTP problem is a pain, and besides that, it is just plain UGLY...
/home1
----> /public_html (this is the document root for x_airplanes)
--------> index.html
--------> document 1
--------> ... of course these documents do not sort in such an organized way
--------> document n
--------> folder 1
--------> ... these folders are interspersed between the virtual hosts below
--------> folder n
--------> /x_boats
--------> /x_cars
--------> /x_motorcycles
--------> /x_shoes
IMO the flaw in their design is that /public_html should not be the DocumentRoot for the first website; it should contain a virtual-host directory for each site, regardless of the order in which the sites were added. So instead of /public_html being the DocumentRoot for x_airplanes.com, it should be a container holding a directory for the virtual host x_airplanes.com.
------------------------------------
So instead of the ISP-generated layout below:
/home1
----> /public_html (document root for x_airplanes.com)
--------> /x_boats
--------> /x_cars
--------> /x_motorcycles
--------> /x_shoes
------------------------------------
I would prefer this layout:
/home1
----> /public_html (where this is a container for all the virtual hosts)
--------> /x_airplanes (I added a new DocRoot for this virtual host)
--------> /x_boats
--------> /x_cars
--------> /x_motorcycles
--------> /x_shoes
This setup gives me a standard hierarchy and lets FTP work correctly, without conflicts between the logical and the physical layout. Logically, all of these are separate domains (virtual hosts); physically, the ISP's setup makes the other four virtual hosts behave as though they were sub-domains, subordinate to the first virtual host, which is wrong.
The mod_rewrite code below bridges the gap between the logical and the physical layout (the FTP problem).
ANYWAY, here it is; this solves one problem:
/home1
----> /public_html
--------> .htaccess
# --------------------------------------------
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/x_airplanes/.*$
RewriteRule ^(.*)$ /x_airplanes/$1
# --------------------------------------------
This internally rewrites any request
FROM : [x_airplanes.com...] ----> /public_html/whatever
TO : [x_airplanes.com...] ----> /public_html/x_airplanes/whatever
It displays the page but does not change the URL in the browser, which is exactly what we want; this is great...
The problem arises if the user enters:
[x_airplanes.com...]
The correct page is displayed, but we don't want that long URL, so we need a second part in .htaccess:
# --------------------------------------------
RewriteEngine On
# ( 1 ) --------------------------------------------
RewriteCond %{HTTP_HOST} ^x_airplanes\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.x_airplanes\.com$
RewriteCond %{REQUEST_URI} !^/x_airplanes/
RewriteRule ^(.*) /x_airplanes/$1 [L]
# ( 2 ) --------------------------------------------
RewriteCond %{HTTP_HOST} ^x_airplanes\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.x_airplanes\.com$
RewriteCond %{REQUEST_URI} ^/x_airplanes/
RewriteRule ^/x_airplanes/(.*) /$1 [R=301,L]
# --------------------------------------------
The only problem is that part ( 2 ) is not rewriting the shorter, correct URL in the browser. Instead of [R=301,L] I have tried [R], [R=301], and a variety of other things, and none of them causes the browser URL to change:
FROM : [x_airplanes.com...]
TO ....: [x_airplanes.com...]
Any ideas ?
Best Regards,
Bill Hernandez
Plano, Texas
There's another problem: Had your pattern been correct, you'd likely be posting something like "Help, I've got an infinite looping problem," because your second rule countermands your first, and vice-versa *and* mod_rewrite processing in .htaccess is recursive -- If any rule is invoked, then processing is re-started to make sure that no other rules apply to the newly-rewritten path. So, your code would have looped.
Fixing that, making some corrections, and making several optimizations based on regular-expressions pattern-matching features gives:
# Externally redirect direct client requests for domain-subdirectory back to domain
RewriteCond %{HTTP_HOST} ^(www\.)?x_airplanes\.com
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /x_airplanes/
RewriteRule ^x_airplanes/(.*)$ http://www.x_airplanes.com/$1 [R=301,L]
#
# Internally rewrite x_airplanes domain requests to x_airplanes subdirectory
RewriteCond %{HTTP_HOST} ^(www\.)?x_airplanes\.com
RewriteCond %{REQUEST_URI} !^/x_airplanes/
RewriteRule ^(.*)$ /x_airplanes/$1 [L]
Putting the redirects first prevents 'exposing' internally-rewritten filepaths as URLs, while the specificity guideline prevents both multiple/stacked/chained redirects and unexpected pattern matches causing unexpected operation.
Note that these rules (and my optimization) point out another problem: The internal rewrite rule would not have to account for both "www" and "non-www" hostnames if you had proper domain canonicalization in place. But note that due to the specificity guideline, any such canonicalization rule should follow the redirect posted above, since a "whole-domain" redirect is less specific than a direct-client-request-for-particular-subdirectory redirect; If you did the domain redirect first, it would only fix the hostname, and then the fix-the-subdirectory rule would trigger and you'd get two back-to-back redirects -- Not good if you've got search engines watching...
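For reference, a canonicalization rule of the kind described above might look like the following sketch (assuming you want to standardize on the "www" hostname; the pattern is generic and not something posted earlier in this thread). Per the specificity guideline, it would go after the subdirectory redirect:

```apache
# Redirect any non-canonical hostname to www.x_airplanes.com,
# preserving the requested path in the redirect target.
RewriteCond %{HTTP_HOST} !^www\.x_airplanes\.com$ [NC]
RewriteRule ^(.*)$ http://www.x_airplanes.com/$1 [R=301,L]
```

Placed last, it only fires when the more specific subdirectory redirect has not already handled the request, so a client sees at most one redirect.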
Jim
Just north of Plano...
It's been a long time since I did any mod_rewrite, and it is not something I use often anymore, so I am very happy you spotted the problems and provided a solution that works perfectly...
I will print out your reply so I can read it over a couple of times to make sure I understand it fully...
Thanks for taking the time to point out the problem, and provide the explanation...
Thanks again for the great help...
Bill Hernandez
Plano, Texas
Last night I was thinking "how could I have forgotten so much about mod_rewrite?", so I went to my library and dug out "The Definitive Guide to Apache mod_rewrite", and quickly realized that I had never worked with .htaccess files. I also realized that when I last worked with mod_rewrite I had access to the mod_rewrite logs, which I don't have now.
For several years I have been running on my own "OS X Servers", which gave me access to all the configuration files for Apache, PHP, MySQL, PgSQL, logs, etc.
It's been nice having the servers right on the LAN with no need for FTP. Yesterday I purchased the latest "Transmit" since I will now join the FTP world...
I've decided to keep one development domain on my current "OS X Server" and host the other domains where they can run all the time. I am sure there will be adjustments, since I will no longer be in control of the servers.
I think from a security point of view I will worry a whole lot less by not having a server on my LAN that is exposed to the outside world.
I've been running Verizon Business FIOS 25 Mbps/5 Mbps service for a long time, so for the small amount of traffic I had, it was more than ample.
A couple of years ago I got a Trojan horse on one of my workstations, and it quickly spread to three servers and several other workstations. It took me a couple of months from the time I began having problems until I finally figured out what was causing them.
I eventually destroyed the Trojan on all the machines, but ended up reformatting all the boot drives and re-installing the operating system and all the applications. What a pain...
I bought a dedicated SonicWall firewall, which has worked very well. It has allowed me to keep the servers on the LAN side of the firewall and do one-to-one pass-through NAT, which mapped 5 public IPs (from Business FIOS) to 5 private IPs and in turn allowed me to do IP-based virtual hosting. IP-based virtual hosting lets the SSL certificates work correctly, which was a problem when I did name-based virtual hosting.
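For readers wondering why IP-based hosting fixes SSL: each HTTPS site can present its own certificate only if Apache can tell the sites apart before the TLS handshake completes, which IP-based virtual hosts allow. A minimal sketch of such a configuration follows (the private IPs, certificate paths, and DocumentRoots are illustrative, not from this thread):

```apache
# Each virtual host is bound to its own private IP, so each
# can present its own SSL certificate on port 443.
<VirtualHost 192.168.1.11:443>
    ServerName www.x_airplanes.com
    DocumentRoot /home1/public_html/x_airplanes
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/x_airplanes.crt
    SSLCertificateKeyFile /etc/ssl/private/x_airplanes.key
</VirtualHost>

<VirtualHost 192.168.1.12:443>
    ServerName www.x_boats.com
    DocumentRoot /home1/public_html/x_boats
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/x_boats.crt
    SSLCertificateKeyFile /etc/ssl/private/x_boats.key
</VirtualHost>
```

With name-based virtual hosting on a single IP, all sites share one address, so without SNI the server cannot pick the right certificate per hostname.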
The interesting thing I've learned since I bought the firewall about three years ago is the huge number of attempts on the wired and wireless connections that it intercepts every day. It provides notifications about all kinds of hacking attempts.
The wireless attacks are mostly from bot-infected machines in the neighborhood, and even though the firewall seems to be doing a really good job, I will be much happier letting somebody else worry about security.
Thanks very much for helping me with the htaccess solution...
Bill Hernandez
Plano, Texas