Home directory for the server is /home/user1/htdocs
Home directories for the 3 domains are:
xyz1.com: /home/user1/htdocs
xyz2.com: /home/user1/htdocs/xyz2
xyz3.com: /home/user1/htdocs/xyz3
Shared directories for the websites are icons, cgi-bin and images.
Using /home/user1/htdocs/.htaccess I managed to hide the xyz2 and xyz3 directories from the URLs:
ErrorDocument 404 [xyz1.com...]
RewriteEngine on
Options +FollowSymlinks
RewriteBase /
# allow access to shared directories
RewriteCond %{REQUEST_URI} ^/icons/.* [OR]
RewriteCond %{REQUEST_URI} ^/images/.* [OR]
RewriteCond %{REQUEST_URI} ^/cgi-bin/.*
RewriteRule ^(.*)$ $1 [L]
# prevents external access to xyz2 and xyz3 directories from host ip, none or xyz1.com
RewriteCond %{REQUEST_URI} ^/xyz2/.* [OR]
RewriteCond %{REQUEST_URI} ^/xyz3/.*
RewriteCond %{HTTP_HOST} ^$ [OR]
RewriteCond %{HTTP_HOST} ^123\.123\.123\.123$ [OR]
RewriteCond %{HTTP_HOST} xyz1\.com$ [NC]
RewriteRule ^(.*)$ $1 [G]
# prevents external access to xyz2 directory from host xyz3.com
RewriteCond %{REQUEST_URI} ^/xyz2/.*
RewriteCond %{HTTP_HOST} xyz3\.com$ [NC]
RewriteRule ^(.*)$ $1 [G]
# prevents external access to xyz3 directory from host xyz2.com
RewriteCond %{REQUEST_URI} ^/xyz3/.*
RewriteCond %{HTTP_HOST} xyz2\.com$ [NC]
RewriteRule ^(.*)$ $1 [G]
# missing trailing / for directory problem
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]
# redirect xyz2.com to xyz2
RewriteCond %{HTTP_HOST} xyz2\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/xyz2/.*
RewriteRule ^(.*)$ /xyz2/$1 [L]
# redirect xyz3 to xyz3
RewriteCond %{REQUEST_URI} !^/xyz3/.*
RewriteCond %{HTTP_HOST} xyz3\.com$ [NC]
RewriteRule ^(.*)$ /xyz3/$1 [L]
A call to [xyz2.com...] now returns the document /home/user1/htdocs/xyz2/index.html
However, [xyz2.com...] returns the same document, resulting in possible search engine penalties for duplicating websites.
I wish to remove [xyz2.com...] as a valid URL, but all my attempts so far have resulted in a rewrite looping problem:
RewriteCond %{HTTP_HOST} xyz2\.com$ [NC]
RewriteCond %{REQUEST_URI} ^/xyz2/.*
RewriteRule ^(.*)$ $1 [G]
I have tried the NS, N, C, L and S flags but have not managed to achieve the desired effect.
Any and all suggestions and/or hints are greatly appreciated (also regarding more efficient code on what is already working)!
Welcome to WebmasterWorld!
Yes, there's no good way to prevent it, and all you can do is to make sure there are no links to the subdirectory, and that there are no errors which would 'expose' the subdirectory.
One such problem: Your ErrorDocument directive is malformed, and will return 302 redirect status, rather than a 404-Not Found. You should use a local URL-path, not a canonical URL. This is one way your subdomain/subdirectories could be exposed. See the Apache documentation of the ErrorDocument [httpd.apache.org] directive for the proper syntax, and test your server response using the Server Headers [webmasterworld.com] checker in your WebmasterWorld Control Panel.
Another note on that subject: You should consider creating a custom 404 error page explaining that the requested page is missing or gone, and then provide a clickable link to your home page. Using a 404 response to redirect missing page requests directly to your home page might also create a duplication risk.
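As a rough illustration of the difference between the two ErrorDocument forms, here is a minimal sketch. The host www.example.com and the path /errors/404.html are placeholders, not values from this thread.

```apache
# Full URL form (placeholder host): Apache answers with a redirect to that
# URL, so the client never receives the original 404 status.
# ErrorDocument 404 http://www.example.com/404.html

# Local URL-path form (hypothetical path): the custom page is served
# together with the original 404 status.
ErrorDocument 404 /errors/404.html
```

The error page itself can then carry an ordinary link to the home page instead of redirecting there.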
If you plan on adding more subdomains over time, and might expect to have a large number of them, see the thread on rewriting arbitrary subdomains to subdirectories [webmasterworld.com]. By adding your existing exclusions, you can eliminate the need to write a new block of code for each new subdomain.
Jim
[edited by: jdMorgan at 3:32 am (utc) on Nov. 3, 2004]
You write "no GOOD way", does that imply that there is a BAD (but usable) way?
I see your point regarding the 404 path, and it's been changed.
I have been reading the <subdomain>.example.com/<path> to example.com/<subdomain>/<path> example with great interest, as my problem actually involves more domain names than indicated. Here's my slightly modified solution of your example:
ErrorDocument 404 /html/404.html
RewriteEngine on
RewriteBase /
# allow access to shared directories
RewriteCond %{REQUEST_URI} ^/icons/.* [OR]
RewriteCond %{REQUEST_URI} ^/images/.* [OR]
RewriteCond %{REQUEST_URI} ^/html/.* [OR]
RewriteCond %{REQUEST_URI} ^/cgi-bin/.*
RewriteRule ^(.*)$ $1 [L]
# Force host if none or IP
RewriteCond %{HTTP_HOST} ^$ [OR]
RewriteCond %{HTTP_HOST} 123\.123\.123\.123(:80)?$
RewriteRule (.*) http://www.xyz1.com/$1 [R=301,L]
# Add .www if missing
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule (.*) http://www.%{HTTP_HOST}/$1 [R=301,L]
# Rewrite (www.)<domain>.com/<path> to www.<domain>.com/<domain>/<path>
#
# Extract (required) domain (%2), and first path element (%4), discard www. (%1) and port number (%3) if present
RewriteCond %{HTTP_HOST}<>%{REQUEST_URI} ^(www\.)([^.]+)\.com(:80)?<>/([^/]*) [NC]
# Rewrite only when domain not equal to first path element (prevents mod_rewrite recursion)
RewriteCond %2<>%4 !^(.*)<>\1$ [NC]
# Rewrite to /domain/path
RewriteRule ^(.*) /%2/$1 [L]
I had to make a new subdir xyz1 and move all root documents there to make this work, but this solution even takes care of the external access prevention sections in my original suggestion (apart from direct access to [<domain>.com...]).
Great stuff, again thanks a lot!
I would recommend against trying to "force a domain if none" for the simple reason that the only time there will not be an HTTP_HOST variable defined is when there is no Host header in the HTTP request. And the only time that will happen is when the client is a true HTTP/1.0 client. Since HTTP/1.0 does not support the Host header, there is no way the client can request the hostname you tried to redirect it to. So, you end up with a loop of another kind, with the server sending a redirect, the HTTP/1.0 client trying to re-request the domain (but without sending the new hostname), and the server again rejecting the request and trying to redirect it...
The best approach may be to simply return a page that says, "Sorry, our shared-hosting server requires an HTTP/1.1 client, and your client is HTTP/1.0. Here's a stripped-down page for you, or please visit again with a newer browser."
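A minimal sketch of that idea, assuming a hypothetical static page /http10-notice.html in the document root and assuming these lines are placed ahead of the host-based rules:

```apache
# No Host header at all: serve a notice page instead of redirecting.
RewriteCond %{HTTP_HOST} ^$
# Do not rewrite the notice page onto itself.
RewriteCond %{REQUEST_URI} !^/http10-notice\.html$
RewriteRule .* /http10-notice.html [L]
```

Because this is an internal rewrite rather than a redirect, the client is never asked to re-request anything, which avoids the loop described above.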
Forcing a domain if an IP address is used is OK. As a matter of fact, you may want to also redirect requests if the "www" part is missing and no other subdomain is requested, i.e. "{HTTP_HOST} !^.+\.domain\.com"
Note that I did not end-anchor the domain name. If you end-anchor the domain name, some clients can cause your HTTP_HOST-based rules to fail by appending a port number, e.g. "www.example.com:80" in which case the RewriteCond will fail if it is end-anchored.
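Put together, the two points above might look like the following sketch. The name example.com is a placeholder, and the first condition is an added assumption that keeps requests with an empty host out of the redirect.

```apache
# Skip clients that send no Host header.
RewriteCond %{HTTP_HOST} .
# True when no subdomain precedes the domain. The pattern is deliberately
# not end-anchored, so "example.com:80" is treated like "example.com".
RewriteCond %{HTTP_HOST} !^.+\.example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```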
Jim
[xyz1.com...] and [xyz1.com...] both display the same document (/home/user1/xyz1/index.html). [xyz1.com...] shows a 404.
I'm using Apache 2.0.40. Can you spot the problem?
ErrorDocument 404 /scripts/404.html
RewriteEngine on
RewriteBase /
# allow access to shared directories
RewriteCond %{REQUEST_URI} ^/icons/.* [OR]
RewriteCond %{REQUEST_URI} ^/images/.* [OR]
RewriteCond %{REQUEST_URI} ^/scripts/.* [OR]
RewriteCond %{REQUEST_URI} ^/cgi-bin/.*
RewriteRule ^(.*)$ $1 [L]
# force domain if IP
RewriteCond %{HTTP_HOST} 123\.123\.123\.123(:80)?$
RewriteRule (.*) [xyz1.com...] [R=301,L]
# force www.
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule (.*) [%{HTTP_HOST}...] [R=301,L]
# some documents used to be named .htm
RewriteCond %{REQUEST_URI} .*\.htm$
RewriteRule ^(.*)\.htm$ $1\.html [R=301,L]
# Rewrite (www.)<domain>.com/<path> to www.<domain>.com/<domain>/<path>
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST}<>%{REQUEST_URI} ^(www\.)([^.]+)\.com(:80)?<>/([^/]*) [NC]
RewriteCond %2<>%4 !^(.*)<>\1$ [NC]
RewriteRule ^(.*) /%2/$1 [L]
Fixed a typo and a problem in the above example:
# some documents used to be named .htm
RewriteCond %{REQUEST_URI} .*\.htm$
RewriteRule ^(.*)\.htm$ [%{HTTP_HOST}...] [R=301,L]
Will try your suggestion.
# some documents used to be named .htm
RewriteCond %{REQUEST_URI} .*\.htm$
RewriteRule ^(.*)\.htm$ http://%{HTTP_HOST}/$1.html [R=301,L]
# some documents used to be named .htm
RewriteRule ^(.*)\.htm$ http://%{HTTP_HOST}/$1.html [R=301,L]
Options -MultiViews
took care of the problem. Thanks for all your help!
Yet another question related to the (www.)<domain>.com/<path> to www.<domain>.com/<domain>/<path> fix:
I'm trying to prefix my directories with some letters, numbers or special signs (might be necessary to quickly resolve the potential problems this thread is about...), but can't seem to get it to work:
RewriteRule ^(.*) /PREFIX%2/$1 [L]
Obviously I've renamed the directories to /home/user1/htdocs/PREFIXxyz1/ etc. What am I missing here? I feel I should know this :-)
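One possible explanation, offered as a guess rather than a confirmed diagnosis: the recursion check still compares the bare domain (%2) with the first path element (%4). After the rewrite that element is PREFIXxyz1, so the two never match and the rule keeps re-applying. A sketch that compares the prefixed name instead, with PREFIX standing for whatever literal prefix is chosen:

```apache
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST}<>%{REQUEST_URI} ^(www\.)([^.]+)\.com(:80)?<>/([^/]*) [NC]
# Compare "PREFIX" + domain (not the bare domain) with the first path element,
# so an already rewritten path such as /PREFIXxyz1/... is left alone.
RewriteCond PREFIX%2<>%4 !^(.*)<>\1$ [NC]
RewriteRule ^(.*) /PREFIX%2/$1 [L]
```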
Uploaded our new solution to my server today, but got a 500 error and had to revert to the original (first post) .htaccess code. The server is running 1.3.29 on a FreeBSD system. I have no access to the httpd.conf file to see what might be the trouble, and I am using Options -MultiViews +FollowSymlinks
Any idea why the following works on my test servers, but not on the production server?
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST}<>%{REQUEST_URI} ^(www\.)([^.]+)\.com(:80)?<>/([^/]*) [NC]
RewriteCond %2<>%4 !^(.*)<>\1$ [NC]
RewriteRule ^(.*) /%2/$1 [L]
ErrorDocument 404 /scripts/404.html
Options -MultiViews +FollowSymlinks
RewriteEngine on
RewriteBase /
# allow access to shared directories
RewriteCond %{REQUEST_URI} ^/icons/.* [OR]
RewriteCond %{REQUEST_URI} ^/images/.* [OR]
RewriteCond %{REQUEST_URI} ^/scripts/.* [OR]
RewriteCond %{REQUEST_URI} ^/cgi-bin/.*
RewriteRule ^(.*)$ $1 [L]
# force domain if IP
RewriteCond %{HTTP_HOST} 123\.123\.123\.123(:80)?$
RewriteRule (.*) [xyz1.com...] [R=301,L]
# force www.
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule (.*) [%{HTTP_HOST}...] [R=301,L]
# some documents used to be named .htm
RewriteRule ^(.*)\.htm$ [%{HTTP_HOST}...] [R=301,L]
# Rewrite (www.)<domain>.com/<path> to www.<domain>.com/<domain>/<path>
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST}<>%{REQUEST_URI} ^(www\.)([^.]+)\.com(:80)?<>/([^/]*) [NC]
RewriteCond %2<>%4 !^(.*)<>\1$ [NC]
RewriteRule ^(.*) /%2/$1 [L]
# allow access to shared directories
RewriteCond %{REQUEST_URI} ^/icons/.* [OR]
RewriteCond %{REQUEST_URI} ^/images/.* [OR]
RewriteCond %{REQUEST_URI} ^/scripts/.* [OR]
RewriteCond %{REQUEST_URI} ^/cgi-bin/.*
RewriteRule ^(.*)$ $1 [L]
This is especially useful if you need to change the RewriteBase because of the way the public server is configured to handle "accounts".
Maybe you want:
# allow access to shared directories by skipping the rules below this ruleset
RewriteCond %{REQUEST_URI} ^/icons/ [OR]
RewriteCond %{REQUEST_URI} ^/images/ [OR]
RewriteCond %{REQUEST_URI} ^/scripts/ [OR]
RewriteCond %{REQUEST_URI} ^/cgi-bin/
RewriteRule .* - [L]
Comments:
This needs a start anchor to improve efficiency:
# force domain if IP
RewriteCond %{HTTP_HOST} ^123\.123\.123\.123(:80)?$
RewriteRule (.*) http://www.xyz1.com/$1 [R=301,L]
# some documents used to be named .htm
RewriteRule ^([^.]+)\.htm$ http://%{HTTP_HOST}/$1.html [R=301,L]
#
# Rewrite (www.)<domain>.com/<path> to www.<domain>.com/<domain>/<path> except for shared directories
RewriteCond %{REQUEST_URI} !^/icons/
RewriteCond %{REQUEST_URI} !^/images/
RewriteCond %{REQUEST_URI} !^/scripts/
RewriteCond %{REQUEST_URI} !^/cgi-bin/
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST}<>%{REQUEST_URI} ^www\.([^.]+)\.com(:80)?<>/([^/]*) [NC]
RewriteCond %1<>%3 !^(.*)<>\1$ [NC]
RewriteRule ^(.*) /%1/$1 [L]
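To see why the back-reference condition stops the recursion, it may help to trace one request through the final rule by hand. The host www.xyz2.com and the path /page.html are only illustrative, the shared-directory conditions are left out for brevity, and the trace assumes the rules sit in the document root's .htaccess, where an internal rewrite sends the new path through the same rules again.

```apache
# Pass 1: Host is www.xyz2.com, requested path is /page.html
#   test string:  www.xyz2.com<>/page.html
#   captures:     %1 = xyz2, %3 = page.html
#   "xyz2<>page.html" does not match ^(.*)<>\1$, so the negated condition is true
#   result:       internal rewrite to /xyz2/page.html
#
# Pass 2: same host, path is now /xyz2/page.html
#   captures:     %1 = xyz2, %3 = xyz2
#   "xyz2<>xyz2" matches ^(.*)<>\1$, so the negated condition is false
#   result:       no further rewrite, the file is served
RewriteCond %{HTTP_HOST}<>%{REQUEST_URI} ^www\.([^.]+)\.com(:80)?<>/([^/]*) [NC]
RewriteCond %1<>%3 !^(.*)<>\1$ [NC]
RewriteRule ^(.*) /%1/$1 [L]
```

The same test also explains why a direct request for www.xyz2.com/xyz2/page.html is served rather than refused: to the rule it looks exactly like pass 2.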