htaccess generalized skip condition needed

htaccess skip on compare root dir

         

Henrik Bechmann

2:41 am on Nov 13, 2006 (gmt 0)

10+ Year Member



In htaccess I am trying to write a general case for internally inserting a root directory matching the domain name, without showing the change to the user. I am very close, except for the line following the note "PROBLEM LINE" below. I'm basically (I think) looking for a way to expand a variable in a pattern string.

Can anyone help?

RewriteBase /
#collect core name
RewriteCond %{HTTP_HOST} ^www\.([^.]*)(\.ca|\.com|\.org)$ [OR]
RewriteCond %{HTTP_HOST} ^([^.]*)(\.ca|\.com|\.org)$
RewriteRule (.*) - [E=coreName:%1]
#insert request for index.html if no file requested
RewriteRule ^$ [%{HTTP_HOST}...] [R,L]
#for index.html file insert core directory
RewriteRule ^index\.html$ %{ENV:coreName}/index.html [L]
#prevent recursion for redirection
RewriteRule index\.html$ - [L]
#on recursion the root directory already matches the core domain name, so skip
#PROBLEM LINE - NOT GENERALIZED:
RewriteRule ^osscommons - [S=1]
#would like to do this instead, for generalized case:
#RewriteRule ^%{ENV:coreName} - [S=1]
#but it doesn't work
#this line works if it is skipped when the root dir is already coreName, that is when recursed:
RewriteRule ^(.*)$ %{ENV:coreName}/$1

- Henrik

jdMorgan

3:40 am on Nov 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The easiest solution (and the only one, unless the OS of your server supports POSIX 1003.2 regular expressions) is to give your domain-subdirectory names a unique prefix, so that you can stop the recursion easily.

For example, something like this:


# Rewrite domains except main domain to subdirectories if not already rewritten
RewriteCond $1 !^ddir_
RewriteCond %{HTTP_HOST} !^(www\.)?maindomain\.com
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+)\.com
RewriteRule (.*) /ddir_%2/$1 [L]
#
# 301-redirect direct client requests for domain-subdirectories back to root
# domains to prevent duplicate-content problems and cross-domain "snooping".
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /ddir_[^/]+
RewriteRule ^ddir_([^/]+)/(.*)$ http://www.$1.com/$2 [R=301,L]

Here, we 'tag' each subdirectory belonging to a domain with a prefix of "ddir_", making it a simple matter to prevent recursion and to identify attempts to directly request the domain-subdirectories.

Note: If your server does support POSIX 1003.2 extended regular expressions, and you are not worried about losing that support by changing hosts in the future, you can use its support for atomic back-references to do "pseudo-compares", based on the fact that if A+A=A+B, then A=B.

I don't recommend this approach because it's not portable, but if you really feel you must use it, then try searching WebmasterWorld for "POSIX 1003.2 subdomain subdirectory rewriterule" -- as we discussed this method several years ago in the context of rewriting subdomains to subdirectories.
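For reference, a minimal sketch of what such a pseudo-compare might look like. It assumes a regex library that accepts a back-reference (\2) inside the pattern itself, borrows the .ca/.com/.org suffixes from the original question, and uses "<>" purely as an arbitrary separator that cannot occur in a host name. Treat it as an untested illustration of the idea, not a drop-in rule set.

```apache
# Sketch only: depends on in-pattern back-references, which not every
# regex library supports.
RewriteEngine On
# Join host and requested path with "<>"; \2 repeats the captured core
# name, so the pattern matches only when the path already starts with
# that name. The "!" makes the condition succeed when it does not.
RewriteCond %{HTTP_HOST}<>$1 !^(www\.)?([^.]+)\.(ca|com|org)<>\2(/|$)
# Capture the core name again so %2 is available in the substitution
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+)\.(ca|com|org)$
RewriteRule ^(.*)$ /%2/$1 [L]
```

The negated condition is placed first because a negated pattern supplies no captures; the second, positive condition is the one that provides %2.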

Also note: This kind of recursion only occurs in .htaccess. If you can move your code to httpd.conf or conf.d, it won't be a problem.
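As a rough sketch of the server-config variant, assuming you control the virtual host that receives all the domains: the ServerName, ServerAlias and DocumentRoot values below are placeholders, and the rule is the same one as above minus the "ddir_" guard, since per-server rules are not re-run after the rewrite.

```apache
# Sketch only: placeholder host names and paths.
<VirtualHost *:80>
    ServerName www.maindomain.com
    ServerAlias *.com
    DocumentRoot /var/www/html

    RewriteEngine On
    # In server context the pattern sees the URL-path with its leading slash
    RewriteCond %{HTTP_HOST} !^(www\.)?maindomain\.com
    RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+)\.com
    RewriteRule ^/(.*)$ /%2/$1 [L]
</VirtualHost>
```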

Jim

[edited by: jdMorgan at 3:41 am (utc) on Nov. 13, 2006]

Henrik Bechmann

6:52 am on Nov 13, 2006 (gmt 0)

10+ Year Member



Thanks so much Jim,

I was just looking at those old threads, but couldn't get the local backreferencing to work on my system, so I guess I don't have that POSIX regex.

So near and yet so far! My web host automatically creates directories for subdomains (a variant of the parked domains I would in some cases like to use and access).

Time to crawl into a corner, lick wounds, and consider options<grin>

Maybe a file marker?:

RewriteCond $1/%{ENV:coreName}.marker -f
RewriteRule ([^/]*) - [S=1]

A performance hit obviously, but deterministic? What do you think?

BTW I use htaccess...

Thanks again!

- Henrik

Henrik Bechmann

5:46 pm on Nov 15, 2006 (gmt 0)

10+ Year Member



The sentinel identifier file marker approach works to stop recursion.

So the following is (relatively) generic code that lets any number of parked domains point to their own master account subdirectories, while appearing to be independent (their data directories appear to be root directories).

So when I request osscommons.ca, the address shown is osscommons.ca/home/osscommons.html rather than osscommons.ca/osscommons/home/osscommons.html.

The only hack is that each domain directory needs a zero-byte file to stop recursion, named "<domaindirname>.identifier.DoNotTouch".

Here's the code:

RewriteEngine on
RewriteBase /
#subdomains are filtered out, they do not need processing
RewriteCond %{HTTP_HOST} !^(www\.)?([^.]*)(\.ca|\.com|\.org)$
RewriteRule (.*) - [L]
#the coreName, which is also the subdirectory name is extracted
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]*)(\.ca|\.com|\.org)$
RewriteRule (.*) - [E=coreName:%2]
#add-on subaccounts are filtered out (only five are available with my
#main account), they do not need processing
RewriteCond %{ENV:coreName} ^dufferinpark$|^parkcommons$|^communitycommons$|^businesscommons$|^wikiwebsites$
RewriteRule (.*) - [L]
#where only the domain name is given by the user,
#add index.php or index.html as available
RewriteCond %{DOCUMENT_ROOT}/%{ENV:coreName}/index.php -f
RewriteRule ^$ [%{HTTP_HOST}...] [R,L]
RewriteRule ^$ [%{HTTP_HOST}...] [R,L]
#if the identifier file is available through the current root
#directory, we are done, skip the next instruction to avoid recursion
RewriteCond %{DOCUMENT_ROOT}/$1/%{ENV:coreName}.identifier.DoNotTouch -f
RewriteRule ^([^/]*) - [S=1]
#this is what does the actual work/magic, silently insert
#required internal subdirectory
RewriteRule ^(.*)$ %{ENV:coreName}/$1

- Henrik

[edited by: Henrik_Bechmann at 5:58 pm (utc) on Nov. 15, 2006]