I am having a very strange issue with adding a trailing slash after a directory.
Since I am using .htaccess rewrites for most of the URLs on the site, example 1 (mydomain.com/directory1/directory2) is not the same as example 2 (mydomain.com/directory1/directory2/). The rewrite I have works fine for these cases; however, there is a problem when I request directory1 without the trailing /.
The .htaccess is located in the root of directory1 and contains the following code. When I take this code out of my .htaccess, directory1 gets the slash added and works perfectly. According to the code below it should be affected by this code; however, it is trying to load mydomain.com/directory1//home/public_html/directory1/. That doesn't make sense to me.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !.html
RewriteCond %{REQUEST_URI} !.php
RewriteCond %{REQUEST_URI} !(.*)/(.*)/$
RewriteRule ^(.*)/(.*)$ http://example.com/directory1/$1/$2/ [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !.html
RewriteCond %{REQUEST_URI} !.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://example.com/directory1/$1/ [R=301,L]
Also, which is preferable: R=301,L or L,R=301?
I actually will need the rewrite to add this trailing slash for any depth on my site. Is there a method of doing it without having to make a case for each depth?
Thank you kindly for your time,
Nick
[edited by: nickCR at 10:56 pm (utc) on July 12, 2007]
# If Requested URI does not end with ".html", ".php", or "/"
RewriteCond %{REQUEST_URI} !(\.html¦\.php¦/)$
# and if it does not resolve to an existing file
RewriteCond %{REQUEST_FILENAME} !-f
# append a trailing slash
RewriteRule ^(([^/]+/)*)(.*)$ http://example.com/directory1/$1$3/ [R=301,L]
Replace the broken pipe "¦" characters with solid pipe characters before use; posting on this forum modifies the pipe characters.
To resolve nested parentheses (as above) to back-references, count the left parentheses.
The order of the flags [R=301,L] makes no difference. I prefer the order shown here, since it corresponds to "time and order of application" and to the examples given by the author of mod_rewrite, and that's how I learned to order them. Otherwise, it is wholly a matter of style.
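For anyone who wants to try the rule outside Apache first, here is a rough Python sketch of what the two conditions and the rule do. It is only a model under stated assumptions: the function name trailing_slash_redirect, the existing_files set (standing in for the !-f file check), and the base argument are all made up for illustration, and Python's re module is used in place of mod_rewrite's regex engine.

```python
import re

# Same pattern as the RewriteRule above. In .htaccess it is matched against
# the path relative to the directory that holds the .htaccess file.
RULE = re.compile(r"^(([^/]+/)*)(.*)$")
# Same test as the first RewriteCond, written with solid pipes.
SKIP = re.compile(r"(\.html|\.php|/)$")


def trailing_slash_redirect(request_uri, existing_files=frozenset(), base="/directory1/"):
    """Return the redirect target, or None when the rule would not fire.

    existing_files is a hypothetical stand-in for the filesystem check
    that RewriteCond %{REQUEST_FILENAME} !-f performs.
    """
    if SKIP.search(request_uri):
        return None  # already ends in .html, .php, or /
    if request_uri in existing_files:
        return None  # resolves to an existing file
    if not request_uri.startswith(base):
        return None  # outside the directory this sketch models
    local_path = request_uri[len(base):]  # directory prefix stripped, as in .htaccess
    match = RULE.match(local_path)
    return "http://example.com/directory1/" + match.group(1) + match.group(3) + "/"


print(trailing_slash_redirect("/directory1/directory2"))
print(trailing_slash_redirect("/directory1/a/b/c"))
print(trailing_slash_redirect("/directory1/a/b/"))
```

The same function handles one level or several, which is the point of the single rule: no separate case per depth.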
Jim
I really liked your response, thank you.
I would like to ask for your help understanding this part of the code:
^(([^/]+/)*)(.*)$ http://example.com/directory1/$1$3/
I don't quite grasp the meaning of ([^/]+/); it looks like it has something to do with the slash, but I'm not really certain.
I'm more thrown off by the *)(.*), which in my experience should make it $1$2, not $1$3. I just don't understand why $2 isn't used.
Good to know the [R=301,L] can be ordered either way.
I didn't know you could make several cases on one line; it makes writing these rules much more efficient, as you explained. With this knowledge I'll attack some of the other longer-than-necessary rules I have.
Can you suggest any good articles on "optimizing" the .htaccess code? In one file I have over 200 lines of rules which I feel may be either longer than necessary or in the wrong order.
Thanks Again.
Nick
[^/] means any non-'/' character.
([^/]+/) means one or more non-'/' characters followed by a single '/' character in a marked expression or block.
The marked expressions are labeled by counting left parentheses and can be nested.
$1 is (([^/]+/)*)
$2 is ([^/]+/)
$3 is (.*)
This final "not ending in a slash" is implicit in the construct: If the final URL-part did end in a slash, all of its characters would have been matched into the preceding part of the pattern, leaving the last subpattern with nothing to match, and $3 blank as a result. And the "but as many as possible" is implicit in the behaviour of the "*" quantifier, which matches zero or more repetitions of whatever precedes it, but is 'greedy' and will always match as many as possible.
So, the outer parentheses (back-referenced as $1) will therefore contain the entire directory-path up to and including the last slash before any "filename" (any final substring not ending in a slash). The inner parentheses, which could be back-referenced as $2, would contain only the last subdirectory level found, which is why it isn't used. That's also why I said "count left parentheses to determine the back-references."
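The numbering can be checked quickly outside Apache. The following is a small Python sketch, not mod_rewrite itself; Python's re module numbers groups by left parenthesis in the same way for this pattern. The helper name groups and the sample paths are made up for illustration.

```python
import re

# The RewriteRule pattern discussed above.
PATTERN = re.compile(r"^(([^/]+/)*)(.*)$")


def groups(path):
    """Return what would be $1, $2 and $3 for a path seen by the rule."""
    m = PATTERN.match(path)
    return m.group(1), m.group(2), m.group(3)


for sample in ("dir2/dir3/page", "dir2/dir3/", "page"):
    print(sample, "->", groups(sample))
# dir2/dir3/page -> ('dir2/dir3/', 'dir3/', 'page')
# dir2/dir3/ -> ('dir2/dir3/', 'dir3/', '')
# page -> ('', None, 'page')
```

The first group holds the whole directory path, the second only the last directory level it matched (Python reports None when it never matched), and the third whatever is left after the last slash.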
I'm not aware of any books or articles on optimizing mod_rewrite. I developed my opinions from reviewing mod_rewrite's source code, from understanding how regular expressions are processed, and from having written command-parsing routines many years ago. Because of the complexity of mod_rewrite and regular expressions, because of the millions of ways it might be applied, and because of the difficulty in even trying to name some of the elements and concepts involved, writing about it might be a daunting task. So mostly, you'll just find general rules of thumb on the subject. I believe in "making the computer do the work" and prefer readable code over maximally-efficient code -- as long as it's reasonably efficient. Therefore, I limit my efficiency crusade to warnings about multiple ".*" subpatterns in patterns, avoiding unexpected rule recursion, and using start- and/or end-anchoring whenever possible to avoid ambiguity.
Perhaps a Google search may turn up something useful.
Jim
Only for use in httpd.conf, conf.d, or some other server-config-level file. Never in .htaccess, where the path to the current directory is always stripped from the URL-path seen by RewriteRule. See note concerning "full URL-path" in RewriteRule documentation notes section.
Jim
This will match at any directory level, correct?
Let me see if I understand all this right; please correct me if I'm wrong.
Instead of using (.*)(.*), which would match anything, we use ([^/]+/), which specifically searches for anything that "doesn't" include a slash, thus the [^/]. The +/, from what I understand, allows a / within the string as long as it's not on the end, which allows this to be used for any level on the site.
Just trying to clearly understand this syntax so I'm not just "copying & pasting".
Thanks again.
Nick
They do not load one and then the other, right? For example:
If I'm in root, it will load the .htaccess in the root folder.
If I'm in root/dir1/, it will load the .htaccess in dir1, not the .htaccess from root and then the .htaccess from dir1?
As for the regex pattern, I described it fully above. For more information about regular expressions, see the tutorial cited in our forum charter [webmasterworld.com].
Jim