mod rewrite works in httpd.conf file but NOT in .htaccess files

         

toneman

4:00 pm on Apr 19, 2008 (gmt 0)

10+ Year Member



Hello and I apologize if this has been answered but I have spent 3 days on this and am ready to go back to pen and paper...

Simple application - single rewrite rule - works when in the httpd.conf file but NOT in the .htaccess file. Here is the rule:

RewriteRule ^/$ http://example.com/ht_test/ [L]

I have performed the following to make sure that the .htaccess should work fine:

1) Verified that "AllowOverride All" is set for my <Directory> in question

2) Verified that the .htaccess file is being read - used a "garbage" test to verify file read.

3) Verified that the .htaccess pattern is being applied - checked this in the mod_rewrite logs.

4) Verified that NO OTHER CONF files are being applied that could overwrite my <Directory> Settings.

Again, when the above rule is in my <VirtualHost> section, it works great. As soon as I comment it out there and move it to the .htaccess file, all I get is the standard index.php file instead of the redirected version.

Ideas please - Thanks.

[edited by: jdMorgan at 4:50 pm (utc) on April 19, 2008]
[edit reason] example.com [/edit]

jdMorgan

4:49 pm on Apr 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



.htaccess is a per-directory config file. Therefore, the path to the current directory --in this case "/"-- is not present in the URL-path "seen" by RewriteRule.

Your rule will probably work if you change the pattern to "^$".

Jim

Doood

4:50 pm on Apr 19, 2008 (gmt 0)

10+ Year Member



Could it have anything to do with the "/" between ^ and $?

toneman

4:57 pm on Apr 19, 2008 (gmt 0)

10+ Year Member



So I feel like a complete idiot - both of you guys are correct, but this highlights my lack of understanding re: the path and what is used where.

What am I missing here? Is the "/" present in the URI for the .conf file and not in the per directory checking?

Thanks by the way...hair is growing back now...

jdMorgan

5:09 pm on Apr 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> What am I missing here? Is the "/" present in the URI for the .conf file and not in the per directory checking?

Yes, you use the complete URL-path in code in httpd.conf or conf.d, etc. In .htaccess, you must remove that part of the URL-path that was used to get to the current .htaccess directory from the RewriteRule pattern. In other words, in a per-directory context, the pattern must be "localized" to the directory in which the .htaccess file is located. This is documented in the Apache mod_rewrite docs -- in the notes following the description of the RewriteRule directive.

Example: to internally rewrite /foo/bar/widget.html to /foo/bar/widget.php, use:

In httpd.conf : RewriteRule ^/foo/bar/widget\.html$ /foo/bar/widget.php [L]
In /.htaccess : RewriteRule ^foo/bar/widget\.html$ /foo/bar/widget.php [L]
In /foo/.htaccess : RewriteRule ^bar/widget\.html$ /foo/bar/widget.php [L]
In /foo/bar/.htaccess : RewriteRule ^widget\.html$ /foo/bar/widget.php [L]
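
The prefix-stripping described above can be modelled outside Apache. This is only a rough Python sketch, not how mod_rewrite is implemented: the helper names per_dir_path and rule_matches are made up for illustration, and the model assumes the .htaccess directory maps directly onto the URL-path.

```python
import re

def per_dir_path(url_path: str, htaccess_dir: str) -> str:
    """Return the part of url_path that a RewriteRule in htaccess_dir would test.

    Simplified model: strip the directory prefix, including its trailing slash.
    """
    prefix = htaccess_dir.rstrip("/") + "/"
    if not url_path.startswith(prefix):
        raise ValueError(f"{url_path!r} is not under {htaccess_dir!r}")
    return url_path[len(prefix):]

REQUEST = "/foo/bar/widget.html"

# (context, pattern) pairs taken from the four rules above.
# None stands for httpd.conf, where the full URL-path is tested.
RULES = [
    (None, r"^/foo/bar/widget\.html$"),
    ("/", r"^foo/bar/widget\.html$"),
    ("/foo/", r"^bar/widget\.html$"),
    ("/foo/bar/", r"^widget\.html$"),
]

def rule_matches(context, pattern, url_path=REQUEST):
    """True if the pattern matches the path as seen from the given context."""
    tested = url_path if context is None else per_dir_path(url_path, context)
    return re.search(pattern, tested) is not None

if __name__ == "__main__":
    for context, pattern in RULES:
        print(context or "httpd.conf", pattern, rule_matches(context, pattern))
    # The original problem: in /.htaccess a request for "/" is tested as "".
    home = per_dir_path("/", "/")
    print(repr(home), bool(re.search(r"^/$", home)), bool(re.search(r"^$", home)))
```

In this model a request for the home page is tested as an empty string in /.htaccess, which is why "^/$" fails there and "^$" matches.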

Jim

toneman

5:13 pm on Apr 19, 2008 (gmt 0)

10+ Year Member



jdMorgan - many thanks - one last question since you obviously know this cold - the "%" and the "$". I know the latter can be used in regex as the end of a line, but it also seems to be used as a back-reference. In fact I think they both are, but for different directives: the "%" for RewriteCond and the "$" for RewriteRule. The questions I have are:

1) Are they used just to reference back to content contained within parens?

2) How do the numbers work? For example, when I see "$3" or "%2", what are these referencing?

Thanks again...

jdMorgan

5:25 pm on Apr 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



1) Yes. $n refers to the value matching the nth parenthesized sub-expression in the RewriteRule pattern, while %n refers to the value matching the nth parenthesized sub-expression in the last-matched RewriteCond.

Where parenthesized sub-expressions are nested, count left parentheses to determine the back-reference number.

You may use $1 through $9 and/or %1 through %9 -- all other values are undefined. Rules requiring more than nine backreferences of either type must be broken down into several steps using multiple RewriteRules and/or RewriteConds.
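
The "count left parentheses" rule for nested groups can be tried out in any regex engine that numbers groups the same way. Here is a small sketch using Python's re module, which also numbers groups by their opening parenthesis. The pattern and the sample path are invented for illustration and do not come from any real rule.

```python
import re

# Hypothetical pattern with nested groups. Numbers follow the opening parentheses:
#   group 1: ((\d{4})-(\d{2}))   the outer pair opens first
#   group 2: (\d{4})
#   group 3: (\d{2})
#   group 4: ([a-z]+)
pattern = re.compile(r"^((\d{4})-(\d{2}))/([a-z]+)\.html$")

m = pattern.match("2008-04/widget.html")

# Map each group to the name it would have as a RewriteRule back-reference.
backrefs = {f"${n}": m.group(n) for n in range(1, 5)}

if __name__ == "__main__":
    for name, value in backrefs.items():
        print(name, "=", value)
```

Here $1 holds the whole "2008-04" because its parenthesis opens first, and $2 and $3 hold the pieces nested inside it.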

If you read the documentation strictly rather than liberally, you should be trouble-free. That is, if the documentation says it won't work, or does not say that it will work, then it probably won't work.

Jim

toneman

5:43 pm on Apr 19, 2008 (gmt 0)

10+ Year Member



ok - great - so this is what is really confusing - to me at least. In the following, it is not clear what "%2" and "$1" refer to - is it clear to you?

The first occurrence of "%2" is especially confusing because I would expect to find 2 sets of "(SOME_TEXT)" in one of the preceding lines but as you can see, it is not there...Last question - I promise...

# Fix missing trailing slashes.
RewriteCond %{HTTP_HOST} !^(www\.)?domain\.com$ [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^\.]+)\.domain\.com$ [NC]
RewriteCond %{DOCUMENT_ROOT}/%2%{REQUEST_URI}/ -d
RewriteRule [^/]$ %{REQUEST_URI}/ [R=301,L]

# Rewrite sub domains.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{HTTP_HOST} !^(www\.)?domain\.com$ [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^\.]+)\.domain\.com$ [NC]
RewriteRule ^(.*)$ /userpages/%2/$1 [QSA,L]

jdMorgan

7:00 pm on Apr 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The %2 in the first rule refers to the subdomain of the hostname tested in the previous RewriteCond. Also note that if a "www." precedes the subdomain, it is discarded for the purposes of this rule. So, if www.foo.example.com/bar or foo.example.com/bar (note no trailing slash) is requested, and that URL-path resolves to an existing directory at <document_root>/foo/bar/, then the request is redirected to add the trailing slash.
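
To see where %2 comes from, the two HTTP_HOST conditions can be replayed against a few hostnames. This is a rough Python model under stated assumptions: example.com is substituted for domain.com, [NC] is modelled with re.IGNORECASE, and the function name percent_2 is made up for illustration.

```python
import re

# The two HTTP_HOST conditions from the quoted rules, with example.com
# substituted for domain.com. [NC] is modelled with re.IGNORECASE.
NOT_MAIN = re.compile(r"^(www\.)?example\.com$", re.IGNORECASE)
SUBDOMAIN = re.compile(r"^(www\.)?([^\.]+)\.example\.com$", re.IGNORECASE)

def percent_2(host: str):
    """Return what %2 would hold for this host, or None if the conditions fail."""
    if NOT_MAIN.search(host):
        # First condition is negated, so the main domain is excluded here.
        return None
    m = SUBDOMAIN.search(host)
    # The last-matched condition supplies %1 (optional "www.") and %2 (subdomain).
    return m.group(2) if m else None

if __name__ == "__main__":
    for host in ("www.foo.example.com", "foo.example.com",
                 "example.com", "www.example.com"):
        print(host, "->", percent_2(host))
```

The negated first condition is what keeps www.example.com from being treated as a subdomain named "www".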

In order to canonicalize the hostname to avoid duplicate-content problems and dilution of PageRank and Link-popularity, I'd suggest changing that first rule to:


RewriteRule [^/]$ http://%2.example.com%{REQUEST_URI}/ [R=301,L]

and following it with:

# canonicalize hostname; remove leading "www" from subdomain & remove any trailing "." or port numbers
RewriteCond %{HTTP_HOST} !^([^\.]+)\.example\.com$
RewriteCond %{HTTP_HOST} ^(www\.)?([^\.]+)\.example\.com
RewriteRule (.*) http://%2.example.com/$1 [R=301,L]

I assume here that you prefer www.example.com to example.com as your canonical domain name. If not, this rule will need to be a bit more complicated.

Note that the path being checked for directory-exists in your first rule is inconsistent with the path in your second rule. If these rules are intended to work together, then the third RewriteCond in the first ruleset should probably also include the "/userpages" path-part, unless that path-part is already defined as part of DocumentRoot:


RewriteCond %{DOCUMENT_ROOT}/userpages/%2%{REQUEST_URI}/ -d

A comment: You will discover after some experience that mod_rewrite is not all that difficult. What is difficult is discovering and understanding what can and should be done with it to avoid or cure problems with search engine indexing of your site, and then making a rule to remember each of these things. In the case here --the canonicalization issue-- my mnemonic is "One Web resource, one URL": any page, image, or object on your site should be directly reachable at one and only one URL. All other 'variant' URLs should be permanently redirected to the single 'correct' URL.

So, for example, if your home page URL resolves to a file named "index.php", then
http://example.com.:80/index.php, which is a perfectly-valid URL, should be redirected to
http://www.example.com/
to avoid trouble with search engine indexing and ranking.
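
The "one resource, one URL" idea can be sketched as a function that maps variant URLs onto the single preferred one. This is a Python illustration, not a replacement for the redirect rules. The preferred hostname, the index file name, and the function name canonical_url are all assumptions, and only plain http on port 80 is handled.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"   # assumed preferred hostname
INDEX_FILES = ("index.php",)         # assumed directory index file names

def canonical_url(url: str) -> str:
    """Map a variant URL to the single URL the resource should be served at."""
    parts = urlsplit(url)
    # Drop the trailing "." of a fully-qualified hostname.
    host = (parts.hostname or "").rstrip(".")
    if host in ("example.com", "www.example.com"):
        host = CANONICAL_HOST
    # Drop the default http port; keep any other explicit port.
    port = parts.port
    netloc = host if port in (None, 80) else f"{host}:{port}"
    # Drop the index file name so the directory URL is used instead.
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))

if __name__ == "__main__":
    print(canonical_url("http://example.com.:80/index.php"))
```

Any URL for which this function returns something different from the input is a variant that should get a 301 redirect to the returned URL.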

Have fun!

Jim