Apache Web Server Forum

    
htaccess non-www redirect and subdir
non-www redirect in root affecting subdir's rewrites
Inspired




msg:3403985
 11:54 am on Jul 25, 2007 (gmt 0)

Hey all,

I have some non-www to www code in the root htaccess of a domain. I have now added a directory at www.example.com/dir/ which has its own .htaccess to load the index for the dir from another location.

The problem is that when visiting http://example.com/dir/, the URL is rewritten to include the www AND the real location of the index, e.g. www.example.com/dir.php

Here is my code in the dir:

RewriteEngine On
RewriteBase /dir/
RewriteRule ^$ /dir.php

any suggestions?

[edited by: Inspired at 11:59 am (utc) on July 25, 2007]

 

jdMorgan




msg:3404188
 2:47 pm on Jul 25, 2007 (gmt 0)

You need to be sure that the www-non-www redirect happens first, followed by the internal rewrite for your dir.php file.

If you rewrite to dir.php first, then the redirect will "expose" the internal rewrite, which is what you report happening here.

Make sure that you use mod_rewrite for both functions; do not use mod_alias for the redirect and mod_rewrite for the internal rewrite. If you do that, you cannot control the order in which they will execute, and this kind of problem can occur.

Jim
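
A minimal sketch of the ordering Jim describes, assuming both rules live in the root .htaccess and that /dir/ has no .htaccess of its own (the setup in this thread uses a separate file in /dir/, so treat this as an illustration rather than the poster's configuration):

```apache
RewriteEngine On

# 1. External redirect first: send non-www requests to the www host.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 2. Internal rewrite second: serve /dir/ from /dir.php.
#    There is no R flag, so the visible URL stays /dir/.
RewriteRule ^dir/$ /dir.php [L]
```

Because the redirect is listed first and both rules belong to mod_rewrite, a request for http://example.com/dir/ is redirected before the rewrite to dir.php can happen. A request for /dir without the trailing slash is not covered by this sketch.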

Inspired




msg:3404205
 3:00 pm on Jul 25, 2007 (gmt 0)

That's what I figured was wrong, so I've copied the non-www code over to the dir htaccess, as follows:

RewriteEngine On
RewriteBase /dir/

RewriteCond %{HTTP_HOST} ^example.(.*)
RewriteRule ^(.*)$ http://www.example.com/dir/$1 [R=301,L]

RewriteRule ^$ /dir.php

Works great, except when the trailing slash isn't included (e.g. www.example.com/dir), which ends up revealing the server's local path to the file within the URL.

any other suggestions?

jdMorgan




msg:3404343
 4:34 pm on Jul 25, 2007 (gmt 0)

Is this a commercially-hosted Apache 1.x server? It sounds like your LoadModule list is not correctly ordered.

In the present case, it is mod_dir that is invoking a redirect to add the missing trailing slash, again exposing your previously-executed internal rewrite.

You can try moving all this code to root /.htaccess, or add a redirect rule *above* the www-non-www redirect to add the trailing slash if it's missing.

Two ways to do that: If your site URLs are well-organized, only filenames contain periods, and all filenames contain periods, then if a URL-path has no period and does not end with a slash, add one.

If, on the other hand, you have some filenames without periods, or some of your directory URL-paths contain periods, then you'll have to use a RewriteCond "directory exists" check to see if the URL-path refers to a directory, and if so, add a trailing slash if missing. Unfortunately, this requires a call to the file system, and so is very slow compared to the first method.

Jim
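
The two approaches might look roughly like this. It is a sketch that assumes the rules sit in the root .htaccess ahead of the www redirect; using %{REQUEST_URI} in the target is a choice made here for illustration, not something stated in the post above:

```apache
RewriteEngine On

# Method 1: cheap string tests only.
# Applies when the URL-path contains no period and does not end with a slash.
RewriteCond %{REQUEST_URI} !\.
RewriteCond %{REQUEST_URI} !/$
RewriteRule .* http://www.example.com%{REQUEST_URI}/ [R=301,L]

# Method 2 (shown commented out; use it instead of Method 1, not alongside it):
# the -d test asks the file system whether the request maps to a real
# directory, which is slower than the string tests above.
#   RewriteCond %{REQUEST_URI} !/$
#   RewriteCond %{REQUEST_FILENAME} -d
#   RewriteRule .* http://www.example.com%{REQUEST_URI}/ [R=301,L]
```

Method 1 will wrongly add a slash to any file whose name has no period, which is why it only suits sites that follow the naming convention Jim describes.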

Inspired




msg:3404362
 5:00 pm on Jul 25, 2007 (gmt 0)

Thanks Jim,

It's an Apache 2.0 server. Would it be easier to correct the ordering of the LoadModule list to solve the issue?

If not, I would try to add the trailing slash within the subdir htaccess.

jdMorgan




msg:3404382
 5:19 pm on Jul 25, 2007 (gmt 0)

No, I specified Apache 1.x because Apache 1.x uses the reverse order of the LoadModule list to determine execution order, and this "reverse order" thing trips up many server admins.

For example, when adding PHP, many times they just figure, "Well, I'll add it at the end -- That should be safe." Then they wonder why mod_auth (authorization/authentication), mod_rewrite, and all the others don't seem to work for PHP files... It's because PHP executes first, and goes straight to the content-delivery API phase, by-passing execution of all the other modules.

On Apache 2.x, module execution order is controlled by an internal priority scheme, and can't be changed unless you modify the Apache source and re-compile it.

To be very clear about all of the above: Each Apache module parses your .htaccess file(s) in turn, looking for directives that it understands and handles. So directives handled by any one specific module are executed in the order they appear in your .htaccess file. But you cannot control the order in which different modules' directives execute by specifying their order in .htaccess -- the module execution order determines that.

So, for example, it makes no difference if you place all mod_alias directives first in your .htaccess file, followed by all mod_rewrite directives, or intersperse or reverse these directive groups instead; the server will execute all of one or the other module's directives first, as determined by the reverse LoadModule list order on Apache 1.x, or by the internal priority scheme on Apache 2.x.

Jim
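
To illustrate the point about module order, here is a sketch with hypothetical file names (old.html and new.html are made up for the example). In the mixed version, moving the Redirect line above or below the RewriteRule line changes nothing about which module acts first; the mod_rewrite-only version keeps the order under your control:

```apache
# Mixed modules (shown commented out): Redirect belongs to mod_alias and
# RewriteRule to mod_rewrite, so their relative position in the file does
# not decide which one is applied first.
#   Redirect 301 /old.html http://www.example.com/new.html
#   RewriteRule ^dir/$ /dir.php [L]

# mod_rewrite only: rules are evaluated top to bottom, as written.
RewriteEngine On
RewriteRule ^old\.html$ http://www.example.com/new.html [R=301,L]
RewriteRule ^dir/$ /dir.php [L]
```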

Inspired




msg:3404393
 5:46 pm on Jul 25, 2007 (gmt 0)

Thanks for the detailed explanation, Jim.

After ruling out the Apache fix, I tried adding the trailing slash within the subdir htaccess, which ends up redirecting to http://www.example.com//home/user/example.com/public_html/dir/

At this point I'm not really sure what to do; this is a rather complex solution for a seemingly simple issue.

jdMorgan




msg:3404435
 6:06 pm on Jul 25, 2007 (gmt 0)

Just curious: Have you tried your code without the "RewriteBase"? It's not usually needed (I have never had to use it on any commercially-hosted server), and as shown by your redirect-gone-wrong, if it were actually needed, it would normally read something like:

RewriteBase /home/user/example.com/public_html

Correct your double-slash problem (as shown in the "bad" URL), and try commenting out the RewriteBase directive.

Jim

Inspired




msg:3404452
 6:38 pm on Jul 25, 2007 (gmt 0)

I made both of those corrections, but to no avail. Here is the whole subdir htaccess file:

RewriteEngine On

# trailing slash
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.example.com$1/ [L,R=301]

# non-www
RewriteCond %{HTTP_HOST} ^example.(.*)
RewriteRule ^(.*)$ [example...] [R=301,L]

RewriteRule ^$ /dir.php

I didn't add a check for periods in the filename yet, but I guess that shouldn't stop it from working.

sc112




msg:3404470
 7:06 pm on Jul 25, 2007 (gmt 0)

Is this what you really have:

RewriteRule ^(.*)$ [example...] [R=301,L]

(example.com)

jdMorgan




msg:3404473
 7:13 pm on Jul 25, 2007 (gmt 0)

Well, just in case there's something funny lurking here, please try this version which mostly differs only in coding style:

# trailing slash
# If URL-path does not contain a period or end with a slash
RewriteCond %{REQUEST_URI} !(\.|/$)
# add a trailing slash
RewriteRule (.*) http://www.example.com/$1/ [R=301,L]
#
# Redirect non-www domain requests to www domain
RewriteCond %{HTTP_HOST} ^example\.
RewriteRule (.*) http://www.example.com/dir/$1 [R=301,L]
#
# Internally rewrite requests for index page to dir.php
RewriteRule ^$ dir.php [L]

Make sure the alternation in the first RewriteCond pattern uses a solid pipe character "|" before trying this; posting on this forum converts solid pipes to broken pipe "¦" characters.

Step #2 would be to add "RewriteOptions none" at the top. This will prevent your root .htaccess file from affecting this subdirectory (by disabling "RewriteOptions inherit"). Even if it has other negative side effects, finding out whether this helps with the server filepath exposure problem might be worthwhile.

Also, please be verbose about how you are testing, what specific results you get, and how those differ from what you expected. I can't look over your shoulder here and see what you're doing or seeing. I'm basically trying to watch the football match through a tiny hole in the side-wall of the stadium...

Jim

[edited by: jdMorgan at 7:14 pm (utc) on July 25, 2007]

Inspired




msg:3404499
 7:45 pm on Jul 25, 2007 (gmt 0)

thanks again jim,

I just tried your version, but this again resulted in the full server filepath being exposed with the double slash issue. I resolved the double slash issue by changing the first RewriteRule (removing the slash after the TLD) to:

RewriteRule (.*) http://www.example.com$1/ [R=301,L]

This still caused the filepath to be exposed when there was no trailing slash, although it works fine when there is no www but the last slash is included.

Adding "RewriteOptions none" at the top causes all requests to return a 500 Internal Server Error for some reason.

Not sure if this is related, but the root htaccess is rather bulky, around 600 lines or so of rewrites and redirects.

g1smd




msg:3404628
 9:39 pm on Jul 25, 2007 (gmt 0)

I have had this same problem recently. In the new code I tested for the presence of any extra path information in the rewrite, and then left that stuff out of the final rewritten or redirected target path.

See: [webmasterworld.com...] and [webmasterworld.com...] for more.

Inspired




msg:3405346
 3:25 pm on Jul 26, 2007 (gmt 0)

FINALLY, I've figured this out.

I thought that the solution was getting a little too complicated for such a seemingly trivial issue, so I started over again trying to simplify things.

The solution to not revealing the internally rewritten URL was to use %{REQUEST_URI} instead of the regex backreference $1, which was being replaced with the internal path.

I placed the following code in the subdir htaccess:

RewriteEngine On
# Redirect non-www domain requests to www domain
RewriteCond %{HTTP_HOST} ^example\.
RewriteRule (.*) http://www.example.com%{REQUEST_URI} [R=301,L]

This takes care of the 301 for non-www or no trailing slash, or both. No other code was needed in the subdir or the root htaccess.

Thanks for your help everyone!

Edit:
I forgot to mention that the downside to this solution is a double redirect in the case of a non-www request that's also missing the trailing slash.

[edited by: Inspired at 3:48 pm (utc) on July 26, 2007]

jdMorgan




msg:3405471
 5:55 pm on Jul 26, 2007 (gmt 0)

Well in that case, combining what you've found with the code above, you could try this to avoid the double redirect:

# trailing slash
# If URL-path does not contain a period or end with a slash
RewriteCond %{REQUEST_URI} !(\.|/$)
# add a trailing slash
RewriteRule .* http://www.example.com%{REQUEST_URI}/ [R=301,L]
#
# Canonicalize the domain
# Redirect non-www domain requests to www domain
RewriteCond %{HTTP_HOST} ^example\.
RewriteRule .* http://www.example.com%{REQUEST_URI} [R=301,L]

Thanks for posting your findings!

Jim

[edited by: jdMorgan at 5:56 pm (utc) on July 26, 2007]

g1smd




msg:3405537
 7:11 pm on Jul 26, 2007 (gmt 0)

Try to avoid the double redirect.

Leaving a double redirect in place can cause many indexing issues.

Inspired




msg:3406108
 9:50 am on Jul 27, 2007 (gmt 0)

Thanks Jim, your last version does the job perfectly. All requests, including "http://example.com/dir", are handled in a single redirect.

I wanted to avoid having multiple redirects, as g1smd mentioned, but I just noticed that Apache does a double redirect for an actual subdir containing a static index.html with a "http://example.com/dir" request. That dir doesn't have a htaccess of its own; the first redirect is the canonicalization code in the root htaccess, followed by another Apache redirect adding the trailing slash... but that's probably fodder for another discussion.

Once again, thanks to everyone for their help.

jdMorgan




msg:3406388
 4:11 pm on Jul 27, 2007 (gmt 0)

That should not be happening. But I did forget to post the standard warning:

Change all broken pipe "¦" characters in the code above to solid pipes before use; Posting on this forum modifies the pipe characters.

If that doesn't help, then adding an .htaccess file to the subdir with "RewriteOptions inherit" in it may help.

Jim

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved