mod rewrite: removing "index.php"

I'm going crazy!

         

tata668

3:18 am on Apr 9, 2009 (gmt 0)

10+ Year Member



I'd like to redirect any request ending with "index.php" to the same URL with "index.php" removed.

Example:

http://example.com/test/index.php
to
http://example.com/test/

I think a rule like this may do the trick (rule #1):


RewriteCond %{REQUEST_URI} (.*)index\.php$ [NC]
RewriteRule (.*) http://example.com%1 [L,R=301]

The problem is that I have another rule that takes any request for a nonexistent file and points it to my site's front controller, /index.php (rule #2):


RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*) /index.php [L]

I've just learned that even though the second rule uses the "[L]" flag, an internal redirect is made and rule #1 is applied again!

The problem is that I don't want to apply rule #1 to requests that have been changed by rule #2; I only want to apply it to requests that originally ended with "index.php". If the user types a URL ending with "index.php", I want rule #1 to be applied, but I don't want it applied to requests internally rewritten to point to "/index.php"!

Any ideas?

Thanks a lot in advance, this drives me crazy!

eeek

4:15 am on Apr 9, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Wouldn't this work?


RewriteRule (.*)/index.php $1/ [L]

The problem is that I have another rule that takes any request for a nonexistent file and points it to my site's front controller: /index.php

Doctor it hurts when I do this...

So change that to redirect to what you really want, i.e. drop the index.php from it.

jdMorgan

4:49 am on Apr 9, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Prevent the first rule from processing previously-internally-rewritten requests by examining THE_REQUEST.
Prevent the second rule from rewriting requests for "/" by requiring at least one character after the slash in the final directory path-part:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php[?]? [NC]
RewriteRule ^(([^/]+/)*)index\.php$ http://example.com/$1 [NC,R=301,L]
#
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+/)*.+$ /index.php [L]

Jim
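
As a rough illustration of why the THE_REQUEST condition helps (the request lines in the comments are made-up examples, not taken from a real log): THE_REQUEST holds the request line exactly as the client sent it, and internal rewrites do not change it.

```apache
# Case 1: the client itself asks for index.php
#   THE_REQUEST = "GET /test/index.php HTTP/1.1"
#   -> the condition matches and the client is redirected to /test/
#
# Case 2: the client asks for /some/page (hypothetical URL) and the
# front-controller rule rewrites it internally to /index.php
#   THE_REQUEST = "GET /some/page HTTP/1.1"
#   -> the condition does not match, so no redirect and no loop
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php[?]? [NC]
RewriteRule ^(([^/]+/)*)index\.php$ http://example.com/$1 [NC,R=301,L]
```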

tata668

2:09 pm on Apr 9, 2009 (gmt 0)

10+ Year Member



eeek:
Almost everything must be redirected to /index.php, since this is the only real .php file in my webroot. It's in charge of the routing.

jdMorgan, I'm almost there! Thanks so much! But I think you mean %1 instead of $1, right?

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php[?]? [NC]
RewriteRule ^(([^/]+/)*)index\.php$ http://example.loc/%1 [NC,R=301,L]

Right? Otherwise I get something like this in the browser (locally on windows):

http://example.loc/index.php
redirects to
http://example.loc/C:/www/example/webroot/

Now, using %1 it almost works..

http://example.loc/index.php
redirects to
http://example.loc/

and

http://example.loc/forum/index.php
redirects to
http://example.loc/forum/

But there is one case that doesn't seem to work!:

http://example.loc/threads/222/index.php
redirects to
http://example.loc/222/

Why is that? I don't think this case would actually happen, since "222" here is not a directory in my framework but the id of an item, so having "index.php" after it doesn't make sense anyway. But I'd like to understand why it doesn't redirect to:
http://example.loc/threads/222/ ?

Here are my complete rules:

# no index.php
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php[?]? [NC]
RewriteRule ^(([^/]+/)*)index\.php$ http://example.loc/%1 [NC,R=301,L]

# no www.
RewriteCond %{HTTP_HOST}//s%{HTTPS} ^www\.(.*)//((s)on|s.*)$ [NC]
RewriteRule ^ http%3://%1%{REQUEST_URI} [L,R=301]

# prevent image hotlinking
RewriteCond %{REQUEST_FILENAME} .*jpg$|.*jpeg$|.*gif$|.*png$ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !.*example\.loc [NC]
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule (.*) /index.php?imghotlinking=%{REQUEST_URI} [L]

# everything to the front controller
RewriteCond %{REQUEST_URI} !\.(gif|jpg|jpeg|png|swf|pdf)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+/)*.+$ /index.php [L]

Your help is really appreciated!

jdMorgan

2:44 pm on Apr 9, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No, I meant $1 as I typed it.

If your server is showing a filepath in the URL as a result, then you have a serious server misconfiguration problem... The most likely cause is that DocumentRoot is not correctly specified in httpd.conf.

If you want to use %1 as a work-around, then add an outer layer of parentheses, exactly as shown in the RewriteRule pattern.

And as a side-note, this line

RewriteCond %{REQUEST_FILENAME} .*jpg$|.*jpeg$|.*gif$|.*png$ [NC]

can be coded much more efficiently as
RewriteCond %{REQUEST_FILENAME} \.(jpe?g|gif|png)$ [NC]

(The ".*" prefixes are just a waste of code space and CPU time. You can make the same pattern improvement in the subsequent image-related RewriteCond as well.)

Also re-code

RewriteCond %{HTTP_HOST}//s%{HTTPS} ^www\.(.*)//((s)on|s.*)$ [NC]

as
RewriteCond %{HTTP_HOST}>s%{HTTPS} ^www\.([^>]+)>((s)on|s.+)$ [NC]

for better performance (the "end-of-host" boundary is specified unambiguously).

Jim
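
A minimal sketch of the outer-parentheses work-around mentioned above, assuming the example.loc host used earlier in the thread. With a repeated group such as ([^/]+/)*, a back-reference keeps only the last repetition, so %1 needs its own group around the whole repeated part.

```apache
# %1 is now the outer group: the complete directory path, e.g. "threads/222/"
# %2 is the inner repeated group and only holds the last repetition, e.g. "222/"
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(([^/]+/)*)index\.php[?]? [NC]
RewriteRule ^(([^/]+/)*)index\.php$ http://example.loc/%1 [NC,R=301,L]
```

The RewriteRule pattern is unchanged here; only the condition gains a pair of parentheses and the substitution reads %1 instead of $1.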

tata668

3:09 pm on Apr 9, 2009 (gmt 0)

10+ Year Member




If your server is showing a filepath in the URL as a result, then you have a serious server misconfiguration problem... The most likely cause is that DocumentRoot is not correctly specified in httpd.conf.

Hummm...

Here's my actual vhost config, if you don't mind checking it for errors... Your help is much appreciated!

The general DocumentRoot outside my vhost, in httpd.conf, is:


DocumentRoot "C:/_julien/www"

The vhost:


<VirtualHost 127.0.0.1:80>
DocumentRoot "C:\_julien\www\bangproject\webroot"
ServerName bang.loc
ServerAlias www.bang.loc

Alias /bang/dynjs/ "C:/_julien/www/bangproject/BangWritable/dynjs/bang/"
Alias /bang/dyncss/ "C:/_julien/www/bangproject/BangWritable/dyncss/bang/"
Alias /bang/ "C:/_julien/www/bangproject/Bang/webroot/"
Alias /dynjs/ "C:/_julien/www/bangproject/BangWritable/dynjs/"
Alias /dyncss/ "C:/_julien/www/bangproject/BangWritable/dyncss/"

<Directory ~ "C:/_julien/www/bangproject/(webroot|Bang/webroot|BangWritable/dyncss|BangWritable/dynjs)">

# no directory listing
Options All -Indexes

# UTF-8
AddDefaultCharset UTF-8

<IfModule mod_rewrite.c>

RewriteEngine on

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php[?]? [NC]
RewriteRule ^(([^/]+/)*)index\.php$ [bang.loc...] [NC,R=301,L]

RewriteCond %{HTTP_HOST}>s%{HTTPS} ^www\.([^>]+)>((s)on|s.+)$ [NC]
RewriteRule ^ http%3://%1%{REQUEST_URI} [L,R=301]

RewriteCond %{REQUEST_FILENAME} \.(jpe?g|gif|png)$ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !.*bang\.loc [NC]
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule (.*) /index.php?imghotlinking=%{REQUEST_URI} [L]

RewriteCond %{REQUEST_FILENAME} \.(jpe?g|gif|png|pdf)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+/)*.+$ /index.php [L]

</IfModule>
</Directory>
</VirtualHost>

Any other ideas about what could cause that filepath to appear in the rewriting?

Thanks a lot.

jdMorgan

3:32 pm on Apr 9, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The Alias paths should probably be URL-paths, not full system filepaths. See Apache mod_alias documentation for additional relevant info and warnings (see "trailing slash" notes).

Also, since your mod_rewrite code is going into a server config file, URL-path patterns should start with a slash (unlike in .htaccess):

 RewriteRule ^(([^/]+/)*)index\.php$ http://bang.loc/$1 [NC,R=301,L] 

should be
RewriteRule ^/(([^/]+/)*)index\.php$ http://bang.loc/$1 [NC,R=301,L]

Also, there is no need for the <IfModule mod_rewrite.c> container, unless you want this config file to fail silently if moved to a server which does not have mod_rewrite loaded.

Jim
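
For comparison, a sketch of where the leading slash applies, assuming the redirect rule is placed directly in the <VirtualHost> (per-server context) rather than inside a <Directory> section or .htaccess file. In per-server context the pattern is matched against the URL-path, which begins with a slash; in per-directory context the leading slash is not part of the matched string.

```apache
<VirtualHost 127.0.0.1:80>
    ServerName bang.loc
    DocumentRoot "C:/_julien/www/bangproject/webroot"

    RewriteEngine on

    # Per-server context: the pattern sees "/forum/index.php", so it starts with "^/"
    RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php[?]? [NC]
    RewriteRule ^/(([^/]+/)*)index\.php$ http://bang.loc/$1 [NC,R=301,L]
</VirtualHost>
```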

tata668

4:22 pm on Apr 9, 2009 (gmt 0)

10+ Year Member



jdMorgan,

When using


RewriteRule ^/(([^/]+/)*)index\.php$ [bang.loc...] [NC,R=301,L]

[bang.loc...]
no longer redirects to:
[bang.loc...]

---

I've read the mod_alias documentation and I don't see what I could change in my aliases. Their example [httpd.apache.org] is:


Alias /image /ftp/pub/image

# A request for [myserver...] would cause the server to return the file /ftp/pub/image/foo.gif.


How is this different from this, on windows:

Alias /bang/dynjs/ "C:/_julien/www/bangproject/BangWritable/dynjs/bang/"

?

I read the note about the trailing slash. It seems to be ok to use it for me since there will always be something after the "/bang/dynjs/".

Caterham

4:48 pm on Apr 9, 2009 (gmt 0)

10+ Year Member



Any other idea of what can cause that filepath to appear in the rewriting?

The stripping of the per-directory prefix from r->filename (that is the translated physical path) might not work (=unsupported) in combination with a regEx'ed <Directory> section; hence your rule-pattern would match against the full filename instead of just a local filepath.

tata668

7:17 pm on Apr 9, 2009 (gmt 0)

10+ Year Member



Caterham, I'm really trying, but I don't understand what you're saying. I find all this rewriting technology hard to understand!

But I think I did find a version of my "no index.php" rule that works for me (based on ideas from jdMorgan):


RewriteCond %{THE_REQUEST} ^[A-Z]+\ (.*)/index\.php[?]? [NC]
RewriteRule (.*) [bang.loc%1...] [NC,R=301,L]

I don't understand everything, such as why (.*) is replaced by a local path in this RewriteRule. But at least by using the %1 from THE_REQUEST I'm able to achieve what I want!

Thanks again, everybody.

I'm still listening carefully if you have more tips/comments.

Caterham

8:58 pm on Apr 9, 2009 (gmt 0)

10+ Year Member



I don't understand everything, like why (.*) is replaced by a local path in this RewriteRule rule.

In directory context (<Directory> sections, .htaccess files, etc.), your rule-pattern is not matched against a URL-path.

You're matching against a part of a physical path. The "part" of a path like C:/_julien/www/bangproject/Bang/webroot/something is created by removing its prefix (and adding a path-info component, if such a component was left by the directory walk). The prefix would be C:/_julien/www/bangproject/Bang/ if you'd place the rules into <Directory "C:/_julien/www/bangproject/Bang"> or a .htaccess file C:/_julien/www/bangproject/Bang/.htaccess. The part of the path tested against your rule would be webroot/something in that case.

But since you use a regular expression rather than such a "plain" directory section, nothing is stripped from the full path at all. Hence your rule-pattern matches against the full path, and if you reference the matched pattern ($1) in your substitution, you're inserting the full path.
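
One way to picture this, as a sketch only: with a plain (non-regex) <Directory> section, mod_rewrite can strip the directory prefix, so $1 holds just the local path. The single directory path below is an assumption chosen for illustration; the original config covers several directories with one regular expression.

```apache
<Directory "C:/_julien/www/bangproject/webroot">
    # Per-directory rewriting requires FollowSymLinks (or SymLinksIfOwnerMatch)
    Options +FollowSymLinks
    RewriteEngine on

    # The prefix "C:/_julien/www/bangproject/webroot/" is removed before matching,
    # so a request for /forum/index.php is tested as "forum/index.php" and $1 = "forum/"
    RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php[?]? [NC]
    RewriteRule ^(([^/]+/)*)index\.php$ http://bang.loc/$1 [NC,R=301,L]
</Directory>
```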

[edited by: Caterham at 9:01 pm (utc) on April 9, 2009]

tata668

9:23 pm on Apr 9, 2009 (gmt 0)

10+ Year Member



Ok, I think I get it!

Does this mean I could then remove that prefix myself, using a regular expression in the RewriteRule? A regex that takes into account that the prefix is still present?

I'm asking because I'm curious but I think I'll stay with my current solution that works great.

g1smd

9:25 pm on Apr 9, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You could remove it if present. Make sure that the bit to be removed is made optional, so that when the underlying configuration error is fixed, the rule will still work.
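
A sketch of what such an optional prefix could look like, assuming the unstripped path is C:/_julien/www/bangproject/webroot/ (taken from the vhost posted earlier; the other directories matched by the <Directory ~ ...> expression would each need their own prefix). This is untested.

```apache
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.php[?]? [NC]
# The non-capturing group (?:...)? swallows the filesystem prefix when it is present
# and matches nothing when it has already been stripped, so $1 is the local path either way
RewriteRule ^(?:C:/_julien/www/bangproject/webroot/)?(([^/]+/)*)index\.php$ http://bang.loc/$1 [NC,R=301,L]
```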

tata668

9:34 pm on Apr 9, 2009 (gmt 0)

10+ Year Member



But then again I don't get where my configuration error might be! Everything is working fine...

Caterham

11:51 pm on Apr 9, 2009 (gmt 0)

10+ Year Member



Does this mean I could then remove that prefix by myself, using a regular expression in the RewriteRule?

Yes. But you can stay with your rule. What you should not do is copy parts of the request line into the substitution of an internal rewrite (THE_REQUEST is not URL-decoded and not normalized).

But then again I don't get where my configuration error might be!

You can't fix that (mod_rewrite not removing the per-dir prefix automatically) unless you stop using a directory container with a regular expression.

tata668

12:06 am on Apr 10, 2009 (gmt 0)

10+ Year Member



Thank you Caterham!