Clean Urls Without Mod Rewrite

Clean URLs not Working Properly on GoDaddy Grid

         

Steven Davis

3:51 pm on Aug 23, 2009 (gmt 0)

10+ Year Member



My website is currently hosted on the GoDaddy Grid, and they are doing something that forces extensionless URLs to look for a .html version of the page. However, my files are written in PHP, and I was using the following code in .htaccess:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ $1.php [L,QSA]

to default extensionless URLs to PHP, but their configuration seems to be ignoring these directives. First, how are they setting the default file type? Second, is there a way I can use a ForceType or SetType Apache directive in .htaccess to override their default file type?

jdMorgan

3:56 pm on Aug 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What is the result if you do this:

Options +FollowSymLinks -MultiViews
RewriteEngine on
#
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ $1.php [L]

Jim

Steven Davis

4:00 pm on Aug 23, 2009 (gmt 0)

10+ Year Member



It works! Thank you so much. Can you explain what "Options +FollowSymLinks -MultiViews" means or does?

Steven Davis

4:28 pm on Aug 23, 2009 (gmt 0)

10+ Year Member



The reason I ask exactly what Options +FollowSymLinks -MultiViews does is that, although it works, it now somehow prevents this from working:

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^(.+)/$
RewriteRule ^(.+)/$ /$1 [R=301,L]

This rewrite is meant to strip any trailing slashes from the end of our urls. So I have seen some explanations of FollowSymLinks on this forum, but I do not understand how it would interfere with the expression listed above.

Steven Davis

6:08 pm on Aug 23, 2009 (gmt 0)

10+ Year Member



Sorry for making multiple posts, but it seems I've figured this out on my own. Evidently some .htaccess commands are sensitive to written order so that placing

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ $1.php [L,QSA]

before the

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^(.+)/$
RewriteRule ^(.+)/$ /$1 [R=301,L]

does not work when Options +FollowSymLinks -MultiViews is on. However, if you switch them so that the trailing-slash rule comes first, then the rules work.

I did not realize that .htaccess was sensitive to written order.

jdMorgan

6:28 pm on Aug 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



For best 'portability', always specify a canonical URL in redirect rules.

Also, you only need one RewriteEngine on directive, placed before any other mod_rewrite directive.

And because you only care about slashes at the end of the requested URL-path, you can eliminate the second RewriteCond here; it's entirely redundant. So your last rule posted above becomes:


RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://www.example.com/$1 [R=301,L]

Options +FollowSymLinks or Options +SymLinksIfOwnerMatch is required to enable mod_rewrite in .htaccess files (see Apache mod_rewrite documentation).

Options -MultiViews disables content-negotiation, which can badly interfere with mod_rewrite by unexpectedly rewriting requested URLs itself (see Apache mod_negotiation for details). It also costs CPU time. So, if you don't need it, turn it off.
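As a rough illustration of the two Options variants mentioned above (a sketch only, assuming the host allows Options overrides in .htaccess; where it does not, the Options line itself typically produces a 500 error):

# Variant 1: follow symbolic links, disable content negotiation
Options +FollowSymLinks -MultiViews
#
# Variant 2: use instead of variant 1 where the host forbids FollowSymLinks
# Options +SymLinksIfOwnerMatch -MultiViews
#
# One RewriteEngine directive, ahead of every RewriteCond/RewriteRule
RewriteEngine on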

Jim

g1smd

7:31 pm on Aug 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Let's see your full final code to make sure you got all the corrections fixed up...

Steven Davis

7:50 pm on Aug 23, 2009 (gmt 0)

10+ Year Member



The full Options +FollowSymLinks -MultiViews seemed to be necessary as the commands did not work without it, but here is what I'm now using. If corrections are needed please let me know.

### Canonical URL ###
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

### Remove Trailing Slash ###
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^(.+)/$
RewriteRule ^(.+)/$ /$1 [R=301,L]

### Remove Double Slashes ###
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]

### Reroute Extensionless File Request to a PHP file ###
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ $1.php [L,QSA]

### Remove PHP if the user adds it ###
RewriteCond %{THE_REQUEST} ^GET\ /([^/]+/)*[^.]+\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.php$ http://www.example.com/$1 [R=301,L]

### Disallow Image Hotlinking ###
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(.+\.)?example\.com/ [NC]
RewriteRule .*\.(jpe?g|gif|bmp|png)$ - [F]

jdMorgan

8:28 pm on Aug 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Those rules are ordered almost completely backwards, and you've apparently ignored what I said about using a canonical URL and optimizing the trailing-slash rule above...

Options +FollowSymLinks -MultiViews
RewriteEngine on
#
### Disallow Image Hotlinking
RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^http://www\.example\.com
RewriteRule \.(jpe?g|gif|bmp|png)$ - [F]
#
### Externally redirect to remove ".php" if the user adds it
RewriteCond %{THE_REQUEST} ^GET\ /([^/]+/)*[^.]+\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.php$ http://www.example.com/$1 [R=301,L]
#
### Externally redirect to remove double slashes
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . http://www.example.com/%1/%2 [R=301,L]
#
### Externally redirect to remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://www.example.com/$1 [R=301,L]
#
### Externally redirect non-canonical hostname requests to canonical
### domain (if not already done by one of the preceding rules)
RewriteCond %{HTTP_HOST} !=www.example.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
#
### Internally rewrite requests for URLs which do not resolve
### to physically-existing files or directories to a PHP file
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ $1.php [L,QSA]

Note that the rules are now in order: external redirects in order from most-specific patterns/conditions to least-specific, followed by internal rewrites, again in order from most-specific to least-specific.

This prevents multiple chained redirects in cases where two or more 'errors' are present in the URL, and prevents 'exposing' your internally rewritten filepaths as URLs. The 403-Forbidden rule goes first, as there's no use wasting time redirecting to correct an unwelcome request.
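To make the chained-redirect point concrete, here is a hedged trace for one hypothetical request, http://example.com/about/, which has both a non-canonical hostname and a trailing slash (assuming "about" is not a directory on disk). The comments describe the expected behaviour, not captured server output:

# Hostname rule first, trailing-slash rule second, with a relative
# substitution (/$1) in the slash rule:
#   1st response: 301 -> http://www.example.com/about/
#   2nd response: 301 -> http://www.example.com/about
#
# Trailing-slash rule first, with the canonical hostname in its substitution:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://www.example.com/$1 [R=301,L]
#   only response: 301 -> http://www.example.com/about
#   (hostname and slash are corrected in a single redirect)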

Jim

Steven Davis

8:59 pm on Aug 23, 2009 (gmt 0)

10+ Year Member



Jim, thank you so much. I reordered the rules as you instructed and now completely understand the logic of doing so. I also changed the trailing-slash rule as instructed. I was not ignoring you the first time; I simply did not understand it, but looking at the proper logic for all the rules together made what you were saying about the trailing slash clearer.

I do have one final question. You were saying that "Options +FollowSymLinks -MultiViews" particularly "-MultiViews" costs CPU time; why is that?

More importantly, what is the difference between "Options +FollowSymLinks -MultiViews" and "Options +FollowSymLinks"? I read the documentation that you suggested but quite frankly found it inscrutable while your explanations are a lot easier to understand.

jdMorgan

9:19 pm on Aug 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The difference is that the former turns off MultiViews/content-negotiation.

In a nutshell, content-negotiation is a fairly-complex function that accepts the client URL request (as does mod_rewrite) and, if the requested URL does not resolve to a physically-existing file or directory, it looks at the user-agent's (browser's or robot's) HTTP Accept headers to see what language and encoding the user-agent has been set to 'prefer.' It then looks at the disk directory, to see what files might be served in response to this URL request, and tries to pick the best-match with the user-agent's preference settings.

In the case where only a single file exists that matches the requested URL in any way, then that is what gets served.

The mod_negotiation module has to do an awful lot of work (and possibly also go read the physical disk drive on the server) to implement all this functionality, so if you don't need all of this complication, it's best to turn it off.

And as can be seen from the description, it too can 'rewrite' URLs to server filepaths, and so can interfere with mod_rewrite -- or preclude it from running in the first place. Note that Apache ships with MultiViews disabled by default, but some hosts turn them on for all users so that they don't have to deal with 'trouble reports' from the few Webmasters who might want to use MultiViews but can't figure out how to use the Options directive to turn them on by themselves.
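A sketch of how that interference might play out, using hypothetical files about.html and about.php in the same directory. Which variant mod_negotiation picks depends on the client's Accept headers and the server's type mappings, so treat the comments as an illustration rather than guaranteed behaviour:

# With MultiViews enabled, a request for /about:
#   no file named "about" exists, so mod_negotiation looks for about.*
#   and may map the request to a variant itself (for example about.html)
#   before the rule below gets a chance to add ".php".
#
# With MultiViews disabled, a request for /about:
#   nothing is negotiated, "about" is neither a file nor a directory,
#   and the rule internally rewrites the request to about.php.
Options +FollowSymLinks -MultiViews
RewriteEngine on
#
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ $1.php [L,QSA]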

Jim

Steven Davis

9:34 pm on Aug 23, 2009 (gmt 0)

10+ Year Member



Thanks again. Your explanation of MultiViews/content-negotiation makes sense and is much clearer than the Apache documentation (apache.org).