Unwanted ".php" extension added after 301 redirect

mod_negotiation and domain redirect interfere with extensionless filenames

         

Lucas

1:23 am on Dec 20, 2006 (gmt 0)

10+ Year Member



My problem is that when people go to
http://www.example.com/page
they get redirected to
http://example.com/page.php
instead of
http://example.com/page

How do I fix this?

I am currently redirecting visitors to my site from http://www.example.com/ to http://example.com/ by adding the code below to my .htaccess file:

RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

I also have my .htaccess configured to hide .php and .html extensions. That is, people can access the content at http://example.com/page.php by going to http://example.com/page; my .htaccess for this reads:

AddHandler x-httpd-php5 php
AddType "text/html" html

jdMorgan

2:33 am on Dec 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There's more to this than just the code you've posted. The AddType/AddHandler code is not sufficient to "map" your extension-less URLs to either .html or .php files.

So the question is, do you have any other rules in httpd.conf, conf.d, or .htaccess files that rewrite extensionless URLs to either .html or .php files?

If not, then this action is likely being taken by mod_negotiation, which is invoked prior to your domain redirect. If mod_negotiation changes the URL and then mod_rewrite does the domain canonicalization redirect, the file extensions will be "exposed" as you report.
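One way to check whether MultiViews is doing the mapping is a quick diagnostic, sketched below. It assumes the host permits Options overrides in .htaccess (AllowOverride Options); where that is not allowed, the directive may cause a 500 error instead.

```apache
# Diagnostic sketch only: temporarily turn off MultiViews in .htaccess.
# Assumes "AllowOverride Options" (or All) is in effect for this directory.
Options -MultiViews
```

With this in place and no other extension-mapping rules, a request for http://example.com/page should return a 404 if MultiViews had been supplying the .php or .html file. If the page still loads, something else is performing the mapping.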

A solution is to disable content-negotiation and replace its extensionless-URL-mapping action with some (relatively) simple mod_rewrite code, such that the canonicalization redirects take place before the extensionless-URL-to-filename mapping is done. This allows the latter function to be done "silently."

Before we go down that road, though, we might want to explore any other functions you may be using content-negotiation for.

See the Apache mod_negotiation documentation for background info if needed.

Jim
[edit] Speling [/edit]

[edited by: jdMorgan at 2:35 am (utc) on Dec. 20, 2006]

Lucas

2:56 am on Dec 21, 2006 (gmt 0)

10+ Year Member



I am using only .htaccess; here are its contents:
AddDefaultCharset utf-8
AddType "application/rdf+xml" rdf
AddType "application/xhtml+xml" xhtml
AddType "application/xml" xml
AddType "image/gif" gif
AddType "image/jpeg" jpg
AddType "image/png" png
AddType "image/svg+xml" svg
AddType "image/tiff" tif
AddType "text/css" css
AddType "text/html" html
AddType "text/javascript" js
AddType "text/plain" txt
AddType "text/xml" xsl
AddType "video/mpeg" mpg
AddHandler x-httpd-php5 php
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]
RewriteRule ^sitemap.xml?$ /sitemap.php
Redirect 301 /contact http://example.com/about
ErrorDocument 403 403.php
ErrorDocument 404 404.php

jdMorgan

3:57 am on Dec 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As discussed above, this modified version disables MultiViews and replaces its extensionless filename mapping function. However, this is the only function replaced; if you are relying on MultiViews for other purposes, this code will not support them, and your site may break.

Also, some server configurations are 'unique' and the use of %{DOCUMENT_ROOT} below may or may not work properly on your server without some tweaking of the actual path being tested.

In short, use at your own risk and test thoroughly. Warranty expires 0358 GMT 21-Dec-2006.


AddDefaultCharset utf-8
#
AddType "application/rdf+xml" rdf
AddType "application/xhtml+xml" xhtml
AddType "application/xml" xml
AddType "image/gif" gif
AddType "image/jpeg" jpg
AddType "image/png" png
AddType "image/svg+xml" svg
AddType "image/tiff" tif
AddType "text/css" css
AddType "text/html" html
AddType "text/javascript" js
AddType "text/plain" txt
AddType "text/xml" xsl
AddType "video/mpeg" mpg
#
AddHandler x-httpd-php5 php
#
ErrorDocument 403 /403.php
ErrorDocument 404 /404.php
#
# Disable MultiViews to fix problem with domain canonicalization
# redirects exposing the internal script URL-paths.
Options -MultiViews
RewriteEngine on
#
# Redirect contact requests
RewriteRule ^contact(/.*)?$ http://example.com/about$1 [R=301,L]
#
# Redirect all requests to canonical domain name (non-"www.")
RewriteCond %{HTTP_HOST} ^www\.example\.com(:[0-9]+)?$ [NC]
RewriteRule (.*) http://example.com/$1 [R=301,L]
#
# Rewrite sitemap.xml requests to sitemap.php script
RewriteRule ^sitemap\.xml$ /sitemap.php [L]
#
# Map any extensionless URL to php file if the corresponding php file exists
# (This replaces the similar functionality provided by MultiViews, now disabled above)
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(([^/]+/)*[^/.]+)/?$ /$1.php [L]
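
If %{DOCUMENT_ROOT} does not resolve to the real filesystem location on a given server, the file-exists test in the last rule can be written with a literal path instead. The following is a sketch under that assumption; /home/username/public_html is a hypothetical path and must be replaced with the actual directory that holds the site's files.

```apache
# Sketch: same mapping rule, but testing a literal filesystem path.
# "/home/username/public_html" is a placeholder, not a real value.
RewriteCond /home/username/public_html/$1.php -f
RewriteRule ^(([^/]+/)*[^/.]+)/?$ /$1.php [L]
```

The RewriteRule itself is unchanged; only the string tested by RewriteCond differs.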

Jim

Lucas

1:42 am on Dec 22, 2006 (gmt 0)

10+ Year Member



AMAZING! WE'RE SO CLOSE! THANK YOU!

But, now I'm getting 404 errors for non-html and non-php files without extensions:

http://example.com/print
used to show me the file at
http://example.com/print.css
but now I get my error page.

What do you think?

jdMorgan

3:04 am on Dec 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, the alternatives are to explicitly test for each filetype in turn, which is fairly expensive in terms of performance, or to use named filetypes for objects such as images, css files, external Javascripts, etc. -- stuff that is referenced or included by pages, but not intended to be accessed directly by their unique URLs or listed in search engine results.

A third alternative is to move all such files into subdirectories by filetype -- the subdirectory name would then serve in the role of the filetype identifier. I'm sure there are many other clever methods.
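
As a rough sketch of the subdirectory idea, assuming a hypothetical layout in which stylesheets live under /css/ and scripts under /js/ (neither directory is part of the site as posted), each type needs only one rule and no filesystem test:

```apache
# Sketch: the subdirectory name identifies the filetype.
# /css/ and /js/ are assumed directory names for illustration.
# /css/print -> /css/print.css
RewriteRule ^css/([^/.]+)$ /css/$1.css [L]
#
# /js/menu -> /js/menu.js
RewriteRule ^js/([^/.]+)$ /js/$1.js [L]
```

These would go ahead of the generic extensionless-to-.php rule. A request for a file that does not exist simply ends in a 404; a RewriteCond with -f can be added in front of each rule if a different fallback is wanted.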

Otherwise, it's down to a series of rules like this:


RewriteCond %{DOCUMENT_ROOT}/$1.gif -f
RewriteRule ^(([^/]+/)*[^/.]+)/?$ /$1.gif [L]
#
RewriteCond %{DOCUMENT_ROOT}/$1.jpg -f
RewriteRule ^(([^/]+/)*[^/.]+)/?$ /$1.jpg [L]
#
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(([^/]+/)*[^/.]+)/?$ /$1.php [L]
#
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(([^/]+/)*[^/.]+)/?$ /$1.html [L]
#
RewriteCond %{DOCUMENT_ROOT}/$1.css -f
RewriteRule ^(([^/]+/)*[^/.]+)/?$ /$1.css [L]
#
RewriteCond %{DOCUMENT_ROOT}/$1.js -f
RewriteRule ^(([^/]+/)*[^/.]+)/?$ /$1.js [L]

This is essentially what mod_negotiation does, and neither this nor mod_negotiation is very efficient, because we are explicitly 'searching' the filesystem for each filetype in turn.
For this reason, I prefer the approach outlined above: When an object has a specific and unique MIME-type, I use a named filetype. In case that's not clear, an example would be that .html pages, .shtml, and .php scripts used to generate HTML pages all return a MIME-type of "text/html". And so it makes sense to go 'extensionless' on such files. But a .gif is always a .gif, and a .css is always a .css file. So for those, I'd just go with files named .gif and .css respectively, since there is no real benefit in 'hiding' those filetypes...

You can use code like the above, and add filetypes as needed. But I recommend that you consider it to be a transitional solution. Be sure to check your logs/stats, and put the rules in order from most-requested filetype to least-requested filetype for the sake of efficiency. Filesize doesn't matter; order the rules only by number of requests for each type. Also, if possible, minimize the number of filetypes used -- For example, if you have both .htm and .html pages, rename them all to one or the other as soon as possible.

Jim

Lucas

4:55 pm on Jan 7, 2007 (gmt 0)

10+ Year Member



Thanks for your help! The redirects are working properly.

However, my favicon stopped working. I had been serving a .png as a favicon from http://example.com/favicon.ico.png, without explicitly referring to it with code. That is, there was no <link href="/favicon.ico" rel="shortcut icon"> in the <head>.

How do I:

  1. serve that same .png (not an .ico) as before;
  2. serve the file from http://example.com/favicon.ico;
  3. avoid content-negotiation problems with Internet Explorer;
  4. and (if possible) avoid browser-sniffing?

jdMorgan

6:22 pm on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Something like:

RewriteRule ^favicon\.ico$ /favicon.ico.png [L]

added at the end should do it.

Jim

Lucas

2:41 am on Jan 10, 2007 (gmt 0)

10+ Year Member



Thanks so much!