Forum Moderators: phranque

How to canonicalize .html to .php and /index.html to "/"

How can I do both at the same time?

         

suzukik

9:46 pm on Jan 23, 2009 (gmt 0)

10+ Year Member



One of my clients wants to canonicalize "/index.html" to "/".
I know I should put something like this in .htaccess:

RewriteCond %{THE_REQUEST} ^.*/index\.html
RewriteRule ^(.*)index\.html$ http://www.example.com/$1 [R=301,L]

He has recently changed the file extension of the pages from .html to .php.
(The new .php pages have already been indexed and cannot be reverted to .html.)

I guess I should put this in .htaccess:


RewriteRule ^(.*)\.html$ $1.php [PT,L]

Can I use both sets of rules as they are, or are some modifications necessary?

g1smd

11:55 pm on Jan 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You are in trouble. What you could have done in advance is simply set up your server so that you continued to use .html URLs, and either had .html files parsed for the PHP scripting within them, or implemented a rewrite so that a request for a .html URL was served from the appropriate .php file without revealing the name of that file. You would have continued to use .html URLs in the links within your site.

There are a number of options available to you now, most of which are far from optimum.

Your new RewriteRule will allow each .php file to be accessed just the same whether the requested URL ends in .html or in .php and that will make things even worse. You have introduced Duplicate Content issues.

The optimum way would be to use .html URLs for everything again (this can be done even if the files on the server have .php file names).

The second best option is to redirect .html URLs to .php URLs so that visitors following old bookmarks etc. will still be able to reach the content. Changing your URLs may cause a ranking drop lasting 3 to 6 months.

suzukik

1:10 am on Jan 24, 2009 (gmt 0)

10+ Year Member



I found the best solution is to have PHP run in the .html files, as you say:

AddType application/x-httpd-php .html

Then, to canonicalize "/index.html" to "/":


RewriteCond %{THE_REQUEST} ^.*/index\.html
RewriteRule ^(.*)index\.html$ http://www.example.com/$1 [R=301,L]

Alternatively, I can redirect all .html URLs to .php:


RedirectMatch 301 ^/(.*)\.html$ http://www.example.com/$1.php

But then canonicalization of "/index.html" (or "/index.php"?) to "/" fails because a redirect loop occurs.

I'm asking him whether he really can't revert the extension.

P.S.
Strictly speaking, the client is not mine but my friend's. I would never have let him make such a troublesome mistake.

g1smd

1:33 am on Jan 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Never mix RewriteRule and RedirectMatch in the same .htaccess file.

Use RewriteRule for both, and put the redirect first and the rewrite last. Make sure that both rules have [L] on the end.

If you are using .html URLs then make sure that those are what are in the links on your pages, and then 301 redirect .php URLs to .html URLs and force www at the same time in the redirect.

If the files are .html that's all you need to do (as well as the AddType stuff above, of course).

If the files are .php you'll need a rewrite to take .html URL requests and translate them so that the content can be served from the .php file without revealing that the content came from a .php file.
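The redirect-then-rewrite ordering described above might look like the following. This is only a sketch under stated assumptions: the .htaccess file sits in the document root, mod_rewrite is enabled, and www.example.com stands in for the real hostname. The redirect tests THE_REQUEST so that it fires only for URLs the client actually asked for, which keeps the internal rewrite from re-triggering it.

```apache
RewriteEngine On

# 1) Externally redirect direct client requests for .php URLs to the
#    matching .html URL, forcing the www hostname at the same time.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.]+\.php[?\ ]
RewriteRule ^(([^/]+/)*[^.]+)\.php$ http://www.example.com/$1.html [R=301,L]

# 2) Internally rewrite .html URL requests to the .php file, but only
#    when that .php file exists (assumes .htaccess is in the document root).
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(([^/]+/)*[^.]+)\.html$ /$1.php [L]
```

With rules along these lines, the links on the pages would keep using .html URLs and the .php file names would not be exposed to visitors.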

See the forum library for the index redirect. It's been posted hundreds of times before.

jdMorgan

4:03 pm on Jan 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd suggest:

# Externally redirect direct client requests for index.xyz to "/"
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.(html?|php)\ HTTP
RewriteRule ^(([^/]+/)*)index\.(html?|php)$ http://www.example.com/$1 [R=301,L]
#
# Externally redirect all requests for .html or .htm URLs to .php
RewriteRule ^(([^/]+/)*[^.]+)\.html?$ http://www.example.com/$1.php [R=301,L]

Jim

suzukik

3:16 am on Jan 27, 2009 (gmt 0)

10+ Year Member



Thank you again, jdMorgan.

But it doesn't work: a redirect loop occurs.
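One possible cause, offered as an assumption rather than a confirmed diagnosis: when "/" is requested, mod_dir internally looks up index.html, the .html-to-.php redirect catches that internal request and sends the client to /index.php, and the index rule then redirects /index.php back to "/". A sketch that limits the .html redirect to direct client requests, in the same way the index rule already does:

```apache
# Assumption: the loop comes from mod_dir's internal lookup of index.html,
# and the directory index file on disk is index.php.
DirectoryIndex index.php

# Only redirect .html/.htm URLs that the client actually requested.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.]+\.html?[?\ ]
RewriteRule ^(([^/]+/)*[^.]+)\.html?$ http://www.example.com/$1.php [R=301,L]
```

This would replace the unconditioned .html redirect and stay below the index rule; whether it cures the loop depends on the server's actual DirectoryIndex setup.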

Well, now I understand AddType stuff is the best solution.

Caterham

11:09 am on Jan 27, 2009 (gmt 0)

10+ Year Member



Well, now I understand AddType stuff is the best solution.

You'd want to invoke a handler via AddHandler rather than set a MIME type to get PHP parsed. Invoking handlers via AddType has been wrong since 1996 (unfortunately the PHP handler continues to accept "magic" MIME types, but this can cause unexpected behavior in pre-content-handler processing).
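As a minimal sketch of the AddHandler form, assuming PHP is loaded as an Apache module that registers the handler name shown here (CGI/FastCGI installs and some hosts use a different name, so check the server's own configuration):

```apache
# Assumption: the installed PHP module answers to this handler name.
# Parse .html files as PHP by assigning a handler, not a MIME type.
AddHandler application/x-httpd-php .html
```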