Hiding php extension on url causes server misconfiguration error

         

jorg

6:47 am on May 10, 2009 (gmt 0)

10+ Year Member



First, I apologize for bypassing the email filter on the registration. I'm new to running my own server, and I haven't yet set up my mail server.

I'm trying to make my URLs look as clean as possible. The end result I want is
http://example.com/page
instead of
http://www.example.com/page.php

I've already removed the www dot. Now I just need to hide the php extension on my pages.

Through htaccess this is what I'm trying to use:

RewriteEngine on
# Remove the www dot
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

# Everything above works fine
# Something below causes a server misconfiguration error
# Remove the dot php
RewriteBase /
RewriteRule ^([^.]+)\.php$ $1

Unfortunately when I do that, I get a server misconfiguration error. Am I doing something wrong, or is there a php or apache setting I need to change to allow this?

I'm on Ubuntu 8.10 with Apache 2.2.9 and PHP 5.2.6. Any help is appreciated.
Thank you.

[edited by: jdMorgan at 1:19 pm (utc) on May 12, 2009]
[edit reason] example.com [/edit]

g1smd

8:31 am on May 10, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your rule is exactly backwards.

You need the extensionless URL path part in your pattern (i.e. without the .php part), then rewrite that to the internal filename as /$1.php.
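
As a rough illustration of that direction (extensionless URL in the pattern, .php file as the target), a minimal sketch might look like the following. It assumes the .htaccess file sits in the document root; the file-exists check is an addition for safety and is not something stated in this thread.

```apache
RewriteEngine On
RewriteBase /

# Assumption: this .htaccess lives in the document root.
# Only rewrite when a matching .php file really exists on disk,
# so requests for missing pages still return 404.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
# Pattern: a URL-path containing no dot (i.e. no extension).
# Target: the internal file with .php appended. No [R] flag, so this
# is an internal rewrite and the visitor's address bar does not change.
RewriteRule ^([^.]+)$ /$1.php [L]
```

With something like this in place, a request for /page would be served by /page.php, while the URL the visitor sees stays /page.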

*** Now I just need to hide the php extension on my pages. ***

Be aware that RewriteRule when used for rewriting does not 'make' URLs. It merely accepts URL requests from the web and matches them to internal filenames. That is, if you wish users to 'see' and 'use' extensionless URLs, you need to use those extensionless URLs in the links in the pages of your site. It is links that 'define' URLs.

jorg

8:57 am on May 10, 2009 (gmt 0)

10+ Year Member



Thank you for your response.

I misunderstood the purpose of RewriteRule. I can already load pages without the php extension, but I would like the extension to be automatically removed. That way anyone that has bookmarked my page with the php extension will be directed to the url without the extension.

If the page displays both with and without the extension, can search engines mistake that as duplicate content? And will they index the pages both with and without the extension?

I would make it so any filename without an extension is parsed for php, then rename all my pages without the php extension. But I want people to still be able to access the page if they typed it with the extension (for example people who have bookmarked pages).

Any suggestions are welcome, and thanks.

g1smd

9:04 am on May 10, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ah, once you have the rewrite working (which matches the URL /xyz in an external request to the file /xyz.php on the server), you then also need to add a separate redirect so that anyone asking for the URL /xyz.php is redirected to ask for a new URL of www.example.com/xyz instead. That redirect must be a 301 redirect, must use RewriteRule (do not use plain Redirect or RedirectMatch here) with the [R=301,L] flags, and must force the canonical domain within the redirect. The redirect must be listed before the rewrite.

So there are two steps:
- redirect wrong URL to right URL
- take right URL request and match it to the internal file.

There are a very large number of previous examples of the complete process, including the steps you need to take to prevent an infinite loop.

As you have suspected, the additional redirect step is critical in preventing Duplicate Content issues.

*** I would make it so any filename without an extension is parsed for php, then rename all my pages without the php extension. ***

Be aware that the physical files on your server MUST still have the .php extension, otherwise the server will not know to handle them as PHP scripts. It is the URLs that no longer feature an extension.

What the rewrite does is match a URL to a file. URLs are used 'out on the web'. Files are used 'inside the server'. These two things are merely 'related' or 'associated'. In this case you are associating the URL /xyz with the file /xyz.php and so on.

Usually, a request for the URL www.example.com/robots.txt will directly pull the file /robots.txt from inside the server. With a rewrite, it would pull some other file without revealing to the user 'out on the web' what the real internal name of that file actually is.
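
To make the URL-versus-file distinction concrete, here is a small sketch of such a rewrite. The target filename /robots-generated.php is purely hypothetical and chosen for illustration; nothing in this thread uses it.

```apache
RewriteEngine On

# The URL requested 'out on the web' is /robots.txt.
# The file used 'inside the server' is a different one (hypothetical name).
# Because there is no [R] flag, the client never learns the internal filename.
RewriteRule ^robots\.txt$ /robots-generated.php [L]
```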

jorg

9:21 am on May 10, 2009 (gmt 0)

10+ Year Member



Thank you very much for your help. I appreciate the additional information. I think you've explained everything I need. Unfortunately it's 5am and I really need to get to bed. I look forward to getting it set up properly tomorrow evening.

Thanks again.

jdMorgan

2:28 pm on May 10, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The problem is that your new rule exactly countermands the existing mechanism that internally rewrites extensionless URLs to PHP filepaths. Therefore you get an 'infinite' rewrite/redirect loop which continues until either the server or the client gives up. This problem should be obvious in your server error log.

The solution is to redirect only direct client requests via HTTP for PHP files, instead of also redirecting internal .php-path requests arising from the action of your internal rewrite mechanism:


RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^.#?\ ]+\.php([#?][^\ ]*)?\ HTTP/
RewriteRule ^([^.]+)\.php$ http://example.com/$1 [R=301,L]

Here, the RewriteCond examines the original client request, exactly as it appears in your raw server access log, and allows the rewriterule to execute only if the request for the .php URL-path was received from the client.
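
Putting the pieces from this thread together, one possible ordering is sketched below: external redirects first, internal rewrite last. This is an assumption-laden example rather than the exact configuration anyone here ended up with. It assumes the .htaccess file is in the document root and that example.com is the canonical hostname; the file-exists check in step 3 is an addition not given in the thread.

```apache
RewriteEngine On
RewriteBase /

# 1. Redirect direct client requests for /name.php to /name on the
#    canonical host. THE_REQUEST holds the original request line, so this
#    does not fire for the internal /name.php path produced by step 3.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^.#?\ ]+\.php([#?][^\ ]*)?\ HTTP/
RewriteRule ^([^.]+)\.php$ http://example.com/$1 [R=301,L]

# 2. Redirect any other hostname (such as www.example.com) to example.com.
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

# 3. Internally rewrite an extensionless URL-path to the .php file,
#    but only if that file exists (assumed safeguard).
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+)$ /$1.php [L]
```

Listing the .php redirect before the hostname redirect means a request for www.example.com/page.php should reach example.com/page in a single 301 rather than two chained ones.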

Jim

jorg

5:40 am on May 11, 2009 (gmt 0)

10+ Year Member



You guys are great. I have it working perfectly now. Thanks. :)

g1smd

7:59 am on May 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Make sure you test it with www and non-www URLs, with and without extra parameters, maybe even http and https to make sure that every combination is covered. Using Live HTTP Headers you should see either a single step 301 to the canonical URL then '200 OK', or else directly see either '200 OK' or '404 Not Found'.

jorg

7:12 pm on May 11, 2009 (gmt 0)

10+ Year Member



I now have URLs set up how I wanted. <snip>

http://www.example.org/whois.php now redirects to http://example.org/whois, which I like as it's shorter and looks cleaner. As you'll see, I kept my site design simple and clean like my URLs.

I'm new to running my own server. The main purpose of my site is to host and share my files online conveniently. Recently I decided I wanted to make it an actual site for the public with some useful tools on it. I'd like it to be Search Engine friendly, which is why I included MetaTags and why I came to WebmasterWorld.

My site is valid XHTML 1.0 Transitional

I'm running my own nameservers (just because I wanted to learn how, and it looks nice in a whois), ns1.example.org, ns2.example.org. I haven't set up the rdns stuff yet.

I have a meta refresh to redirect 404 pages. I still have to set up my mail server. I'm having fun learning to set up and run my own server, as well as learning PHP at the same time.

<snip>

Thanks.

[edited by: jdMorgan at 7:39 pm (utc) on May 12, 2009]
[edit reason] Removed specifics & review request. Please see TOS. [/edit]