
Apache Web Server Forum

Apache Rewrite to remove file extensions
Using Apache rewrite to remove file extensions and how Google will react

 10:25 pm on Feb 16, 2012 (gmt 0)

If I use Apache Rewrite to remove the file extensions from my pages, will Google view those pages as new? Do I also have to 301 redirect from "page-name.htm" to "page-name"?

I'd like to change my pages from html to php to make my life a little easier, but I'm worried about losing link juice if I were to 301 every page.




 10:43 pm on Feb 16, 2012 (gmt 0)

Yes, you need to redirect from the old URL to the new URL as well as rewrite from the new URL to the internal filepath. Both of those actions use RewriteRules.

Do read the HTTP specs. Folders end with a slash. Pages do not end with a slash, but can have an optional extension. Don't change to a different extension in your URLs, remove the extension from your page URLs.

The files on the server need to have an extension in order for the server to know what to do with them. That extension does not need to appear in the URL used to access that resource.

See also: [webmasterworld.com...] from yesterday.


 12:59 am on Feb 17, 2012 (gmt 0)

As an added bonus, I would also include the canonical tag to help the SEs sort it out:
<link rel="canonical" href="http://www.example.com/yesthisisreallythepagename" >

FWIW, there's no reason to expose to the world that you're using PHP; it's a security risk IMO.

I would either go extensionless or leave it as .html. A .html URL doesn't mean it isn't PHP (or any other pre-processor, for that matter) behind the scenes; it just means you're sending HTML as the end result.

Another thing to look out for is the "x-powered-by" header which PHP likes to include. I would zap that as well, so that crawlers looking for buggy versions of PHP with known vulnerabilities can't easily find yours.
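
One way to zap that header from the Apache side, as a minimal sketch. It assumes mod_headers is loaded and that .htaccess overrides are allowed; neither is stated in this thread.

```apache
# Assumes mod_headers is available; the IfModule guard means the
# directive is simply skipped if the module is not loaded.
<IfModule mod_headers.c>
    # Strip the X-Powered-By response header that PHP adds by default.
    Header unset X-Powered-By
</IfModule>
```

Depending on how PHP is hooked into Apache, "Header always unset X-Powered-By" may be needed instead. If you control php.ini, setting expose_php = Off stops PHP from sending the header in the first place.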


 1:08 am on Feb 17, 2012 (gmt 0)

In different words: A page's "real" name-- that is, the name it has on your server-- does not need to have anything to do with the URL that people use when going to the page.

In theory, you could make your pages php and perform some jiggery-pokery behind the scenes to call them html. Or vice versa. In theory, you could have all your pages masquerading as folders by putting a / at the end of the name. Or vice versa.

Do Not Do This! :) The point is simply that it can be done.

Changing from html to extensionless is a little more work than changing from html to php-- at least until next year, when you want to change from php to something else. That's when it starts paying off.


 1:31 am on Feb 17, 2012 (gmt 0)

I find extensionless much easier, and these two RegEx patterns are extremely useful:

^([^/.]+)$ and ^(([^/]+/)+[^/.]+)$

They totally eliminate the need for the very inefficient -f and -d "exists" checks so beloved of the many CMS designers who appear to have read only the first three or four pages of the Apache manual and none of the HTTP specifications. :)
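
As a rough sketch of how those two patterns might be used. It assumes a root-level .htaccess, mod_rewrite enabled, and .php files sitting at the matching paths; these are assumptions, not details from this thread.

```apache
RewriteEngine On

# Top-level extensionless URL, e.g. /page-name -> /page-name.php
# [^/.]+ matches only a name containing no slash and no dot.
RewriteRule ^([^/.]+)$ /$1.php [L]

# Extensionless URL inside one or more folders,
# e.g. /folder/page-name -> /folder/page-name.php
RewriteRule ^(([^/]+/)+[^/.]+)$ /$1.php [L]
```

Because the rewritten path ends in a name that contains a dot, neither pattern matches it on the next pass, so there is no loop and no -f check is needed. The flip side is that any request whose last segment has no dot and no trailing slash gets sent to a .php file, which is why folder URLs have to end with a slash.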


 3:10 am on Feb 17, 2012 (gmt 0)

Thanks for all of the advice everyone!

What if I created new php pages with the same file names as the old .htm pages and set up Apache to disguise them as .htm? Then I wouldn't have to do any redirects, correct? I'd be fooling everyone then, right?


 3:16 am on Feb 17, 2012 (gmt 0)

As much as I'd love to go extensionless, I'd hate to have to do a bunch of 301s and lose some page cred.


 4:34 am on Feb 17, 2012 (gmt 0)

If you're phrasing that as a simple question of fact: yes, you can easily rewrite .html to .php. They don't have to have the same filename; in fact you might want to change them so you don't get the files mixed up. There's no difference between rewriting an extension and rewriting a whole URL, so long as it's in the same domain.


 10:36 am on Feb 17, 2012 (gmt 0)

Search engines work with URLs out there on the web. They have no idea what the files on your server are called, nor where in the server filesystem they reside.

You can match any URL request to any file inside the server by altering the server configuration. So, yes, you can change the names of the files on the hard drive but still keep the same URLs for those pages.

On the pages of your site, link to the URLs that you want people to "see" and "use". URLs are defined by what is in the href part of a link, not by what the files on the hard drive of the server are called.

You could have a site where .html URLs are served by files with the same name and same .html extension. At some time in the future you could change all the files on the server to have a .php extension and continue to link to and use .html URLs to access those pages. For this to work you have to set up an internal rewrite so that when a .html URL is requested the server actually fetches a .php file (but keeps it a secret that it has done so).
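
A minimal sketch of that internal rewrite. It assumes a root-level .htaccess with mod_rewrite enabled and a .php file for every .html URL; the "l" is made optional because the original pages in this thread use .htm.

```apache
RewriteEngine On

# Internal rewrite only: there is no R flag, so the browser and the
# search engines keep seeing /page-name.html while Apache quietly
# serves /page-name.php from the filesystem.
RewriteRule ^(.+)\.html?$ /$1.php [L]
```

With this in place, any static .htm or .html file left on the server is no longer reachable at its own URL, so this sketch only suits a site where every page has been converted.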

A lot of people make the mistake of changing the URLs from .html to .php and lose all their rankings and traffic for weeks or months. This happens because they effectively have a new site. Some soon realise their error and hurriedly set up an external redirect so that people asking for the old .html URLs are redirected to the new .php URLs. This at least retains the traffic asking for the old .html URLs. However, there was no need for either of those steps. Had they used a rewrite, the site could have continued using the old .html URLs while .php files inside the server filesystem served the content.

I would recommend staying with the old .html URLs and using a rewrite to serve content from the new .php files. If you are contemplating changing the URLs from .html to .php then instead make your life much easier and go for extensionless URLs. This would require two steps: a redirect from .html URLs to extensionless URLs and a rewrite from extensionless URLs to .php internal files. Both of these things use a RewriteRule but the syntax is slightly different for each. To ensure that the .php files can't be accessed directly you would also block access to .php URLs or else redirect requests for .php URLs to the equivalent extensionless URLs.
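
The two steps for the extensionless route could be sketched roughly as follows. This assumes a root-level .htaccess, mod_rewrite enabled, and www.example.com as a stand-in hostname; it is an illustration under those assumptions, not tested code from a previous thread.

```apache
RewriteEngine On

# Step 1: external redirect. THE_REQUEST holds the original request
# line sent by the client (e.g. "GET /page-name.html HTTP/1.1"), so
# this fires only when the client itself asked for a .htm, .html or
# .php URL, never after the internal rewrite in step 2.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]+)\.(html?|php)[?\s]
RewriteRule \.(html?|php)$ http://www.example.com/%1 [R=301,L]

# Step 2: internal rewrite from the extensionless URL to the .php
# file. No R flag, so the URL the visitor sees does not change.
RewriteRule ^([^/.]+)$ /$1.php [L]
RewriteRule ^(([^/]+/)+[^/.]+)$ /$1.php [L]
```

The redirect comes first so that requests for old URLs are sent to the new URL before anything is rewritten. Index pages (for example /index.html, which would more sensibly redirect to /) are not handled here and would need a rule of their own.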

As this topic comes up several times per month, there are very many previous threads with example code.


 5:55 pm on Feb 18, 2012 (gmt 0)

Thanks for the info! This has been a great help and given me some things to think about.

