I found these methods discussed in an old (closed) thread here, and would appreciate it if someone could explain the differences and advantages of the two redirect methods:
mod_alias:
redirectmatch 301 ^/(.*)\.html$ [yourdomain.com...]
mod_rewrite:
RewriteEngine On
RewriteBase /
RewriteRule ^/(.*)\.html$ /$1.php
Also, I have a few questions:
1). Do these methods redirect ALL html files, in all directories for the domain (public_html and all its subdirs) from one .htaccess in the main dir?
2). Do these methods only redirect html files that have a php file by the same name (e.g. foobar.html will redirect to foobar.php)?
3). If there is no php file by the same name as the html file, will these methods still allow the html to be loaded, or must there be a php file to replace the html?
I have a site that I may want to use some sort of redirect for, but I do not plan to change all the files from html to php - just the main website structure (index.html to index.php for each directory, but leaving the thousands of html content pages alone).
I do not want to parse html files as php files -- this won't help me.
I found this other method of redirect too:
Redirect 301 /index.html [afakedomain.com...]
Redirect 301 /news.html htt*p://ww*w.afakedomain.com/news.php
Is that a better way than the other two methods above?
I really need to resolve this. I've been going crazy with this.
Please help a frustrated chick out...
Thanks.
Raeba
The first method (mod_alias) is an external redirect of any URL-path that ends in .html to the same-named resource of type .php.
The second method (mod_rewrite) is an internal rewrite of any URL-path that ends in .html to the same-named resource of type .php. As such, it does not update the client browser's address bar, nor is it "visible" to search engines.
1) Yes.
2) No. The code, as posted, does not check for "file exists."
3) No. With the code as posted above, the html file will not be served; the php file must exist.
These are only a few samples of rewrites and redirects, and don't demonstrate more than a tenth of one percent of what *can* be done. It would be most productive if you would define what you need to do, and then research and ask about methods that will address your specific goals.
As I hinted at above in #2, the mod_rewrite code can be modified to check for file exists. This is done using the RewriteCond directive with %{REQUEST_FILENAME} and checking for file_exists. See the mod_rewrite documentation for details.
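A minimal sketch of that file-exists check, assuming the rules live in the .htaccess file in the document root (the pattern and target are illustrative, not a drop-in answer):

```apache
RewriteEngine On
# Capture the on-disk path of the requested .html file, minus its extension
RewriteCond %{REQUEST_FILENAME} ^(.+)\.html$
# Continue only if a same-named .php file exists on disk
RewriteCond %1.php -f
# Internally rewrite foo.html to foo.php; the URL seen by the client is unchanged
RewriteRule ^(.+)\.html$ /$1.php [L]
```

With this in place, an html file that has no php counterpart is simply served as usual.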
Our Charter [webmasterworld.com] includes links to several other resources you may find helpful as well.
Jim
Ok, here (per your wise recommendation) is my situation and what I need to accomplish.
I have an html site. There is a main structure, as with most websites, where index.html files exist in directories (e.g. /index.html /contact/index.html /news/index.html /links/index.html et cetera).
For those main pages, I have created better pages with php extensions.
Now, there are hundreds of other *.html files (those that do not begin with 'index' - not main dir files). These, as well as the main files, are well-indexed in the search engines.
I do not want to jeopardize the rank in the search engines, as much as possible (though not quite controllable I understand...).
So, my idea was to redirect the index.html files from the main structure directories to the new index.php files in those same main directories.
The other html files would remain html files, though I would edit them to change links within them to point to the new index.php files, rather than the old index.html files.
I would also go into the server's httpd.conf file and make sure that index.php is called first (for DirectoryIndex), then index.htm index.html...
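A minimal sketch of the DirectoryIndex ordering described here, assuming mod_dir is loaded. Apache tries the names from left to right and serves the first one that exists in the requested directory:

```apache
# Prefer index.php; fall back to the html variants when no php file exists
DirectoryIndex index.php index.htm index.html
```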
So, I guess I need to know what is the best way to redirect the explicit calls for the index.html files that come from the search engines as well as any directories or user bookmarks.
Does this sound like a good way to do this? [I don't want to parse html as php...]
What code would I have to put in the /public_html/.htaccess file to accomplish this -- keeping what pleases the search engines (e.g. Google) in mind?
Thanks for helping me with this. I've read dozens of posts around the net, and tried to make sense of documentation that is a bit confusing for me. The one reply that you (jdMorgan) have given me here has already helped more than anything else. I hope my clarification here is indeed clear, and that you can help me resolve this.
Big hugs,
Raeba
If you put this rule in your root .htaccess file:
RewriteRule ^(.+)\.html$ /$1.php [L]
then all requests for html files will be served php files instead, and yet it will be impossible to know this from the outside, as long as you get the MIME-type correct and the php scripts don't fail and output error messages that give it away. In short, no-one would see any change at all to your site, even though all your html pages had been replaced by PHP. So there is *no* impact on search engines if this is properly done.
I understand that you don't want to replace *all* files; the preceding was meant to emphasize what mod_rewrite is good at... changing the relationship between URLs and filenames in a transparent manner. I'll try to post some examples later (much later). In the meantime, take a look through the mod_rewrite documentation and the Rewriting Guide, and look at the RewriteCond directive and the file-testing flags that it supports (in the context of my previous hint about RewriteCond). That will take you quite a way towards your goal.
Jim
I did some prior research, thus my bringing up the code to begin with. I also took your suggestion to post the specifics regarding what I need to accomplish. [Which was for naught. Why would I go through that if I need to find out elsewhere?]
I understand that the 'charter' here is not in favor of doing all the work for someone.
However, I have posted the problem at length, and have obviously done some research.
<snip>
I suspect that it will take something like the following code:
Redirect 301 /index.html htt*p://www.foobar.com/index.php
Redirect 301 /news.html htt*p://ww*w.foobar.com/news.php
<snip>
Warm wishes and good luck to all,
Raeba
<snip>
[edited by: jdMorgan at 8:51 pm (utc) on Feb. 18, 2005]
[edit reason] Removed moderation comments per TOS #24.. [/edit]
This is an example of the rewrite rules I use on my site. Many of my html pages have been re-designed as php pages, but it is an established site and changing pages would seriously disrupt our business. So I simply rewrite the pages in htaccess with this:
RewriteRule ^index.html$ [mydomain.com...] [L]
Notice that I've placed my php documents in a php directory on my site, and that directory is banned (in theory) in my robots.txt file. I've used this rule for every page that I've changed, with no detrimental effects to my listings. One thing I avoid, and this is personal, is using wildcards and regex expressions. Why? Because I tend to get them wrong - usually - and do more damage than good. So, one file, one RewriteRule. It works very nice, and there is absolutely no slowed response - at least from the Apache side of things. Additionally, my site visitors, and bots, all think they are being served html pages - that's what shows up in the browser bar anyway. A look at my source would indicate otherwise, simply because I include the source name in my meta tags. It's how I make sure the correct source gets edited if needs be.
Now don't get me wrong, but if I were to tell you to implement some specific line in your htaccess, it may or may not work, depending on factors that I know nothing about. So in that sense you may have difficulties getting an exact answer to some questions. Deferring to the documentation, with some background reference, is just common sense.
You mentioned the question (I think) in another thread - with regards to your html files after you have applied RewriteRules. You can, in theory, ditch them. I don't, simply because my htaccess could suddenly go south (Murphy's Law). In that case, I'd still have at a minimum, my index page (html version) displayed. Again, what works for me and my peace of mind may not suit you and your setup.
HTH
Because the URL-path is localized to a directory in .htaccess per-directory context, the code can be a bit simpler if installed in httpd.conf. This is because the leading slash is stripped from URLs seen by RewriteRule in .htaccess, which can make the pattern-match on "index.html" or "index.php" in any subdirectory somewhat ambiguous. So, I'll show both examples.
This code also handles requests for "example.com/subdirectory/" where the index file of that subdirectory is desired, but not explicitly requested.
In httpd.conf:
# First, rewrite any requests for "/" in any subdirectory to index.html
RewriteRule ^(.*)/$ $1/index.html
# Then rewrite requests for /index.html in any subdirectory to index.php
RewriteRule ^(.*)/index\.html$ $1/index.php
# Finally, if the php file does not exist, then change the request back to index.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)/index\.php$ $1/index.html [L]
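One caveat on the httpd.conf version, offered as a hedged note: in per-server context mod_rewrite runs before the URL has been mapped to the filesystem, so %{REQUEST_FILENAME} initially holds the same value as %{REQUEST_URI}, and the -f test may not behave as intended. A sketch of an alternative final rule, assuming the site's files live directly under DocumentRoot:

```apache
# Assumption: files are served from DocumentRoot with no Alias in play.
# $1 is the directory part captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}$1/index.php !-f
RewriteRule ^(.*)/index\.php$ $1/index.html [L]
```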
In .htaccess:
# First, rewrite any requests for "/" in any subdirectory to index.html
RewriteCond %{REQUEST_URI} ^(.*)/$
RewriteRule .* %1/index.html
# Then rewrite requests for /index.html in any subdirectory to index.php
RewriteCond %{REQUEST_URI} ^(.*)/index\.html$
RewriteRule index\.html$ %1/index.php
# Finally, if the php file does not exist, then change request back to index.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} ^(.*)/index\.php$
RewriteRule index\.php$ %1/index.html [L]
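To illustrate the leading-slash difference mentioned above, here is a hedged sketch of one hypothetical rule (a /news/ directory is assumed purely for illustration) written for each context. Only one of the two would be used, depending on where the rule is installed:

```apache
# In httpd.conf (per-server context): the pattern sees the full URL-path,
# including the leading slash.
RewriteRule ^/news/index\.html$ /news/index.php [L]

# In the document root's .htaccess (per-directory context): the directory
# prefix, including the leading slash, has been stripped before matching.
RewriteRule ^news/index\.html$ /news/index.php [L]
```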
Jim
Thank you very much for the thoughtful response. I want the browser to show the actual index.php file. I don't want it to appear that the file is an index.html file.
I want to redirect the existing index.html traffic to the actual index.php file, in each subdir. I will also have html files remaining (non-index files) that I want to remain as html (and not be redirected).
Warm Regards,
Raeba
Thanks for the very helpful response. I especially like the httpd.conf way that you have presented.
You mentioned that the mod_alias method could not be used if the replacement file exists.
I plan to have a index.php for each of the subdirs. So, I'm just curious, would the following mod_alias (below) work in that case?
Redirect 301 /index.html [afakedomain.com...]
Redirect 301 /news.html htt*p://ww*w.afakedomain.com/news.php
I'm just wondering about the search engines' take on all of this. I've read that they prefer 301 redirects... What's your take on this?
I really love the httpd.conf code (below) that you've shared. I especially like that it takes into account whether or not the file exists (though I will have an index.php in all dirs anyway).
In httpd.conf:
# First, rewrite any requests for "/" in any subdirectory to index.html
RewriteRule ^(.*)/$ $1/index.html
# Then rewrite requests for /index.html in any subdirectory to index.php
RewriteRule ^(.*)/index\.html$ $1/index.php
# Finally, if the php file does not exist, then change the request back to index.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)/index\.php$ $1/index.html [L]
How do the search engines deal with this? Does that httpd.conf code (above) have the same effect as doing a 301 redirect (to keep Google, et al, happy)?
Thanks.
Warm Regards,
Raeba
I want the browser to show the actual index.php file. I don't want it to appear that the file is an index.html file.
OK, but that will affect your SERPs for the index page. Basically, your index page will have to start all over again. Use the example I showed you, but add a 301 response code to it. The 301 will tell browsers and bots that your page has permanently moved -> Over Here.
I want to redirect the existing index.html traffic to the actual index.php file, in each subdir.
This could be more complex than I'm reading it. But if you have an index.html in subdir search, then in that subdir you would have an htaccess file.
RewriteEngine on
RewriteBase /search
http://search.mydomain.com/index.html would be redirected as
RewriteRule ^index\.html$ http://search.mydomain.com/index.php [R=301,L]
The difference is the use of RewriteBase.
I will also have html files remaining that I want to remain as html (and not be redirected).
No need to do anything with those, if I'm understanding you correctly.
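For the redirect case, a single rule in the root .htaccess might cover every subdirectory instead of one .htaccess per subdir. This is only a sketch, assuming the site's files sit directly under DocumentRoot and that mod_rewrite is enabled there; the file-exists test is an added safeguard, not something the rules above include:

```apache
RewriteEngine On
# Redirect only when an index.php actually exists beside the requested index.html
RewriteCond %{DOCUMENT_ROOT}/$1index.php -f
# $1 is the directory part: empty for the site root, or e.g. "news/"
RewriteRule ^(.*/)?index\.html$ /$1index.php [R=301,L]
```

Requests for other html files (news.html and so on) do not match the pattern and are served unchanged.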
Don't you think the httpd.conf method posted by jdMorgan as follows will do the trick?
In httpd.conf:
# First, rewrite any requests for "/" in any subdirectory to index.html
RewriteRule ^(.*)/$ $1/index.html
# Then rewrite requests for /index.html in any subdirectory to index.php
RewriteRule ^(.*)/index\.html$ $1/index.php
# Finally, if the php file does not exist, then change the request back to index.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)/index\.php$ $1/index.html [L]
I generally prefer doing things in the httpd.conf file, as the .htaccess seems to be the alternative used by those who do not have access to the httpd.conf. Though, I am eager to use whatever will meet my goals as follows:
- Make the search engines happy -- retaining validity for the links they have for my index.html files.
- Keep both index.html and index.php files in my directories -- though redirecting (SEO friendly) index.html calls to index.php.
- Leaving the rest of my *.html files in the directories as they are (because there will not be any *.php files to replace them). [eg. news.html will remain, and never have a news.php to replace it.]
I really like the httpd.conf code that jdMorgan provided in his previous post, but am not sure if it is SEO-friendly, and I'm waiting to hear about that. [?]
I'd also like to know if I should change the DirectoryIndex line in the httpd.conf file to order the index.php first (before index.html) [?]
Thanks for helping me through this. I hope I am near a resolution.
Raeba
It is absolutely the most SEO-friendly solution, because it requires no changes to your URLs, and no other changes whatsoever to your site. The SE's won't know that anything has changed, except that they will see your new php-generated page content when requesting index.html.
If you do a redirect instead of the proposed rewrite, then as Grandpa says, you may temporarily lose the rankings of the old html pages, and you risk restarting the clock on SE's indexing all those pages and updating their backlinks. Unless you are dead-set on showing your .php file extensions, there is no reason to do a redirect.
Let's define those terms:
A redirect terminates the current client HTTP request by responding with a 301 or 302 response code accompanied by a new URL. The client must issue a new HTTP request using that given URL in order to get the originally-requested resource. This causes the new URL to show up in the browser address bar, and informs search engine spiders that the resource has moved and that their URL database needs to be updated.
A rewrite simply changes the filename on the server that is associated with a requested URL. It is totally 'silent' in its operation as viewed from outside the server.
The way the rewrite code works is that the client browser or SE asks for index.html, you serve them the contents produced by index.php, and they are none the wiser. As stated above, there is no required relationship between a requested URL and the filepath of the resource that provides contents for that URL. This technique is used to good effect on script-based sites (such as WebmasterWorld) and others to improve search access to pages, because they appear to be static.
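To put the two definitions side by side, here is a hedged sketch for a root .htaccess. The two rules are alternatives and would not be used together:

```apache
# Redirect: the client receives a 301 with the new URL and must request
# /index.php itself, so the address bar changes.
RewriteRule ^index\.html$ /index.php [R=301,L]

# Rewrite: the client keeps requesting /index.html; the server silently
# serves the output of index.php under that URL.
RewriteRule ^index\.html$ /index.php [L]
```

The only difference in the directive is the R=301 flag; everything else about what the client sees follows from it.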
Jim
It is absolutely the most SEO-friendly solution, because it requires no changes to your URLs, and no other changes whatsoever to your site. The SE's won't know that anything has changed, except that they will see your new php-generated page content when requesting index.html.
So, does that mean that an index.html filename will still show in the browser (rather than my new index.php)?
If so, would it only show the index.html filename when the index.html is explicitly called? If so, then does that mean that the index.php will show when it is explicitly called or when a plain dir is called (with no filename given by browser) -- or will the index.html always show no matter what?
Oh, and I just thought of something very important! There are more domains on this server than the one that I want to do this for. Will this code break anything for the other domains? I suspect only if an index.html and an index.php reside in the same dir, correct?
If you do a redirect instead of the proposed rewrite, then as Grandpa says, you may temporarily lose the rankings of the old html pages, and you risk restarting the clock on SE's indexing all those pages and updating their backlinks. Unless you are dead-set on showing your .php file extensions, there is no reason to do a redirect.
Yeah, I don't want to mess up the ranking, at least as much as possible.
I did plan, however, to eventually eliminate the index.html files for the dirs, so I thought that I would have to do a redirect.
Thanks.
Raeba