.htaccess:
RewriteEngine on
RewriteBase /
RewriteRule ^/?test\.php/([a-zA-Z_]+)$ test?city=$1 [L]
test.php:
<?php
echo "<p>City = {$_GET['city']}</p>";
echo $_SERVER['REQUEST_URI'];
?>
uri input = uri output:
http://www.example.com/test.php/phoenix
browser output:
City = phoenix
/test.php/phoenix
The above works as expected. Now add content negotiation (line 2 of .htaccess) and change the rule to use extensionless URLs, and things stop working:
.htaccess:
RewriteEngine on
Options +multiviews
RewriteBase /
RewriteRule ^/?test/([a-zA-Z_]+)$ test?city=$1 [L]
test.php:
<?php
echo "<p>City = {$_GET['city']}</p>";
echo $_SERVER['REQUEST_URI'];
?>
uri input = uri output:
http://www.example.com/test/phoenix
browser output:
City =
/test/phoenix
I'm not sure why this is. What baffles me is that I'm echoing REQUEST_URI in both cases, and both are just what I expect: one with an extension and the other without. Really, this file is just a test on the way to bigger and better things: getting "my" http -> https .htaccess file working without extensions (it currently doesn't). See this post for jdMorgan's excellent advice and thorough details:
And as a side note, jdMorgan mentions a different approach to extensionless urls which is here:
jdMorgan, what be thine take on content negotiation/multiviews? Is there a reason you don't suggest it in the above post? (Perhaps you don't because of the problems I'm having now!)
Looking forward to your advice as you are perhaps the most well known mod_rewrite ninja on this site!
Questions:
Why do you need MultiViews?
If you really need them, have you looked at the content-negotiation configuration to be sure you've excluded URLs that must/should be handled by mod_rewrite? (This would have to be done using a <Directory> container in httpd.conf or conf.d to limit Options MultiViews to certain directories only.)
Have you considered the feasibility of replacing MultiViews' function by more-selective mod_rewrites, or by handling such functions in your script(s)? (The "file exists" checking available using mod_rewrite's RewriteCond directive can be particularly handy for this.)
I personally don't like using MultiViews because it can create duplicate-content problems (the same content accessible via multiple URLs), thus posing a potentially-big problem for search optimization/ranking.
Jim
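For reference, a minimal sketch of the test case above with MultiViews switched off and the rule pointed at the real file. It assumes AllowOverride permits Options in .htaccess and that test.php sits in the document root. The explicit test.php target is an assumption for illustration, not something taken from the posts above.

```apache
# Turn content negotiation off so only mod_rewrite maps the URL
Options -MultiViews
RewriteEngine on
RewriteBase /
# /test/phoenix -> /test.php?city=phoenix (internal rewrite, URL in the browser is unchanged)
RewriteRule ^test/([a-zA-Z_]+)$ test.php?city=$1 [L]
```

With this in place, requesting http://www.example.com/test/phoenix should hand "phoenix" to the script as $_GET['city'], while REQUEST_URI still reports /test/phoenix.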
The same is true for geo-location. Consider again these ethnically-diverse regions: picking a language based on geo-location is also a flawed strategy, even in Europe! Maybe the Swiss user wants French, or Dutch, or German...
The only way to reliably determine user preferences is to explicitly ask the user, within the current session, to select a language and (possibly) other preferences. That's why many multi-language sites have all those little national flags all over the front page... :)
Jim
Hearing about the possible search engine ding... well that is HUGE, and has me running back to .php extensions. I'll be converting my site now (totally serious).
Now I have EXTREME interest in the mod_rewrite and file_exists option you spoke of. I would still like to have clean urls. And it sounds like I can. Is this what the rewrite would look like?
RewriteRule ^/?([a-z]+)$ http://www.example.com/$1.php [L]
Perhaps you have covered a mod_rewrite/file-exists method in another post. I'd love to know your methodology as I want to be using best practices.
Thanks so much jdMorgan!
PS: Wait, I just realized I don't have to convert my site's current linking structure at all (which have no .php extensions). The rule automatically has them reference the real file. Freaking cool!
RewriteRule ^/?([a-z]+)$ http://www.example.com/$1.php [L]
However, I'm trying to marry that rule with the http to https rules that "we" wrote before (webmasterworld.com), and things aren't working (I modified the possible pages to not have extensions, (.*), and [R=301,L]):
RewriteCond %{REQUEST_URI} !\.(css|jpeg|js|gif|png)$ [NC]
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{REQUEST_URI} ^/?(secure|owner)$
RewriteRule ^/?([a-z]+)$ https://www.example.com/$1.php [L]
RewriteCond %{REQUEST_URI} !\.(css|jpeg|js|gif|png)$ [NC]
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} !^/?(secure|owner)$
RewriteRule ^/?([a-z]+)$ http://www.example.com/$1.php [L]
When I request http://www.example.com/secure, the browser loads the https version, but also tacks on the .php extension! (https://www.example.com/secure.php) This makes NO sense to me, because this is the EXACT standalone rule I was using to make invisible extensions.
Here is my understanding: For the RewriteRule, it is capturing the REQUEST_URI (not from the previous step, just from the actual request), which in my case is "secure" (with no leading slash and NO extension)... it should then shove that baby into the end of where it is actually pointing. Thus, pointing to http://www.example.com/secure.php [while at the same time not showing it]. But it does show the extension.
There must be an interplay between the condition and the rewrite rule. Is the conditional just b4 the rule actually giving a variable to the rule?! I thought that's what parentheses did. I mean, I thought you have to use the var on the same line, and that for conditions, you ref by % and for rules you ref by $.
I do realize there is a screwyness that jdMorgan clued me on b4... that leading forward slashes are needed in conditions, but not in rules (I think this depends on your version of apache, but regardless the two act opposite). If you can't tell, I'm thoroughly confused. But confusion comes before the storm. Wait... :o) ... b4 the storm blows over. I'm not sure what I'm saying. I can't believe how HARD mod_rewrite is! I wish you could throw in echos or something!
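On the wish for "echos": the closest thing mod_rewrite offers is its own logging, which traces each pattern test and substitution. A minimal sketch, assuming you can edit the server or virtual-host configuration (these directives are not allowed in .htaccess). The log path shown is a hypothetical example.

```apache
# Server or virtual-host config only; not valid in .htaccess.
# Apache 2.4 and later: per-module log level, output goes to the normal ErrorLog.
LogLevel alert rewrite:trace3

# Apache 2.2 and earlier used separate directives instead (hypothetical log path):
# RewriteLog "/var/log/apache2/rewrite.log"
# RewriteLogLevel 3
```

Higher levels produce a lot of output and slow the server down, so this is best left on only while debugging.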
One extra thing to think about is that having both / and /index is another Duplicate Content issue. One of those should issue a 301 redirect or a 404 error, and the other should issue the content along with "200 OK".
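One possible way to do that, sketched for a root-level .htaccess file. It assumes the index page is index.php and that "/" is the URL you want to keep. Testing THE_REQUEST (the original request line sent by the client) rather than the current URL is what keeps the internal DirectoryIndex lookup and any internal extensionless rewrite from triggering the redirect again.

```apache
# Externally redirect direct client requests for /index or /index.php to /
# THE_REQUEST looks like: GET /index.php HTTP/1.1
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/index(\.php)?[?\s]
RewriteRule ^index(\.php)?$ http://www.example.com/ [R=301,L]
```

As an external redirect, this would normally sit above any internal rewrites in the file.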
# Externally redirect HTTP requests for "/secure" and "/owner" to HTTPS (SSL),
# except for css, jpg, jpeg, js, gif, and png files
RewriteCond %{REQUEST_URI} !\.(css|jpe?g|js|gif|png)$ [NC]
RewriteCond %{REQUEST_URI} ^/(secure|owner)
RewriteCond %{SERVER_PORT} !^443$
RewriteRule (.*) https://www.example.com/$1 [R=301,L]
#
# Externally redirect HTTPS requests for all except "/secure" and "/owner" to HTTP,
# except for css, jpg, jpeg, js, gif, and png files
RewriteCond %{REQUEST_URI} !\.(css|jpe?g|js|gif|png)$ [NC]
RewriteCond %{REQUEST_URI} !^/(secure|owner)
RewriteCond %{SERVER_PORT} ^443$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
#
# Simple rule: Internally rewrite all extensionless URLs to .php
RewriteRule ^(([^/]+/)*[^./]+)$ /$1.php [L]
#
#
# More-complex-rules: Check for existing files with .php, .html, .htm, .shtml, and .shtm extensions
#
# If extensionless URL resolves to existing .php file, rewrite to .php
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(([^/]+/)*[^./]+)$ /$1.php [L]
#
# Else if extensionless URL resolves to existing .html file, rewrite to .html
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(([^/]+/)*[^./]+)$ /$1.html [L]
#
# Else if extensionless URL resolves to existing .htm file, rewrite to .htm
RewriteCond %{REQUEST_FILENAME}.htm -f
RewriteRule ^(([^/]+/)*[^./]+)$ /$1.htm [L]
#
# Else if extensionless URL resolves to existing .shtml file, rewrite to .shtml
RewriteCond %{REQUEST_FILENAME}.shtml -f
RewriteRule ^(([^/]+/)*[^./]+)$ /$1.shtml [L]
#
# Else if extensionless URL resolves to existing .shtm file, rewrite to .shtm
RewriteCond %{REQUEST_FILENAME}.shtm -f
RewriteRule ^(([^/]+/)*[^./]+)$ /$1.shtm [L]
BTW, it's really not a bad idea to leave comments in your code -- at least until you're finished working on a site.
Jim
Admittedly I haven't gone through your more complex "extensionless" rewrite... but I'm soon going to do as you suggested to another member: print it out, and highlight only the things I understand, never moving through unless I understood the beginning. And I generally use comments... I'll start posting them.
A couple of questions:
1) Was there a reason you moved the port rule down one? Performance?
2) As a rule, redirects BEFORE rewrites. Correct?
Also, regarding what g1smd wrote: great suggestion! I did a quick search and tried to implement the approach from another thread (webmasterworld.com)
...but it wasn't the quick fix I thought it may be (though it seems to be a replica of my desired outcome). When I type in http://www.example.com/index.php OR /index, it just sticks, and doesn't redirect to the root.
Perhaps I put the new rules in the wrong place (before) my http -> https rules. I'm struggling a little knowing the "stacking" order to place things in.
PS: Also looking for a way to convert example.com to www.example.com
I don't expect you guys to continue pumping out solutions for me, but I certainly won't hold you back! I'm definitely learning from your posts, and am super-appreciative of your effort.
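For the example.com to www.example.com question in the PS, a commonly used pattern looks roughly like the following. This is a sketch, assuming a root-level .htaccess file and that http://www.example.com is the canonical host; it is not taken from the replies above.

```apache
# Externally redirect requests for the bare domain to the www hostname
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

Note that this always redirects to http, so an https request for the bare domain on /secure or /owner would take a second hop through the http -> https rule. Like the other external redirects, it would normally go above the internal extensionless rewrites.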