Forum Moderators: phranque

File URI rewrite - .htaccess rules suddenly causing server error

mod-rewrite, .htaccess, URL-rewrite

         

chankirtan

4:27 pm on Feb 25, 2014 (gmt 0)

10+ Year Member Top Contributors Of The Month



Can anyone help identify the flaw or incompatibility in the .htaccess code below that is causing a server error?

In the process of changing over from a static to a dynamic website, I had to keep some static files with the .html extension, while others were replaced by files with the .php extension. Furthermore, I wanted visitors to be able to access any file by file name alone, without the extension.

OBJECTIVE: To cause the browser to return file-name.php, if it exists; else return file-name.html - whether the visitor has typed the url as any one of the following:

1. 'http://mydomain.com/file-name'
2. 'http://mydomain.com/file-name.html'
3. 'http://mydomain.com/file-name.php'

The following code - implemented some years ago, and I've lost track of the source - has worked perfectly all this while, until today.

# BACKWARD COMPATIBILITY RULESET
# FOR REWRITING FILE URI TO file.php IF EXISTS
Options Indexes +FollowSymLinks +MultiViews
Options +ExecCGI
RewriteEngine on
RewriteBase /
# parse out basename, but remember the fact
RewriteRule ^(.*).html$ $1 [C,E=WasHTML:yes]
# rewrite to document.php if exists
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php [S=1]
# else reverse the previous basename cutout
RewriteCond %{ENV:WasHTML} ^yes$
RewriteRule ^(.*)$ $1.html


All of a sudden, this block of code is causing a server error. The website is on shared hosting, and I do not have shell access. The server was recently upgraded, but right now my account is running on PHP 5.3.28, Apache 2.4.7. Even after the server maintenance last week, everything was still okay up to yesterday; only today has it ceased to work. I renamed the .htaccess file, created a fresh file, and pasted in fresh code, but sure enough, this bit of code is causing the server to choke. Can anyone point out what might be the problem?
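
One thing worth checking, offered as a guess rather than a diagnosis: Apache 2.4 rejects an Options line that mixes prefixed and unprefixed keywords, as in "Options Indexes +FollowSymLinks +MultiViews", and in .htaccess that rejection surfaces as a 500 error. A minimal sketch of the same header with every keyword prefixed follows. It assumes the host permits Options overrides in .htaccess at all, which not every shared host does.

# Sketch only: same options as the original, every keyword carrying a + prefix
# (Apache 2.4 wants all keywords on an Options line prefixed, or none of them)
Options +Indexes +FollowSymLinks +MultiViews +ExecCGI
RewriteEngine on
RewriteBase /

If the error persists, commenting out the Options lines one at a time should show whether a particular option is the one being refused.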

lucy24

12:57 pm on Feb 28, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The first has no space for a filename.

Whoops!

It may not be the only typo in this thread: SOMETHING caused it to come through in "Chinese Big5" encoding (which has the interesting side effect of causing anything in [code] tags to display as bold sans-serif instead of the usual plain Courier).

the wrong pipe character

Possibly related to file encoding, because I was pasting what I saw.

:: wandering off to find a moderator ::

g1smd

7:41 am on Mar 1, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A couple of typos have been fixed in post 4649849.

The additional unwanted slash has been removed from the first example.

The missing slash has been added to the second example.

phranque

9:46 am on Mar 3, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



[mod's note] - fixed some of the typos in the RewriteCond in chankirtan's previous post:
http://www.webmasterworld.com/apache/4648950.htm#msg4649928

Daitya

6:48 am on Mar 24, 2014 (gmt 0)

10+ Year Member



Hi, I'm back, though I've logged in this time with an older account under a different username. Sorry to take so long, but too many things going on.

I've implemented the redirect to extensionless URLs and the rewrite to the physical .php or .html files where they exist. This is what I put in the .htaccess file:

# REDIRECT .PHP AND .HTML REQUESTS TO EXTENSIONLESS - see Rules 1, 2 and 3
# AND -F TEST TO MATCH REQUEST TO PHYSICAL .PHP AND .HTML
# 1 (per g1smd at Apache Web Server Forum, http://www.webmasterworld.com/apache/4648950.htm)
# PHP and HTML EXTENSIONLESS REDIRECT
RewriteCond %{THE_REQUEST} ^[A-Z]{3-9}\ /([^/]+/)*[^/.]+\.(php|html)
RewriteRule ^(([^/]+/)*[^/.]+)\.(php|html)$ http://example.com/$1 [R=301,L]

# 2
# Canonical Redirect
# Redirect from www.example.com to example.com
RewriteCond %{HTTP_HOST} !^(example\.com)$
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

# 3 (per lucy24 at Apache Web Server Forum, http://www.webmasterworld.com/apache/4648950.htm)
# SERVE CONTENT FROM .PHP OR .HTML FILES REQUESTED AS EXTENSIONLESS FILES
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^([^.]+)$ /$1.php [L]
RewriteCond ${REQUEST_FILENAME}\.html -f
RewriteRule ^([^.]+)$ /$1.html [L]


However, the old links for .html files that were upgraded to .php files now return 404, whereas the previous chunk of code somehow redirected the request for foo.html to find foo.php when foo.html was not available. Is there a way to implement this without causing the duplicate content problem?
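
Two tokens in the rules as posted look like typos, and they may account for the symptoms. "{3-9}" is not a regex quantifier (the range form is "{3,9}"), so the condition in rule 1 cannot match a real request line. "${REQUEST_FILENAME}" is not a server-variable reference (those start with "%{"), so the .html test in rule 3 cannot succeed. A sketch of rules 1 and 3 with only those two tokens changed, assuming the .htaccess file sits in the document root and example.com stands in for the real host:

# 1 - external redirect of .php/.html requests to the extensionless URL
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*[^/.]+\.(php|html)
RewriteRule ^(([^/]+/)*[^/.]+)\.(php|html)$ http://example.com/$1 [R=301,L]

# 3 - internal rewrite of extensionless requests to the physical file
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^([^.]+)$ /$1.php [L]
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^([^.]+)$ /$1.html [L]

With rule 1 matching, an old link to foo.html should be redirected to /foo and then served from foo.php by rule 3, so a separate .html-to-.php rule may not be needed.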

lucy24

9:11 am on Mar 24, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hey, I remember this thread. (I've tried to forget, har har.)
"somehow redirected the request for foo.html to find foo.php when foo.html was not available"

Redirected or rewrote? Your current batch of rules looks as if you're redirecting everything to extensionless. So it shouldn't matter whether at some time in the historical past they were html or php. Is there more to the htaccess? Maybe some files that you're not redirecting? I know there's a reason this thread is in its second page.

The simplistic form looks like

:: racking brains ::

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+)\.html$ /$1.php [L]


This form has some collateral damage in the form of bad requests being written to equally nonexistent files, like when the googlebot requests "ajkdljsgfk.html" to check for soft 404s. The user won't notice anything; only your server will.
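
One way to limit that collateral damage is to test for the .php file before rewriting. A minimal sketch, assuming the .htaccess file is in the document root and that %{DOCUMENT_ROOT} resolves to the real filesystem path (on some shared hosts it does not):

# Rewrite foo.html to foo.php only when foo.html is absent and foo.php exists
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+)\.html$ /$1.php [L]

The $1 in the second condition refers to the group captured by the RewriteRule pattern, which mod_rewrite matches before it evaluates the conditions. A request such as "ajkdljsgfk.html" then falls through to a normal 404.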

Daitya

10:56 am on Mar 24, 2014 (gmt 0)

10+ Year Member



Yeah, Lucy, everything is redirected to extensionless. However, trying to match up a request for foo.html to foo.php is the problem. So you're suggesting I add this to what is already in place?

So I will end up with:

# REDIRECT .PHP AND .HTML REQUESTS TO EXTENSIONLESS - see Rules 1, 2 and 3
# AND -F TEST TO MATCH REQUEST TO PHYSICAL .PHP AND .HTML
# 1 (per g1smd at Apache Web Server Forum, http://www.webmasterworld.com/apache/4648950.htm)
# PHP and HTML EXTENSIONLESS REDIRECT
RewriteCond %{THE_REQUEST} ^[A-Z]{3-9}\ /([^/]+/)*[^/.]+\.(php|html)
RewriteRule ^(([^/]+/)*[^/.]+)\.(php|html)$ http://example.com/$1 [R=301,L]

# 2
# Canonical Redirect
# Redirect from www.example.com to example.com
RewriteCond %{HTTP_HOST} !^(example\.com)$
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

# 3 (per lucy24 at Apache Web Server Forum, http://www.webmasterworld.com/apache/4648950.htm)
# SERVE CONTENT FROM .PHP OR .HTML FILES REQUESTED AS EXTENSIONLESS FILES
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^([^.]+)$ /$1.php [L]
RewriteCond ${REQUEST_FILENAME}\.html -f
RewriteRule ^([^.]+)$ /$1.html [L]

# 4
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+)\.html$ /$1.php [L]


Is this what you mean?

lucy24

3:46 pm on Mar 24, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well, it's what I meant theoretically-- but in your situation it seems utterly unnecessary. By the time a request reaches position #4, there will no longer be any (external) requests for .html, because they've already been redirected to extensionless. That's why I asked if there are other rules you're not showing us. How does a request for .html even get that far?

Are the 404 problems showing up consistently, not just in one browser or one testing environment?

Daitya

4:04 pm on Mar 24, 2014 (gmt 0)

10+ Year Member



I'm afraid so. Consistently.

http://example.com/quux/foo is returned as http://example.com/quux/foo
http://example.com/quux/foo.php is returned as http://example.com/quux/foo.php

And http://example.com/quux/foo.html is returned as 404 (when the html file no longer exists, even though there is a foo.php).

I'll put the .htaccess rules up on pastebin and post the link here. BRB.

lucy24

4:10 am on Mar 25, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



http://example.com/quux/foo is returned as http://example.com/quux/foo
http://example.com/quux/foo.php is returned as http://example.com/quux/foo.php

"is returned as" = shows the content of, or redirects to (change in address bar)?
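
For reference, the two behaviours look like this in mod_rewrite terms. The page names are hypothetical placeholders.

# Redirect: the server answers with a 301 and a new URL, so the address bar changes
RewriteRule ^old-page$ http://example.com/new-page [R=301,L]

# Rewrite: the server silently serves a different file, so the address bar stays the same
RewriteRule ^new-page$ /new-page.php [L]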

Daitya

7:21 am on Mar 25, 2014 (gmt 0)

10+ Year Member



Both. The browser returns the same content for both requests, but the url shown in the browser reflects the url typed in the browser or link hit.

So if the link is http://example.com/quux/foo, the browser returns the content for foo, and displays the url http://example.com/quux/foo.

If the link is http://example.com/quux/foo.php, the browser returns the content for foo, and displays http://example.com/quux/foo.php.

Same thing, if the link calls for http://example.com/quux/foo.html, browser returns the content for that file (if it's there), and shows the foo.html in the url.

lucy24

8:21 pm on Mar 25, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Both."

There's no such thing as "returns the content of foo" since "foo" as such isn't a real, physical file. All your physical files are either .php or .html, right? Is there a third extension we haven't talked about yet?

I've downloaded the full file but haven't fine-tooth-combed it yet. As I suspected, there's more in your htaccess than has been discussed in this thread.

Daitya

12:32 am on Mar 26, 2014 (gmt 0)

10+ Year Member



No 3rd extension. : )