trying to redirect a php file without an extension

using .htaccess to remove the extension from a specific file

         

jendead

1:01 am on Feb 21, 2010 (gmt 0)

10+ Year Member



I apologize if this has already been asked and answered, but I am very much an .htaccess newbie and have tried to search. The problem is I'm not really sure what I'm trying to find, and info that I thought would help me just confused me more.

So, here goes...

I added "pretty links" to a website like so:
http://example.com/view.php became http://example.com/view
http://example.com/view.php?id=123 became http://example.com/view/123/title-of-post

I used these rules:
RewriteRule ^view$ view.php
RewriteRule ^view/([0-9]+)/[a-zA-Z0-9-]+$ view.php?id=$1 [QSA]

The problem is, view.php is still accessible and is skewing my Google Analytics results. I know I need to somehow redirect view.php to /view, but I get infinite loop errors when I try.

How do I fix this?

g1smd

1:54 am on Feb 21, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, you need to redirect the parameter URLs to the 'pretty' version. Use a RewriteCond to test that the requested value is one from a client and not from a previous internal rewrite. The redirect should contain protocol and domain name in the redirect target, and [R=301,L] flags.

List those redirects first.

List your canonical non-www to www redirect next.

List your internal rewrites last.

Make sure that EVERY rule (every redirect, every rewrite) has the [L] flag added to it.
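
A minimal sketch of that ordering, assuming the canonical host is www.example.com. The t parameter name is hypothetical, and the patterns are illustrative rather than a drop-in file, so check them against the mod_rewrite documentation before use:

```apache
RewriteEngine On
RewriteBase /

# 1. External redirects first. THE_REQUEST holds the request line the
#    client actually sent (e.g. "GET /view.php HTTP/1.1") and is not
#    changed by internal rewrites, so this rule cannot loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /view\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^view\.php$ http://www.example.com/view [R=301,L]

# 2. Canonical non-www to www redirect next.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 3. Internal rewrites last (no R flag, so the browser URL is unchanged).
RewriteRule ^view$ view.php [L]
RewriteRule ^view/([0-9]+)/([a-z0-9-]+)$ view.php?id=$1&t=$2 [QSA,L]
```

The infinite loop usually comes from redirecting view.php to /view without the THE_REQUEST test: the internal rewrite turns /view back into view.php, which then matches the redirect again.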

Your A-Za-z can be sped up by using just A-Z and the [NC] flag.

There is a FATAL flaw with your rewrite. You are only checking the $1 value. You should also pass $2 to your script and the very first thing your script should do is check that the value of $2 is correct for the $1 page number that was requested.

That is, you have a page called /view/71236/this-great-widget and someone links to you with example.com/view/71236/dangerous-radioactive-stolen-widget-will-make-your-head-explode. Your site will happily serve the 71236 product page with '200 OK' status. Google will index the content at that URL as a duplicate, and it might just outrank the real URL for that page.

To fix it, once the $2 value is looked up in the database, compare what it should be with what was actually requested and if they do NOT match, issue a 301 redirect to the correct URL for the content.

It's less than 10 lines of code in your script but it closes a fatal hole that a competitor could use to put you OUT of business.

As for the whole redirect/rewrite process, I have seen it described and coded in full several times this month, so check recent threads as well as the forum charter for other examples.

jendead

7:08 pm on Feb 22, 2010 (gmt 0)

10+ Year Member



At risk of sounding like a total idiot, I'm still not getting it. I'm hoping it'll click eventually.

I don't understand how to use RewriteCond and everything I read just makes me even more confused. I've only ever used a copy/paste snippet to make sure the www version is always being served.

I'm not even sure how it works - is it like when you do RewriteCond, only the next line applies to it?

I tried poking around the .htaccess file but now I'm getting server misconfiguration errors. I need to be careful because Mondays are high traffic for the site so I can't go crazy with trial-and-error.

Thanks for the heads up on the $1/$2 issue. That's something I need to fix within my PHP script and not .htaccess, correct?

Here's my full .htaccess file, the forum chews up part of the code:
[pastebin.com...]

jendead

7:21 pm on Feb 22, 2010 (gmt 0)

10+ Year Member



I found a snippet in another thread that worked:
[webmasterworld.com...]

I put it in before the one that makes sure the URL has www in it.

So aside from the flaw g1smd mentioned, is the rest of the .htaccess file OK or is there anything else I need to fix?

(I addressed the a-zA-Z thing which isn't reflected in the pastebin)

g1smd

7:57 pm on Feb 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your code:

Options +FollowSymLinks
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_HOST} ^example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteRule ^view$ view.php [L]
RewriteRule ^view/([0-9]+)/[a-zA-Z0-9-]+$ view.php?id=$1 [QSA, L]
RewriteRule ^character$ char.php [L]
RewriteRule ^character/([0-9]+)/[a-zA-Z0-9-]+$ char.php?id=$1 [QSA, L]
Redirect 301 /othersite/ http://www.othersite.com/ [L]


<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
Header set Expires "Wed, 12 Dec 2012 12:12:12 GMT"
</FilesMatch>

g1smd

8:14 pm on Feb 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Paste the code here, and make sure all rules use RewriteRule.
Don't use Redirect, recode it using RewriteRule and move it to the very beginning.
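
As a rough sketch of that recoding, assuming /othersite/ should keep forwarding whatever path follows it (which is how Redirect behaves):

```apache
# Before (mod_alias). Redirect does not accept mod_rewrite flags such as [L].
# Redirect 301 /othersite/ http://www.othersite.com/

# After (mod_rewrite), placed ahead of all other rules.
# In .htaccess the pattern is matched without the leading slash.
RewriteRule ^othersite/(.*)$ http://www.othersite.com/$1 [R=301,L]
```

Keeping every rule in mod_rewrite means the order in the file is the order in which the rules are tried; when mod_alias and mod_rewrite directives are mixed, the file order does not decide which module acts first.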


A RewriteCond applies only to the next RewriteRule.
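
To illustrate the scoping, a small sketch. The second condition is added here only to show stacking and is not part of your file:

```apache
# Both conditions belong to the RewriteRule directly below them,
# and both must match (they are ANDed by default).
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# No RewriteCond precedes this rule, so it is tested on every request.
RewriteRule ^view$ view.php [L]
```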


Put a blank line after each RewriteRule to make the code easier to read; add a comment before each code block explaining what the next few lines do.


Change the non-www RewriteCond from this:
RewriteCond %{HTTP_HOST} ^example\.com$

to this:
RewriteCond %{HTTP_HOST} !^www\.example\.com$



This rewrite is dangerous:
RewriteRule ^character/([0-9]+)/[a-zA-Z0-9-]+$ char.php?id=$1 [QSA, L]

You need to pass $2 to your script and the script needs to check the value.
RewriteRule ^character/([0-9]+)/([a-z0-9-]+)$ /char.php?id=$1&param=$2 [QSA,L]


If you do not, you have an infinite duplicate content problem.

Also, I'd try to ensure all URLs use only all lower-case lettering.

jendead

9:08 pm on Feb 22, 2010 (gmt 0)

10+ Year Member



Ok, I already have a sanitizing script in place that removes all punctuation and forces lowercase when generating links so that should be good. Here's what my .htaccess looks like now:

Options +FollowSymLinks
RewriteEngine on
RewriteBase /

# make sure we're on the right site
RewriteRule ^othersite http://www.othersite.com/ [R=301,L]

# kills .php extension
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.]+\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.php$ http://www.example.com/$1 [R=301,L]

# makes sure www is before the domain name
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# makes view and character go to the proper scripts
RewriteRule ^view$ view.php [L]
RewriteRule ^character$ char.php [L]

# makes pretty links work
RewriteRule ^view/([0-9]+)/[a-z0-9-]+$ view.php?id=$1&t=$2 [L, QSA]
RewriteRule ^character/([0-9]+)/[a-z0-9-]+$ char.php?id=$1&t=$2 [L, QSA]

# these files expire when the world ends!
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
Header set Expires "Wed, 12 Dec 2012 12:12:12 GMT"
</FilesMatch>


So hopefully I put everything in the right order this time? Thank you so much for your help so far. :)

g1smd

9:14 pm on Feb 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



# makes pretty links work

You'll need parentheses in both patterns in order to store the $2 backreference.

At present $2 will always be blank.

jendead

9:18 pm on Feb 22, 2010 (gmt 0)

10+ Year Member



Like this?
RewriteRule ^view/([0-9]+)/([a-z0-9-]+)$ view.php?id=$1&t=$2 [L, QSA]

jendead

9:22 pm on Feb 22, 2010 (gmt 0)

10+ Year Member



Ok, here's the revised one, it's currently causing a 500 Internal Server Error... I don't know why.

Options +FollowSymLinks
RewriteEngine on
RewriteBase /

# make sure we're on the right site
RewriteRule ^myothersite http://www.myothersite.com/ [R=301,L]

# kills .php extension
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.]+\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.php$ http://www.example.com/$1 [R=301,L]

# makes sure www is before the domain name
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# makes view and character go to the proper scripts
RewriteRule ^view$ view.php [L]
RewriteRule ^character$ char.php [L]

# makes pretty links work
RewriteRule ^view/([0-9]+)/([a-z0-9-]+)$ view.php?id=$1&t=$2 [L, QSA]
RewriteRule ^character/([0-9]+)/([a-z0-9-]+)$ char.php?id=$1&t=$2 [L, QSA]

# these files expire when the world ends!
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
Header set Expires "Wed, 12 Dec 2012 12:12:12 GMT"
</FilesMatch>

jdMorgan

3:05 am on Feb 23, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> it's currently causing a 500 Internal Server Error... I don't know why.

Your server will tell you why, if you look at the server error log file...

Possibly invalid syntax in [L, QSA] ... Mod_rewrite coding is not free-form, and makes no allowance for 'style'; we all have to type it exactly as shown in the Apache documentation.

Delete the space, and just use [QSA,L] (two instances in your code).
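
For reference, a sketch of the two forms side by side, using one of your rules. The explanation in the comment is my understanding of how Apache splits directive arguments on whitespace, so confirm it against your error log:

```apache
# Broken: the space turns "[L," and "QSA]" into two separate arguments,
# so the directive no longer parses and the whole .htaccess fails (500).
# RewriteRule ^view/([0-9]+)/([a-z0-9-]+)$ view.php?id=$1&t=$2 [L, QSA]

# Working: one bracketed, comma-separated flag list with no spaces.
RewriteRule ^view/([0-9]+)/([a-z0-9-]+)$ view.php?id=$1&t=$2 [QSA,L]
```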

I'd suggest you replace that mod_headers section at the end with mod_expires, and then use the "Expires <time in seconds> after last access" notation to expire your images after a month. There's little to no benefit in trying to get clients to cache things for longer than that. Your css and js files *might* change, at which point your "2012" date becomes a major problem. That fixed 2012 date is also a time-bomb in itself: if you forget to update it in two years, all of your expires times will suddenly be in the past, and your server load will skyrocket, with clients requesting updated versions of those files with every page request...

Jim

jendead

4:14 am on Feb 23, 2010 (gmt 0)

10+ Year Member



Hmm... my log file is about 40 megs worth of this:
[22-Feb-2010 22:37:37] PHP Warning: PHP Startup: Unable to load dynamic library '/usr/local/lib/php/extensions/no-debug-non-zts-20060613/htscanner.so' - /usr/local/lib/php/extensions/no-debug-non-zts-20060613/htscanner.so: cannot open shared object file: No such file or directory in Unknown on line 0


Am I looking at the right log?

The thing that kills the PHP extension is breaking other parts of the site. I think maybe I should just have two rules that only apply to view.php and char.php. I am pretty sure I will do it wrong but would it be something like this (and just repeat for char.php)?

RewriteCond %{THE_REQUEST} !^http:\/\/www.example.com\/view\.php\?id=$1&title=$2\HTTP/
RewriteRule ^view.php$ http://www.example.com/view$1 [QSA,L]


I tried replacing that line with "Header set Expires 604800 after last access" but that threw another 500. I just removed the expires stuff completely so the rest of it would work.

And by the way, fixing [QSA,L] worked so thank you for that!

PS. I'd just like to say that I am very pleasantly surprised by this forum and how patient you guys are with people like me. Hopefully someday I will understand enough that I can help other people who are as confused as I am :)

jdMorgan

2:41 pm on Feb 23, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Use the power of regular expressions. Just one rule will do:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(view|char)\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^(view|char)\.php$ http://www.example.com/$1 [R=301,L]

An alternative is to exclude the troublesome directory- and file-paths from this redirect. Base your choice on whether it's easier maintenance-wise to include URLs that should be redirected, or to exclude URLs that should not be redirected.
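
One caveat, going by the internal rewrites posted earlier in the thread: char.php is published as /character, not /char, so the combined rule would send char.php requests to a URL that no rewrite matches. A sketch of both approaches with that in mind (the admin and includes directory names are hypothetical; use one approach, not both):

```apache
# Include approach: list only the scripts that have pretty URLs.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /view\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^view\.php$ http://www.example.com/view [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /char\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^char\.php$ http://www.example.com/character [R=301,L]

# Exclude approach: keep the generic rule, but skip paths that break.
# RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.]+\.php(\?[^\ ]*)?\ HTTP/
# RewriteCond %{REQUEST_URI} !^/(admin|includes)/
# RewriteRule ^(([^/]+/)*[^.]+)\.php$ http://www.example.com/$1 [R=301,L]
```

Note that any query string is carried over by default, so /view.php?id=123 would land on /view?id=123. Sending that request on to the full /view/123/title-of-post form needs the script's help, since only the script knows the title.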

Character-escaping rules in .htaccess differ from those in Perl, PHP, etc. Unless you want your site off-line (or worse, subtle errors that cause it to fail intermittently and/or destroy your search rankings), refer to the mod_rewrite documentation at apache.org and make sure you've got your syntax exactly right! Chances of success using only copy-and-paste plus "guessing" are essentially zero.

mod_expires:

ExpiresActive On
#
# Images, media, video - No cache revalidation, expire after 30 days
# (Filetypes listed in order of frequency of access based on stats)
<FilesMatch "\.(gif|jpg|ico|png|jpeg?|pdf|xls|avi|flv|wmv|swf|mov|smi)$">
Header unset Cache-Control:
ExpiresDefault A2592000
</FilesMatch>

If mod_expires is available, that should work. But again, see the mod_expires, mod_headers, and core-directives documentation at apache.org. This is server configuration code, and any tiny error may be disastrous...
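
If the numeric form is hard to read, mod_expires also accepts a worded syntax. A sketch, assuming you want a month for images and a shorter period for css (the periods are illustrative, not a recommendation for your site):

```apache
# The IfModule wrapper skips the block when mod_expires is not loaded,
# instead of failing the whole .htaccess with a 500 error.
<IfModule mod_expires.c>
    ExpiresActive On
    ExpiresByType image/gif  "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType image/png  "access plus 1 month"
    ExpiresByType text/css   "access plus 1 week"
</IfModule>
```

These are mod_expires directives, which is why a line like "Header set Expires 604800 after last access" fails: Header belongs to mod_headers and simply sets whatever literal value it is given.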

You should also have a word with your host about those "missing module" errors in your log file. Your server error log file should be empty, except for entries created due to your intentional blocking of accesses by hotlinkers, bad-bots, and unwelcome IP address ranges. Having all those missing-module errors cluttering things up is unacceptable; either the module should be installed and enabled, or all of the references to it should be removed so that these errors won't be triggered.

Jim

jendead

7:58 pm on Feb 24, 2010 (gmt 0)

10+ Year Member



Thanks so much - you guys have been incredibly helpful. I'm going to do more research before I try the expiration thing. I've e-mailed my host about the message but they aren't responding... trying again.