Rewrite rule with url decoding

         

drewjuk

9:52 am on Jan 26, 2012 (gmt 0)

10+ Year Member



Hi,

I have been trying to learn how to set up rewrite rules in .htaccess, but I can't seem to get it to work when part of a string is URL-encoded. Can anybody help, please?


Options -Indexes
Options +FollowSymLinks

#Rewrite engine
RewriteEngine on
RewriteRule ^product/(\w+)/([0-9]+)$ /index.php?product=$1&pid=$2&sp=1


In product=$1, the $1 string could contain any letters or numbers and is URL-encoded. Any ideas how to get the rule to read it?

Thanks for your help

drewjuk

11:30 am on Jan 31, 2012 (gmt 0)

10+ Year Member



I cannot seem to get this to work no matter what I try:

ErrorDocument 404 index.php?pg=error&type=$404 [L]

This is what it should be:
ErrorDocument 404 errors/404 [L]

Even with all other rules commented out I still get a 500 error!

drewjuk

3:17 pm on Jan 31, 2012 (gmt 0)

10+ Year Member



I am not sure what happened, but when I uploaded this it crashed my server, or it was just a coincidence. I have no problem locally with the .htaccess. Any ideas?

Options -Indexes
Options +FollowSymLinks

#Rewrite engine
RewriteEngine on

#Main Pages
RewriteRule ^page/([-0-9a-zA-Z_]+)/([0-9]+)$ Pages/$1.php?pid=$2 [L]
RewriteRule ^content/([-0-9a-zA-Z_]+)$ index.php?pg=$1 [L]
RewriteRule ^content/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?pg=$1&toe=$2 [L]
RewriteRule ^content/contact_gbforklifts/([0-9]+)/([0-9]+)/([0-9]+)$ index.php?pg=contact_gbforklifts&toe=$2&pid=$1&buy=$3 [L]

#News Articles
RewriteRule ^news/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?pg=news&article=$1&nid=$2 [L]

#Shop
RewriteRule ^categories/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?cat=$1&cid=$2 [L]
RewriteRule ^categories/([-0-9a-zA-Z_]+)/([0-9]+)/([0-9]+)$ index.php?subcat=$1&scl=$2&cid=$3&child=2 [L]
RewriteRule ^categories/subcat/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?subcat=$1&cid=$2&spl=1 [L]
RewriteRule ^products/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?product=$1&pid=$2&sp=1 [L]
#End

#BLOCK ANY PHP REQUESTS
#RewriteRule ^(.*)\.php errors/404 [L]
#End

#Create 404 Error Page
RewriteRule ^errors/([0-9]+)$ index.php?pg=error&type=$1 [L]
#End

#Forward 404 Errors to Error Page
#ErrorDocument 404 index.php?pg=error&type=$404 [L]
#End

#Forward sitemap
RewriteRule sitemap\.xml site-map.php [L]
#End

#REDIRECT ANY PHP REQUEST TO HTM
RewriteRule ^([-0-9a-zA-Z_]+)\.htm$ $1.php [nc] [L]
#End

lucy24

9:30 pm on Jan 31, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Homing in on the most glaring problem first...
#REDIRECT ANY PHP REQUEST TO HTM

RewriteRule ^([-0-9a-zA-Z_]+)\.htm$ $1.php [nc] [L]

This rule does not redirect php to htm. It REWRITES htm to php. This happens to be what you want. But your notation implies that you're unclear on two things:

-- the difference between a Rewrite and a Redirect
-- the direction of the change

#BLOCK ANY PHP REQUESTS

#RewriteRule ^(.*)\.php errors/404 [L]

Yes, you had better leave that out.
#1 intercepting all php requests will prevent your site from functioning, because the Rule is not constrained to THE_REQUEST (requests coming in from "outside")
#2 you are rewriting to a file that doesn't exist ("real" files must end in either slash for a directory or an extension for a file)
#3 never rewrite directly to an Error Document. Serve up the error and let the document serve itself.
#4 never use .* anywhere but the end of a pattern
#5 "block" is 403, flag [F]. 404 is "file not found". A 404 does not belong in your htaccess at all; it happens by itself. Part of the purpose of an htaccess is to prevent 404s by dealing with requests in other ways such as [G] or redirection.

How did we get from URL decoding (thread title) to htm/php rewrites?

g1smd

9:48 pm on Jan 31, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



[nc] [L] is not valid. Use [NC,L] instead.


ErrorDocument 404 index.php?pg=error&type=$404 [L]

Do not add [L] to ErrorDocument directives.

A 404 error will show the content at
/index.php?pg=error&type=04


Notice the
type=04
part. This occurs because the $4 backreference you accidentally used in the rewrite target will be blank.


ErrorDocument 404 errors/404 [L]

Do not add [L] to ErrorDocument directives.

The above rule will attempt to show content from a file called "404" (no extension) in a folder "errors" when there is a 404 error.
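
For comparison, a minimal sketch of the directive without the flag. It assumes the error page is served by /index.php at the site root; the leading slash is an assumption about the layout, not something stated above. Per the Apache documentation, ErrorDocument accepts a local URL-path beginning with a slash, a full URL, or a literal message, and it takes no rewrite-style flags.

```apache
# Sketch only: a local error document is written as a URL-path
# starting with "/" and has no [L] or other flags after it.
ErrorDocument 404 /index.php?pg=error&type=404
```

A value that neither starts with a slash nor is a full URL is treated as message text, so the visitor would see the string itself rather than the page.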

drewjuk

9:00 am on Feb 1, 2012 (gmt 0)

10+ Year Member



Hi,

Right, OK, thanks for that info; that helped a lot. Now everything is working as it should except the ErrorDocument 404, which just shows the URL it is supposed to go to. I have used this elsewhere and not had a problem with it. If I put the / for the base URL, it takes me to the Apache home page on my local XAMPP setup, so it is sort of working.

Options -Indexes
Options +FollowSymLinks

#Rewrite engine
RewriteEngine on

#Main Pages
RewriteRule ^page/([-0-9a-zA-Z_]+)/([0-9]+)$ Pages/$1.php?pid=$2 [L]
RewriteRule ^content/([-0-9a-zA-Z_]+)$ index.php?pg=$1 [L]
RewriteRule ^content/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?pg=$1&toe=$2 [L]
RewriteRule ^content/contact_gbforklifts/([0-9]+)/([0-9]+)/([0-9]+)$ index.php?pg=contact_gbforklifts&toe=$2&pid=$1&buy=$3 [L]

#News Articles
RewriteRule ^news/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?pg=news&article=$1&nid=$2 [L]

#Shop
RewriteRule ^categories/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?cat=$1&cid=$2 [L]
RewriteRule ^categories/([-0-9a-zA-Z_]+)/([0-9]+)/([0-9]+)$ index.php?subcat=$1&scl=$2&cid=$3&child=2 [L]
RewriteRule ^categories/subcat/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?subcat=$1&cid=$2&spl=1 [L]
RewriteRule ^products/([-0-9a-zA-Z_]+)/([0-9]+)$ index.php?product=$1&pid=$2&sp=1 [L]
#End

#Create 404 Error Page
RewriteRule ^errors/([0-9]+)$ index.php?pg=error&type=$1 [L]
#End

#PHP REQUESTS = ERROR PG
RewriteRule ^([-0-9a-zA-Z_]+)\.php index.php?pg=error&type=404 [L]
#End

#Rewrite sitemap for SEO
RewriteRule sitemap\.xml site-map.php [L]
#End

#REWRITE ANY PHP REQUEST TO HTM
RewriteRule ^([-0-9a-zA-Z_]+)\.htm$ $1.php [NC,L]
#End

#Forward 404 Errors to Error Page
ErrorDocument 404 index.php?pg=error&type=404
#End


Is it something to do with the size of the file? I think I remember 404 pages had to be a certain size or something for it to work (maybe just for IE 6)?

Thanks a lot for your help, much appreciated :)

lucy24

11:23 am on Feb 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If the error document is smaller than, I think, 512 bytes (almost said 512K, oops), some versions of MSIE will show you a default instead. But you'd have to really try to make an error document that small.

I gotta say I have serious reservations about all this error-document business. There's a finite number of errors that can really occur, and only so many things you can do about them. Why can't you simply make each ErrorDocument a static page? There should definitely not be any reason for rewriting. Rewrites are for when you want the user's address bar to show one thing while the screen is showing content from somewhere else. But an error document is itself a special kind of rewrite; the user will never see its "real" address anyway, just the address they wanted to go to.

A 404 error will show the content at /index.php?pg=error&type=04

Notice the type=04 part. This occurs because the $4 backreference you accidentally used in the rewrite target will be blank.

Eek, that reminds me. I recently had an Issue with my text editor because I wanted it to use something like $11-- that is, capture #1 followed by literal number 1. It refused, apparently thinking I wanted to use capture #11 which of course doesn't exist. (Limit is 9.) If I can find a way to test this in htaccess without making my site shut down, I will do so.

drewjuk

12:56 pm on Feb 1, 2012 (gmt 0)

10+ Year Member



The index.php is just a page that pulls in the big pages (template.php etc.) with includes, so index.php itself is a tiny page with about 4 lines of code in it.

I am on the latest Mozilla Firefox, so it shouldn't be an issue; I just remember it being a problem years ago!

P.S. Just tested this: "ErrorDocument 404 /index.php?pg=error&type=404" on my server and it worked fine, so it must be something to do with XAMPP.

But I do have issues with this now: "RewriteRule ^([-0-9a-zA-Z_]+)\.php index.php?pg=error&type=404 [L]"

It seems to stop the other rules working, for example RewriteRule ^content/([-0-9a-zA-Z_]+)$ index.php?pg=$1 [L]. I tried changing index.php?pg= to index.htm?pg= but neither seems to work.

huff you fix 1 problem you get another!

g1smd

7:54 pm on Feb 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



#REWRITE ANY PHP REQUEST TO HTM
RewriteRule ^([-0-9a-zA-Z_]+)\.htm$ $1.php [NC,L]

The comment is incorrect. It does not do that.

It rewrites any incoming .htm request to fetch content from the .php file.


I'm not sure what you're thinking here:
#Create 404 Error Page
RewriteRule ^errors/([0-9]+)$ index.php?pg=error&type=$1 [L]

What it says is this: "if there is an external request for this URL:
example.com/errors/404
then show the content at
/index.php?pg=error&type=404"


When your ErrorDocument directive is triggered it does not make an "external" request for a URL, so this rule will never run. The ErrorDocument triggers an internal filesystem request using this directive:
#Forward 404 Errors to Error Page
ErrorDocument 404 index.php?pg=error&type=404

drewjuk

9:28 pm on Feb 1, 2012 (gmt 0)

10+ Year Member




#Create 404 Error Page
RewriteRule ^errors/([0-9]+)$ index.php?pg=error&type=$1 [L]

This rule works fine; there is an external call for it. I have created a function which redirects to errors/404 or 310 etc. when a page or product does not exist any more.

Yes, the comment is the wrong way round. Thanks for pointing that out!

g1smd

9:53 pm on Feb 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You must not "redirect" to an error page. This invents infinite URL space and will be reported as "soft 404 errors" in WMT.

When there is a page or product that no longer exists you must return the 404 status code at the originally requested URL.

lucy24

12:39 am on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What it says is this: "if there is an external request for this URL: example.com/errors/404 then show the content at /index.php?pg=error&type=404"

When your ErrorDocument directive is triggered it does not make an "external" request for a URL, so this rule will never run.

But it's not constrained to external requests. That was exactly the problem I had recently with infinite redirects. The site was trying to serve up an error document-- ending in html, like everything else-- but kept running into code that blocked html requests from the person getting the 403. And similarly with .php: if I block all php requests, it also blocks auto-indexing because that uses php internally.*

drewjuk, the above is tangential. I think g1 and I agree that your whole error-handling routine is wonky. And it's drawing attention away from your other redirect/rewrite issues.


* I'm still figuring out how to block unwanted robots from piwik.php, which is handled as an external request. They get there via piwik.js, which can't be whitelisted.

drewjuk

8:59 am on Feb 2, 2012 (gmt 0)

10+ Year Member



Hi,

Right, OK, I never knew that. I had changed the headers on forwarding.

The trouble is that the page which is not there any more actually still exists; it's the dynamic result from the database which is not there any more. So if you had index.php?pg=1&val=1 but val 1 did not exist any more, you would have to do a redirect with a header change on the way, or the visitor will just get a page with an error on it?

Does that make any sense?

Thanks for your help

drewjuk

10:40 am on Feb 2, 2012 (gmt 0)

10+ Year Member



The headers come back correct for the forwarding pages when checking with on-line header tools and firefox header tools:

Options -Indexes
Options +FollowSymLinks

#Rewrite engine
RewriteEngine on

#Main Pages
RewriteRule ^page/([-0-9a-zA-Z_]+)/([0-9]+)$ /Pages/$1.php?pid=$2 [L]
RewriteRule ^content/([-0-9a-zA-Z_]+)$ /index.php?pg=$1 [L]
RewriteRule ^content/([-0-9a-zA-Z_]+)/([0-9]+)$ /index.php?pg=$1&toe=$2 [L]
RewriteRule ^content/contact_gbforklifts/([0-9]+)/([0-9]+)/([0-9]+)$ /index.php?pg=contact_gbforklifts&toe=$2&pid=$1&buy=$3 [L]

#News Articles
RewriteRule ^news/([-0-9a-zA-Z_]+)/([0-9]+)$ /index.php?pg=news&article=$1&nid=$2 [L]

#Shop
RewriteRule ^categories/([-0-9a-zA-Z_]+)/([0-9]+)$ /index.php?cat=$1&cid=$2 [L]
RewriteRule ^categories/([-0-9a-zA-Z_]+)/([0-9]+)/([0-9]+)$ /index.php?subcat=$1&scl=$2&cid=$3&child=2 [L]
RewriteRule ^categories/subcat/([-0-9a-zA-Z_]+)/([0-9]+)$ /index.php?subcat=$1&cid=$2&spl=1 [L]
RewriteRule ^products/([-0-9a-zA-Z_]+)/([0-9]+)$ /index.php?product=$1&pid=$2&sp=1 [L]
#End

#Search
RewriteRule ^search/ index.php?pg=search [L]
RewriteRule ^search/([-0-9a-zA-Z_]+)/([a-z]+)/([0-9]+)/([0-9]+)$ /index.php?pg=search&sq=$1&sort_price=$2&rpp=$3&pgn=$4 [L]
#End

#Create External Error Pages
RewriteRule ^errors/([0-9]+)$ /index.php?pg=error&type=$1 [L]
#End

#Rewrites PHP requests to 404 PG
#RewriteRule ^([-0-9a-zA-Z_]+)\.php /index.php?pg=error&type=404 [L]
#End

#Forward sitemap
RewriteRule sitemap\.xml site-map.php [L]
#End

#Rewrite any HTM requests to PHP
RewriteRule ^([-0-9a-zA-Z_]+)\.htm$ /$1.php [NC,L]
#End

#Forward Error Pages
ErrorDocument 404 /index.php?pg=error&type=404
ErrorDocument 301 /index.php?pg=error&type=301
ErrorDocument 410 /index.php?pg=error&type=410
ErrorDocument 500 /index.php?pg=error&type=500
ErrorDocument 401 /index.php?pg=error&type=401
#End

everything seems to work except #Search and #Rewrites PHP requests to 404 PG.

Thanks

lucy24

7:33 pm on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



everything seems to work except #Search and #Rewrites PHP requests to 404 PG.

Well, the second item is a non-problem. As noted eight or twelve times above, 404 and Rewrite should have nothing to do with each other.

#Search
RewriteRule ^search/ index.php?pg=search [L]
RewriteRule ^search/([-0-9a-zA-Z_]+)/([a-z]+)/([0-9]+)/([0-9]+)$ /index.php?pg=search&sq=$1&sort_price=$2&rpp=$3&pgn=$4 [L]

There's one glaring problem for starters. The first Rule doesn't have an ending $ anchor, so it will pick up everything beginning in search/

Simply putting the two Rules in the proper order-- the more complicated and specific one first-- will take care of part of the problem. Possibly even all of it; you don't say exactly what the problem was.
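
A sketch of that reordering, with the short rule anchored as well. It assumes the bare search page lives at exactly search/ (an assumption; the original pattern does not say what may follow the slash) and uses the leading-slash target style of the other rules.

```apache
# Sketch only: specific pattern first, general pattern last.
# Four-part search URL: query / sort order / results per page / page number
RewriteRule ^search/([-0-9a-zA-Z_]+)/([a-z]+)/([0-9]+)/([0-9]+)$ /index.php?pg=search&sq=$1&sort_price=$2&rpp=$3&pgn=$4 [L]

# Bare search page; the trailing $ stops it swallowing the longer URLs
RewriteRule ^search/$ /index.php?pg=search [L]
```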

g1smd

7:58 pm on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I mentioned earlier that some other rules need to change order.

These specific rules...

RewriteRule ^content/contact_gbforklifts/...


RewriteRule ^categories/subcat/...


...need to each go directly before the more general rules for each base folder.


You can also combine your two
Options
lines into one line.
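
Roughly, and only as a sketch built from the rules posted earlier in the thread (nothing here is new apart from the order and the merged Options line):

```apache
# Both options on one line
Options -Indexes +FollowSymLinks

RewriteEngine on

# content/: the fixed-name rule goes before the general ones
RewriteRule ^content/contact_gbforklifts/([0-9]+)/([0-9]+)/([0-9]+)$ /index.php?pg=contact_gbforklifts&toe=$2&pid=$1&buy=$3 [L]
RewriteRule ^content/([-0-9a-zA-Z_]+)/([0-9]+)$ /index.php?pg=$1&toe=$2 [L]
RewriteRule ^content/([-0-9a-zA-Z_]+)$ /index.php?pg=$1 [L]

# categories/: the subcat rule goes before the general ones
RewriteRule ^categories/subcat/([-0-9a-zA-Z_]+)/([0-9]+)$ /index.php?subcat=$1&cid=$2&spl=1 [L]
RewriteRule ^categories/([-0-9a-zA-Z_]+)/([0-9]+)/([0-9]+)$ /index.php?subcat=$1&scl=$2&cid=$3&child=2 [L]
RewriteRule ^categories/([-0-9a-zA-Z_]+)/([0-9]+)$ /index.php?cat=$1&cid=$2 [L]
```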

drewjuk

8:08 pm on Feb 2, 2012 (gmt 0)

10+ Year Member



Yes, sorry g1smd, I realised that earlier but forgot to repost! After reading your posts again I changed some rules around ("more specific ones first") and that has fixed all issues except blocking the php files!

Not sure how to overcome it. I don't mind internal requests; I just want to block users when they put .php in the browser!

There must be a way!

Thanks again for all your help!

g1smd

8:23 pm on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To block external .php requests add a RewriteRule that tests for \.php and before it a RewriteCond that tests for \.php in THE_REQUEST.

Use the [F] flag on the RewriteRule to block the request. The rule target is not a URL, just a hyphen for "no target".
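
A minimal sketch of that approach. The exact THE_REQUEST pattern is one possible way to write the test, not the only one, and it assumes the block sits above the other rewrite rules in the same .htaccess file.

```apache
# Sketch only: refuse .php URLs typed by the client, leave internal rewrites alone.
# THE_REQUEST is the original request line, e.g. "GET /index.php?pg=1 HTTP/1.1",
# and it does not change when a URL is rewritten internally.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^?\ ]*\.php[?\ ]
# "-" means no substitution; [F] answers with 403 Forbidden
RewriteRule \.php$ - [F]
```

A request for /content/foo that is rewritten to /index.php still passes, because its request line never contained .php.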

drewjuk

2:58 pm on Feb 6, 2012 (gmt 0)

10+ Year Member



Thanks for that:

Tried a few things but can't get it to work:

#Rewrites PHP requests to 404 PG
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^/([-0-9a-zA-Z_]+)*.php [L]
#End

This does not work for files which are not in main dir:

#Rewrite any HTM requests to PHP
#RewriteRule ^Pages/([-0-9a-zA-Z_]+)\.htm$ $1.php [NC,L]
RewriteRule ^([-0-9a-zA-Z_]+)\.htm$ $1.php [NC,L]
#End

It works for index.htm but not Pages/index.htm. Any ideas?

Thanks

lucy24

5:32 pm on Feb 6, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Each Rule is missing something. Not the same something. And the Pattern in Rule 1 has a mistake that will make it fail consistently. It's got a second mistake that could make it behave unexpectedly, but that's secondary.

Spend a little time puttering through earlier threads and/or the Apache Rewrite docs [httpd.apache.org]. Or just take a day off and look at the Rules again. It will probably hit you in the face too.