deliver 404 on ?= except when preceded by /store/<somepage.php>

mod_rewrite request token

rsleventhal

12:36 pm on Jul 20, 2011 (gmt 0)



Hi Folks,

I'm new here and have searched for 'the' answer to this using mod_rewrite, but so far can't seem to get the required regex.

Goal: deliver a 404 page if there's a query string (?=<something>) after a URL *except* when the URL is in or below the /store/ folder.

Currently [domain.com...] delivers the index.php page, not a 404. I *do* get a 404 if I put in a page that isn't there, but for some reason Apache 2 doesn't seem to parse the entire URL, and decides that a 404 isn't needed for index.php?=somerandomstring OR index.php/somerandomstring.

I'd be grateful for any pointers, suggestions or help.

thanks in advance,
-Ray

lucy24

6:47 pm on Jul 20, 2011 (gmt 0)



Short answer: The query string isn't part of the path that a RewriteRule pattern is matched against; by default it is simply ignored in rewrites, so I think Apache is actually doing what it's supposed to do. If you don't want it to do this, you have to grab anything with a query string:

RewriteCond %{QUERY_STRING} .

(meaning "if the query string contains anything whatsoever") and send it out to a PHP script that will determine if the query is valid.
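
A minimal sketch of that idea for a root-level .htaccess, assuming mod_rewrite is enabled and that a checker script lives at /check-query.php (the script name is hypothetical, not something from this thread):

```apache
RewriteEngine On

# Only act when the request carries a non-empty query string
RewriteCond %{QUERY_STRING} .
# Don't rewrite the checker itself, or the rule would match its own output
RewriteCond %{REQUEST_URI} !^/check-query\.php$
# Hand the request to the (hypothetical) checker; with no "?" in the
# target, the original query string is passed along unchanged
RewriteRule ^ /check-query.php [L]
```

The checker would then have to decide for itself whether to serve the page or return a 404; that logic is not shown here.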

One way to avoid unintended smileys ;) is to hide them inside "code" markup:

(?=<something>) after a URL


Oh, yes, and use example.com for the same reason. It doesn't get auto-converted into an active (and hence no longer readable) link.

rsleventhal

7:13 pm on Jul 20, 2011 (gmt 0)

Thank you *so* much. That not only makes sense, it's easy to implement.

I, of course, have a particular directory level which *should* accept query strings.

So I can't implement the rewrite condition globally without also telling Apache something like: when the URL has /store in it, it's OK to use the query string.

Example:
http://www.example.com/index.php?=somerandomstring

should produce a 404, but
http://www.example.com/store/index.php?main_page=index&cPath=4

should parse the query string and pull from the (zen cart) database to render the page.

Might you be able to provide the correct .htaccess syntax for that?

Thanks again for the tip!

-Ray

lucy24

8:02 pm on Jul 20, 2011 (gmt 0)



"When the URL has /store in it, it's ok to use the query string."

You need a rule that says (in English) if the request does not contain store/ (you can do that part yourself) but does contain a query string
RewriteCond %{QUERY_STRING} .

then call it a 404 rather than going to the queryless version of the page.
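
Put together, a rule of that shape might look like the sketch below. It assumes the rules go in the .htaccess at the site root, that mod_rewrite is enabled, and that everything under /store/ should be left alone; adjust the path test if your layout differs.

```apache
RewriteEngine On

# Leave /store and everything below it untouched
RewriteCond %{REQUEST_URI} !^/store(/|$)
# Only act when a non-empty query string is present
RewriteCond %{QUERY_STRING} .
# "-" means "no substitution"; R=404 makes Apache answer 404 Not Found
RewriteRule ^ - [R=404,L]
```

With this in place, a request like /index.php?=somerandomstring should get a 404, while /store/index.php?main_page=index&cPath=4 never reaches the rule. Test it on a staging copy first, since anything outside /store/ that legitimately relies on query strings would also be blocked.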

But there are a few more questions before we get to the "Dammit, just give me the gun!" Probably the most important one is where the superfluous query strings are coming from. Are there bad links floating around on the internet? Malformed links in your own site? Recent redesign?

Next: Do you really want your unsuspecting user to get hit with a 404 "ain't no such page"? It's trivial to simply remove the query string: if the rewrite target ends with a bare ? then the existing query string is dropped.
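
As a sketch of that alternative, again assuming a root-level .htaccess and the same /store exemption, a redirect that strips the query string could look like this:

```apache
RewriteEngine On

# Leave /store and everything below it untouched
RewriteCond %{REQUEST_URI} !^/store(/|$)
# Only act when a non-empty query string is present
RewriteCond %{QUERY_STRING} .
# Redirect to the same path; the trailing "?" discards the old query string
RewriteRule ^(.*)$ /$1? [R=301,L]
```

On Apache 2.4 and later, the QSD flag does the same job without the trailing question mark.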

If a request comes in for, say,
www.example.com/somerandompage?morestuffhere

does the user get sent by default to
www.example.com/index.whatever

or to
www.example.com/somerandompage

? I mean by default, if you don't send them somewhere else.

rsleventhal

8:14 pm on Jul 20, 2011 (gmt 0)



Valid questions. Most of this stems from a scan initiated by a company hired to find PCI compliance issues. I personally like the idea of the failed query string going to the page that is prepended to the query, but they say it is a hint of a vulnerability and so I'm stuck.

Good questions, though...thanks again!
-Ray