RewriteCond & RewriteRule ignores suffix?

mod_rewrite

         

iantresman

12:42 pm on Jan 30, 2011 (gmt 0)

10+ Year Member



Requirement: to convert a virtual static URL (with a .html suffix, in the root) to a real dynamic web address, e.g.:

/keyword.html -> /scripts/search.cgi?query=keyword


My quirky .htaccess mod_rewrite:

RewriteEngine On 
RewriteCond %{REQUEST_URI} \.html$
RewriteRule ^([^/]*)\.html$ /intro/search.cgi?query=$1 [L]


The problems: (1) URLs that have other suffixes are also rewritten, e.g. keyword.xyz, in which case $1 becomes the word "missing"; (2) subdirectories are also processed, e.g. /docs/nonkeyword.htm (but I suspect this is a variation of (1)).

I thought that RewriteRule is processed only if RewriteCond is true, and according to the RegEx Tester [regextester.com], \.html$ should match only .html suffixes?

Hence, I want this to work only on URLs such as /keyword.html that exist in the root, and nowhere else. What have I misunderstood? My server is using Apache 2.

g1smd

9:03 pm on Jan 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The above rule is not responsible for that behaviour.

There are some other Apache options you should turn off or disable.

iantresman

9:22 pm on Jan 30, 2011 (gmt 0)

10+ Year Member



That would explain a lot. Now I just have to track down the offending Apache configuration directive [httpd.apache.org].

It's also odd, because if I turn off the RewriteEngine, or remove the RewriteCond & RewriteRule, then the problem does not appear to happen. So perhaps the directive is one that affects either the RewriteEngine, or how .htaccess deals with RewriteOptions?

iantresman

10:08 pm on Jan 30, 2011 (gmt 0)

10+ Year Member



I tracked down some likely culprits, but disabling them does not seem to help. I added the following to my .htaccess file:

AcceptPathInfo Off 
Options -MultiViews
options +FollowSymlinks


I tried disabling it with -FollowSymlinks, but then I just get the Apache 2 Test Page. I see there are some other posts [google.co.uk] on the subject too.

g1smd

10:52 pm on Jan 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, AcceptPathInfo and MultiViews were the settings I was struggling to remember the names for.

The two "Options" should appear on a single line.

iantresman

11:30 pm on Jan 30, 2011 (gmt 0)

10+ Year Member



Thanks for the tip. I've tried them, but unfortunately they don't seem to make a difference, though I'm sure MultiViews is the cause.

jdMorgan

2:53 am on Feb 1, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



With the redundant RewriteCond removed, just the following should work:

AcceptPathInfo off
Options +FollowSymLinks -MultiViews
#
RewriteEngine On
#
RewriteRule ^([^/]+)\.html$ /intro/search.cgi?query=$1 [L]
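
An annotated reading of the rule above may help show why it should not touch other suffixes or subdirectories. This is a sketch that assumes the .htaccess file sits in the document root; in that per-directory context mod_rewrite tests the pattern against the URL-path with the leading slash removed. Note also that + (unlike the * in the original rule) requires at least one character, so a bare /.html cannot produce an empty query value.

```apache
# Assumption: this file is the .htaccess in the document root, so the
# request /keyword.html is presented to the pattern as "keyword.html".
RewriteEngine On

# ^         start of the path
# ([^/]+)   one or more non-slash characters, captured as $1;
#           "docs/page.html" contains a slash, so subdirectories do not match
# \.html$   a literal ".html" at the very end; "keyword.xyz" does not match
RewriteRule ^([^/]+)\.html$ /intro/search.cgi?query=$1 [L]
```

The rewritten target, intro/search.cgi, contains a slash and does not end in .html, so the rule should not match its own output when the per-directory rules are processed again.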

In your error example above, the word "missing" is likely being inserted by your script upon finding the value for the name "query" to be blank.

If AcceptPathInfo and Content-Negotiation are already disabled, then you've likely got some other rule somewhere that's interfering. Also, look for 'sneak paths' such as another script that "includes" your search.cgi script.

Jim

iantresman

8:58 pm on Feb 1, 2011 (gmt 0)

10+ Year Member



Thanks for that. I reduced my .htaccess file to only the four lines you suggested, but mistyped URLs still produce the "missing" substitution, and I have no other active .htaccess files.

Is it possible that the Options -MultiViews line is not disabling Content-Negotiation, and if so, how could this be tested? I don't have root access to the server. I tried looking through phpinfo() for a clue.

jdMorgan

7:47 pm on Feb 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Is it possible that Options -MultiViews line is not disabling Content-Negotiation...

No, unless the server installation is corrupted. That is not very likely, and it is even less likely to produce only such a subtle symptom.

> in which case, how could this be tested?

Comment out the rule, then request a few URLs such as foo.html and /intro/search.cgg?query=something-valid-here and /intru/search.cgi?query=something-valid-here
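
As a sketch of that test setup, assuming the reduced .htaccess from earlier in the thread, the file might look like this while testing:

```apache
AcceptPathInfo Off
Options +FollowSymLinks -MultiViews

RewriteEngine On

# Rule temporarily disabled for the test. If foo.html or the misspelled
# script URLs still reach search.cgi, something other than this rule is
# sending the requests there.
# RewriteRule ^([^/]+)\.html$ /intro/search.cgi?query=$1 [L]
```

With the rule commented out, all three test requests would normally be expected to return a 404.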

If such nonce and misspelled requests still invoke your script, you can be sure that some other "agency" is rewriting these requests to your script. That could be server config code (ask your host), content-negotiation, AcceptPathInfo, mod_speling, mod_dir (for extensionless URLs only), mod_proxy, and possibly others which I've forgotten here.

If you're sure that none of your own config code or scripts is doing any rewrites or redirects, then a call to your host would be in order.
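
Before calling the host, the candidates named above that can be controlled from .htaccess could be switched off together. This is only a sketch, and it assumes the host's AllowOverride settings permit these directives; if they do not, the server will typically answer with a 500 error, which is itself a useful clue.

```apache
# Assumption: AllowOverride permits Options and FileInfo directives here.
AcceptPathInfo Off
Options -MultiViews

# mod_speling "corrects" misspelled URLs; only touch it if it is loaded.
<IfModule mod_speling.c>
    CheckSpelling Off
</IfModule>
```

mod_proxy and server-level rewrite rules cannot be inspected or disabled this way, which is where the host comes in.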

Jim