Backreferences not working in mod rewrite

         

camsoft

12:30 pm on Mar 27, 2007 (gmt 0)

10+ Year Member



Hi,

I've just reinstalled our web server, installing MySQL, Apache (with mod_rewrite) and PHP.

One of my scripts uses a .htaccess file and mod_rewrite to tidy up server URLs, as in the following example:

http://www.example.com/products/testproduct/ => http://www.example.com/index.php?__path=products/testproduct/

So the user requests the friendly URL and we send the virtual path "products/testproduct/" to a PHP script using a GET variable. I have been doing this using back-references. Since I reinstalled our web server, this no longer works.

The actual rule looks like this:

RewriteRule .*$ index.php?__path=$0 [L]

The rewriting is working, but the __path variable is empty when read from PHP. It seems that there are no back-references; $0 should return the whole regex match. All of my RewriteConds that use back-references are broken as well.

Any help would be greatly appreciated.

Cameron.

[edited by: jdMorgan at 1:23 pm (utc) on Mar. 27, 2007]
[edit reason] Example.com [/edit]

jdMorgan

1:28 pm on Mar 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The code is invalid: Allowed back-reference numbers are $1 through $9 for RewriteRule pattern back-references, and %1 through %9 for RewriteCond back-references.

I acknowledge that $0 can seem to work, but it is not in the specification and results are unpredictable across HTTP requests and server versions.

To create a back-reference, enclose the regular-expression pattern or the desired matching subpattern in parentheses.
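
As a minimal sketch of both kinds of back-reference (the "item" path and the "id" parameter are hypothetical and not taken from the rules in this thread; index.php and __path are from the original post, and the code is assumed to live in a .htaccess file):

```apache
RewriteEngine On

# $1 is the first parenthesized group in the RewriteRule pattern.
# A request for "products/testproduct/" gives $1 = "testproduct".
RewriteRule ^products/([^/]+)/?$ index.php?__path=products/$1/ [L]

# %1 is the first parenthesized group in the last matched RewriteCond pattern.
# A request for "item?id=42" gives %1 = "42" (hypothetical URL).
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
RewriteRule ^item$ index.php?__path=item/%1/ [L]
```

Both rule patterns are anchored so that they do not match the rewritten target, index.php.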

Jim

camsoft

1:55 pm on Mar 27, 2007 (gmt 0)

10+ Year Member



It's strange because this was working perfectly before I reinstalled the server. It also works on our hosting provider's servers.

I tried changing the regular expression to the following:

RewriteRule (.*)$ index.php?__path=$1 [L]

But it still does not work, and my __path variable still has no value when read from PHP.

jdMorgan

2:17 pm on Mar 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is your code in .htaccess?

If so, you will need to explicitly prevent recursion, which can also cause this problem:


RewriteCond $1 !^index\.php$
RewriteRule (.*) index.php?__path=$1 [L]

This prevents the rewrite from invoking itself (because having been rewritten to "index.php", the new URL-path still matches the ".*" pattern, and will be rewritten again).
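
To make the two passes concrete, here is the same rule annotated with a rough trace for the request from the first post. This is a sketch of the expected behaviour in a per-directory (.htaccess) context, not captured server output:

```apache
# Pass 1: URL-path is "products/testproduct/"
#   RewriteRule pattern (.*) matches, so $1 = "products/testproduct/"
#   RewriteCond: $1 is not "index.php", so the condition holds
#   Result: index.php?__path=products/testproduct/
#
# Pass 2: the rewritten URL-path "index.php" is run through .htaccess again
#   RewriteRule pattern (.*) matches, so $1 = "index.php"
#   RewriteCond: $1 is "index.php", so the condition fails and the rule is skipped
#   Result: the request stays as index.php?__path=products/testproduct/
RewriteCond $1 !^index\.php$
RewriteRule (.*) index.php?__path=$1 [L]
```

Without the RewriteCond, the second pass would be expected to apply the rule to "index.php" as well and replace the query string produced by the first pass.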

Invalid code may work, but that is no guarantee that it will continue to work. Depending on 'hacks' makes the code very fragile, and subject to breaking at the very next upgrade. I advise you not to use any technique not documented in the Apache documentation.

Jim

camsoft

2:26 pm on Mar 27, 2007 (gmt 0)

10+ Year Member



Yes, my code is in a .htaccess file. I was not aware I was using any hacks; I was just using standard regular-expression syntax, which seemed to work.

My .htaccess file looks like the following:

RewriteEngine On

RewriteCond %0 (^¦/)[^/\.]+$

RewriteRule .*$ http://%{HTTP_HOST}%{REQUEST_URI}/ [R,L]

RewriteCond %0 !^index.php$

RewriteCond %0 !^default.html$

RewriteCond %0 !^(.*).php(.*)$

RewriteCond %0 !^(.*).png(.*)$

RewriteCond %0 !^(.*).gif(.*)$

RewriteCond %0 !^(.*).jpg(.*)$

RewriteCond %0 !^(.*).jpeg(.*)$

RewriteCond %0 !^(.*)awstats.pl(.*)$

RewriteCond %0 !^(templates¦images¦includes¦uploads¦tools¦backoffice¦config¦setup)(/.*)?$

RewriteRule (.*)$ index.php?__path=$1 [L]

The RewriteConditions are to exclude certain file types from being rewritten.

jdMorgan

3:15 pm on Mar 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well then, there's no need for the anti-recursion RewriteCond I proposed, because the third RewriteCond in your current second ruleset should prevent .php files from being rewritten.

I'm sorry, but this code is so unorthodox, that I don't know where your problem might lie. According to my eye, there is at least one mod_rewrite-syntax error on every line, and some lines have several. The (invalid) %0 variables seem to be undeclared/undefined. In addition, the filetype exclusions could be reduced to a single RewriteCond.

Because of the unorthodox approach taken, there are several points on which I'm confused, and so my ability to comment is limited. For example, the purpose of the first rule is unascertainable.

Based on the mod_rewrite documentation, the code is a house of cards, and it's no surprise that it has fallen.

An exact equivalent but orthodox coding of the second ruleset would be something like this:


RewriteCond $1 !\.(png¦gif¦jpe?g¦php)
RewriteCond $1 !^(templates¦images¦includes¦uploads¦tools¦backoffice¦config¦setup)/?
RewriteCond $1 !^default\.html$
RewriteCond $1 !awstats\.pl
RewriteRule (.*)$ /index.php?__path=$1 [L]

I reordered the RewriteConds on the basis of exiting the rule as soon as possible for the most-frequent non-rewritten requests; i.e. images first, directories next, and then the rest. I also eliminated a lot of cruft, like ".*$" and other anchoring redundancies.

Replace the broken pipe "¦" characters above with solid pipes before use; posting here modifies the pipe characters.
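
With solid pipes substituted as described, the ruleset would read as follows. The RewriteEngine On line is carried over from the earlier .htaccess listing, and the leading slash on /index.php assumes the script sits in the document root:

```apache
RewriteEngine On

# Skip images and PHP scripts first, since they are the most frequent requests
RewriteCond $1 !\.(png|gif|jpe?g|php)
# Skip the directories that must be served as-is
RewriteCond $1 !^(templates|images|includes|uploads|tools|backoffice|config|setup)/?
# Skip the remaining individual exclusions
RewriteCond $1 !^default\.html$
RewriteCond $1 !awstats\.pl
# Everything else is passed to index.php as the virtual path
RewriteRule (.*)$ /index.php?__path=$1 [L]
```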

For more information, see the documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].

Jim

camsoft

3:27 pm on Mar 27, 2007 (gmt 0)

10+ Year Member



Jim, that's very helpful; my regular expression / mod_rewrite skills are limited.

The rules are a lot cleaner than mine, thanks for that. But I'm still having exactly the same problem: all the RewriteConditions are failing. If I request the folder [test.com...] I should see an Apache directory listing, as images is a real folder, but instead I see the index.php page. That suggests my request for the images folder is being rewritten, which is odd because there is a condition in the .htaccess to exclude it.

I think I might uninstall Apache and install version 2 instead, because this is starting to annoy me now.

camsoft

4:23 pm on Mar 27, 2007 (gmt 0)

10+ Year Member



I managed to sort out the problem by removing Apache 1.3 and installing Apache 2.2; the .htaccess mod_rewrite rules are now working.

Thank you Jim for your help.