mod_rewrite problem for "cruft-free" urls

caught in a circular logic loop


amznVibe

11:34 am on Aug 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I thought I was oh-so-clever to remove the .php from one of my websites by changing this pattern:

[example.com...]
to
[example.com...]

using this in .htaccess:


RewriteRule ^news\/(.*)\/(.*)$ /news/$1.php$2 [T=application/x-httpd-php,NC,L]

(works great)

But now I've noticed that some links on external sites still use the old .php URLs. How do I redirect them to the new pattern without getting caught in a loop?

This fails badly before or after the above line:


RewriteRule ^news\/(.*)\.php(.*)$ http://example.com/news/$1/$2 [NC,L,R]

It loops continuously, and I think I understand why: it's negating the previous line. I thought the "L" should stop that, no?

So how can I break that cycle and still tell bots and browsers that there is a redirect?

RonPK

2:49 pm on Aug 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Add a condition that makes your rule ignore requests ending in .php.

Should be something like this:

RewriteCond %{REQUEST_URI} !\.php$ 
RewriteRule [your original rule here]

Please note that I haven't tested this...

[edited by: RonPK at 3:28 pm (utc) on Aug. 21, 2004]

jdMorgan

3:02 pm on Aug 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The [L] flag only stops rewrite processing for the current HTTP request.

Because your second rule contains (and should contain) an [R] flag, the server issues a 302 redirect response to the browser. The browser gets the new URL from that response and then issues a new HTTP request, which is processed by the rules from the top.

RonPK has addressed a possible cure, but I'd also like to suggest an improvement to your original rule:


RewriteRule ^news/([^/]*)/([^/]*)$ /news/$1.php$2 [T=application/x-httpd-php,NC,L]

It is not necessary to escape the "/" characters, and using "[^/]*" instead of ".*" is less ambiguous and far more efficient. [^/]* means "any number of characters except (i.e. up to the next) slash." I'd also suggest using "R=301" in any rule that you intend to use to tell a search engine to replace an old URL. [R=301] means "moved permanently," whereas [R] or [R=302] means "moved temporarily."

Jim
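
To make the difference between ".*" and "[^/]*" concrete, here is a small illustrative sketch. The request path news/a/b/c is a made-up example, and the two rules are shown side by side only for comparison; they are not meant to be used together.

```apache
# Hypothetical request path (per-directory context): news/a/b/c

# Greedy ".*" also consumes slashes, so this pattern matches with
# $1 = "a/b" and $2 = "c", which is probably not what was intended.
RewriteRule ^news/(.*)/(.*)$ /news/$1.php$2 [L]

# "[^/]*" stops at the next slash, so this pattern does not match
# news/a/b/c at all. It does match news/20040820/ with
# $1 = "20040820" and an empty $2.
RewriteRule ^news/([^/]*)/([^/]*)$ /news/$1.php$2 [L]
```

The second form also gives the regex engine less to backtrack over, which is where the efficiency gain comes from.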

amznVibe

3:26 pm on Aug 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ah, I was waiting for the master to show up :)

However, even with the condition, it's still short-circuiting between the two rules.

Apache simply doesn't know which rule to obey.

The URL comes in as example.com/news/20040820.php,
so the second (new) rule changes it to a redirect for example.com/news/20040820/
(simply so that it shows properly in the visitor's browser or search engine index).
But then it goes through the rules again and hits the first rule, which changes it back to the .php, and the cycle starts over again!

jdMorgan

4:17 pm on Aug 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Right, as long as there is an external redirect involved, this is what happens. You can't have it both ways. :(

Jim

Longhaired Genius

4:31 pm on Aug 21, 2004 (gmt 0)

10+ Year Member



Surely the MultiViews option was made for this situation.

From [httpd.apache.org...]

A MultiViews search is enabled by the MultiViews Option. If the server receives a request for /some/dir/foo and /some/dir/foo does not exist, then the server reads the directory looking for all files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client's requirements, and returns that document.

I've mentioned the MultiViews option a time or two in the past but never got any reaction. I think it would make life easier for many people. It's one of the first things I add to a new site's .htaccess.
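
As a minimal sketch of what that looks like in practice, assuming the server's AllowOverride setting permits Options in .htaccess and that .php files are mapped to a media type that content negotiation recognises (both are assumptions about the server, not something stated in this thread):

```apache
# .htaccess sketch: enable MultiViews for this directory and below.
# Assumes AllowOverride permits "Options" here.
Options +MultiViews
```

With a file such as news/20040820.php on disk, a request for /news/20040820 (no extension) would then be a candidate for negotiation to that file, without any rewrite rule.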

amznVibe

4:41 pm on Aug 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is there no way to set an environment flag to correct the situation on redirect?

I guess I could add something like ?redirect=1 to the url and test for it on the 2nd pass, hmm
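
Another approach that is sometimes used for this kind of loop is to have the redirect rule test %{THE_REQUEST}, which holds the request line exactly as the client sent it and is not changed by internal rewrites. The following is an untested sketch, assuming Apache 2.x (PCRE, for "\s") and a .htaccess file in the document root; the exact patterns are assumptions and would need adjusting to the real URL layout.

```apache
RewriteEngine On

# External redirect: fire only when the client itself asked for a .php URL.
# THE_REQUEST looks like "GET /news/20040820.php HTTP/1.1" and keeps that
# value even after the internal rewrite below, so the rewritten request
# does not trigger this rule again.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/news/[^/\s?]+\.php[\s?]
RewriteRule ^news/([^/]+)\.php$ http://example.com/news/$1/ [R=301,L]

# Internal rewrite: map the extensionless URL to the PHP script.
RewriteRule ^news/([^/]+)/([^/]*)$ /news/$1.php$2 [L]
```

A request for /news/20040820.php is redirected once to /news/20040820/. That request is rewritten internally to the .php file, but its THE_REQUEST value contains no ".php", so the redirect condition fails on the second pass and the cycle stops.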

jdMorgan

4:44 pm on Aug 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've got a bias in the other direction... MultiViews very often causes problems resolving directories (when the trailing slash is omitted), and causes other problems when it's enabled but not needed. Sometimes, it's more efficient CPU-wise to use an ad-hoc rewrite if only a few filetypes need to be rewritten and full content-negotiation isn't required.

But everyone can and should make their own choices, so this is certainly an option here.

Jim