Why does this 404?

... with anything but R=301

         

rocknbil

11:26 pm on Jan 24, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




RewriteCond %{QUERY_STRING} ^redir\=.+$
RewriteRule . /cgi-bin/redirect-handler.cgi?r=%{QUERY_STRING} [L]

/cgi-bin/redirect-handler.cgi is present, works fine by direct URL.

If I add R=301, it's fine, but the URL changes, of course. As soon as I remove R=301, I get a 404. If I set it to R=200 or any other status, I get an error status along with an internal configuration error. I have no access to httpd.conf; it's a semi-standard Linux box as far as I can tell.

g1smd

12:33 am on Jan 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What is the test URL that causes the error?

I assume the result of calling
example.com/foo?redir=bar
will be
/cgi-bin/redirect-handler.cgi?r=redir=bar


The two = signs will likely be the problem.

jdMorgan

1:47 am on Jan 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It should result in an infinite loop, because the output path matches the rewriterule pattern and the rewritecond... Perhaps you meant:

RewriteCond %{QUERY_STRING} ^redir=(.+)$
RewriteRule . /cgi-bin/redirect-handler.cgi?r=%1 [L]

Here, /<anything>?redir=foo will be rewritten to /cgi-bin/redirect-handler.cgi?r=foo

Note that the <anything> path info (if present in the requested URL-path) is dropped (lost) in this example.

If additional query parameters may be present and must be preserved, then the code gets a little complicated. If that's the case, then let us know.

Jim
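
For the case mentioned above, where other query parameters may accompany redir and need to survive the rewrite, a minimal sketch might look like the following. It assumes Apache 2.x, a .htaccess file in the document root, and a handler that tolerates seeing the original redir parameter again, since [QSA] appends the whole original query string. The REQUEST_URI condition is an added guard and is not part of the rules posted in this thread.

```apache
# Sketch only: assumes this lives in the document-root .htaccess.
RewriteEngine On

# Guard (an assumption, not from the thread): never re-rewrite the handler itself.
RewriteCond %{REQUEST_URI} !^/cgi-bin/
# Match redir=<value> anywhere in the query string; %2 holds the value.
RewriteCond %{QUERY_STRING} (^|&)redir=([^&]+)
# QSA appends the original query string after r=..., so other parameters survive.
RewriteRule . /cgi-bin/redirect-handler.cgi?r=%2 [QSA,L]
```

Under those assumptions, /foo?redir=bar&x=1 would be rewritten internally to /cgi-bin/redirect-handler.cgi?r=bar&redir=bar&x=1. The guard matters here because the appended query string still contains redir=, so without it the conditions could match again on a second pass.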

rocknbil

6:44 pm on Jan 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I assume the result of calling example.com/foo?redir=bar will be /cgi-bin/redirect-handler.cgi?r=redir=bar


Correct, but I don't see how the two equal signs are the problem (though it is "clunky," see below . . .)

It should result in an infinite loop, because the output path matches the rewriterule pattern and the rewritecond


It didn't, and I don't see how it would: ^redir and ^r are not the same (?). However, as always, your solution is much more efficient (though I think I did try that in one permutation), but the result is the same: without

[R=301,L]

I get a Not Found for /cgi-bin/redirect-handler.cgi. Which is weird: it's obviously there and functional, and I can request it with test URLs like the one g1smd demonstrated.

If it's helpful: this is redirecting URLs on an acquired site (server A) to the equivalent URLs on server B. It's been determined that the link juice is irrelevant, so the aim is to make it easy for the users, with an interim message explaining why they've landed here (which is the only reason for the query string: to distinguish these from regular requests).

Server A 301-redirects server-a/file.html to server-b/file.html?redir=file.html (I could just make this some trivial unique string, but having the file name in the query string is handy). redirect-handler.cgi parses out the query string (which is why the two ='s were of no real consequence).

The goal here: redirect-handler.cgi captures only these requests, fetches the requested file with curl, and outputs it with a couple of lines of JavaScript to lightbox the interim message, leaving all of the site files untouched (editing them is difficult anyway, since they are all in WordPress).

I know I'm doing this the hard way; I recommended an innocent cloak as described here [webmasterworld.com], but the powers that be are too worried about it. :-/

rocknbil

6:41 pm on Jan 27, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We've implemented this, everyone's OK with it except me. :-) Any idea what might cause the error without a 301? Without a 301 the URL might "almost make sense."

jdMorgan

9:35 pm on Jan 31, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A 301 ends the current HTTP transaction and outputs a URL. That's important, because in doing so, Apache "cleans up" any filepath weirdnesses that may have accumulated due to multiple internal rewrites. These usually take the form of multiple instances of some of the filepath-parts. This is a known bug which was supposed to be fixed with Apache 2, but my testing has shown that the bug remains.

Is there any useful info in your server error log? Anything that shows that the rewritten filepath is wrong or that it contains "repeats" of part of the path info?

If so, you can craft a rule to work around that, or better, find the other rule(s) that are also rewriting this URL before it gets to this particular rule.

Another thing that could be a problem is that your cgi-bin path may be Aliased in the server config, and if so, then the Alias will kick in before the rule can run. So this rule would have to be moved to the cgi-bin path in order to execute.

Jim
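
One way to test the "other rules" theory from .htaccess alone is to place a pass-through rule ahead of everything else, so that once a request targets the handler no later rule can touch it. This is only a sketch under assumptions: that the rules live in the document-root .htaccess, that /cgi-bin/ requests are processed in that same directory context rather than through a server-level Alias (in which case this file may never be consulted for them), and that a WordPress rewrite block follows these rules.

```apache
RewriteEngine On

# Pass-through: leave any /cgi-bin/ request untouched and stop this pass.
# In .htaccess the pattern is matched without the leading slash.
RewriteRule ^cgi-bin/ - [L]

# The rule suggested earlier in the thread, unchanged.
RewriteCond %{QUERY_STRING} ^redir=(.+)$
RewriteRule . /cgi-bin/redirect-handler.cgi?r=%1 [L]

# ... WordPress rewrite block (assumed) follows here ...
```

If the 404 persists with the pass-through in place, that would suggest the cause is not a later rule in this file, which leaves the Alias explanation as the more likely candidate.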