I am trying to (for lack of a better word) "cloak" a file-browser Perl script I wrote. It lives at /cgi-bin/fm.pl (a ScriptAlias directory). I'd love for requests to it to be redirected to /browser/ (a nonexistent location), with part of the query string rewritten into the /browser%1 URL, so that it looks like you are just browsing inside a folder and never see "/cgi-bin/fm.pl?go=/whateverurl/". What I need help with (aside from possibly a whole rewrite of my rules) is the redirection from /cgi-bin/fm.pl to /browser.
Here's the entire relevant section in my .htaccess file in my Document Root.
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^go\=(.*)
RewriteRule ^cgi-bin/fm\.pl$ /browser%1 [R]
RewriteRule ^browser(.*)$ /cgi-bin/fm.pl?go=$1 [L,NE]
Line 4 is the one that is giving me my headache, and I can't for the life of me figure out what the problem could be. It seems that the regex part of the rule never matches anything. I've even tried making it the only active rule and disabling the condition, but still no luck. Here's the weird part: if I take the directory slash out of the pattern (so the URL looks like "/cgi-binfm.pl", which doesn't really exist) and also take it out in the browser, it works exactly as expected.
I read on here that these rules sometimes work only on a per-directory basis. I tested this theory by rewriting a URL in an actual directory, like /web/dynamic/links.shtml, and sending it to /wow/, and that worked perfectly even though it is several directory levels deep into the server. The only thing left I can think of is that it's not working because /cgi-bin/ is a ScriptAlias directory and isn't matched like other real directories. I tried putting a modified .htaccess file with updated URLs in my cgi-bin folder, but no luck there either.
The ultimate goal is to get the following to work.
[servername.com...] -> [servername.com...] [With URL/Browser Redirect]
[servername.com...] -> [servername.com...] [Do not redirect, and "cloak" the CGI's true URL]
This way, OLD urls to the script will be automatically forwarded to the new url, while NEW urls are secretly passed to the script.
Notes:
- My hours of testing haven't been in vain, because I have found lots of things that don't work, and lots of things that come close and partly work, but no cigar.
- I have also learned, from reading the first line of all the responses on this forum, that query strings aren't matched by the RewriteRule.
- I run my own server, RewriteLog is enabled (presently)
- The Perl file *must* stay in the cgi-bin directory.
- The perl script will always have a query string.
- The query string will only consist of 1 name and 1 value. Name->"go", Value->"/some/url/"
Thanks so very much in advance. I really appreciate the help.
So, it appears that the main problem is conceptual -- Simply put, it appears that you are trying to rewrite in "the wrong direction."
This is the correct format for a RewriteRule used to do an internal URL-path rewrite:
RewriteRule ^pattern-matching-requested-URL-path$ /local-server-path-to-real-file-object [L]
So, in this case, the pattern should match "browser/<something>" and the substitution URL should point to your local script (alias) path. ScriptAlias will then detect that local script path, and deliver the request to your actual cgi-bin directory.
Jim
RewriteEngine on
RewriteBase /
#
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cgi-bin/fm\.pl\?go=([^&]+)\ HTTP/
RewriteRule ^cgi-bin/fm\.pl$ http://www.example.com/browser%1? [R=301,L]
#
RewriteRule ^browser((/[^/]+)+)/?$ /cgi-bin/fm.pl?go=$1 [L]
As written, this code also omits the trailing slash on "go=/dynamic/" (the script receives "go=/dynamic"), since there is no need for it and it is ugly.
Jim
I just tested the code you posted, replacing example.com with my domain, and it still didn't catch the redirect to /browser/. It loaded the URL just as if the rule hadn't been there at all. Is my Apache possessed?
Rule 2 (/browser/whatever/ -> /cgi-bin/fm.pl) now also behaves oddly. It seems to work only when some extra "[^/]+/" exists. For example, if I load the URL "/browser/games/Quake/", it sends only "?go=/games/" to the script, apparently because of the trailing + and / in ((/[^/]+)+)/. To actually get the Quake/ sub-directory of games/, I need to add something extra to satisfy the regex, like "/browser/games/Quake/x/".
When I wrote my original rules, I was thinking that since the first conditional rule lacked the L switch, it would simply continue rewriting the URL until it got to an L or reached the end of the file. But I remember reading on here that only one rule is ever applied to a single request. If so, what's the point of the L switch?
If it's not too much trouble, I'd love to understand your solution a little more. For the sake of learning, it looks like the condition text has similar syntax to a rule, except with escaped white-spaces. If the explanation is already written out somewhere, you can just let me know so you don't have to waste time typing it out again.
The rule, as written, requires only one occurrence of </one-or-more-characters-not-a-slash> in the requested URL-path "/browser</one-or-more-characters-not-a-slash><optional-trailing-slash>", as specified by the "+" quantifier on the outside of "(/[^/]+)+" -- It will accept one or more sequences of "</one-or-more-characters-not-a-slash>", but it requires only a minimum of one. By enclosing that in an "outside" set of parentheses, we take any/all of those matched sequences and store them in the variable "$1".
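As a rough illustration of that pattern (not a diagnosis of the problem above), here is the rule again with the captures I would expect for a few request paths. The paths are made-up examples, shown the way a DocumentRoot .htaccess sees them, without the leading slash:

```apache
RewriteRule ^browser((/[^/]+)+)/?$ /cgi-bin/fm.pl?go=$1 [L]
#
# Requested URL-path      Value captured in $1
# browser/games           /games
# browser/games/          /games
# browser/games/Quake/    /games/Quake
# browser/                (no match: at least one "/segment" is required)
```

Note that $2 behaves differently: the inner parentheses are repeated, so $2 ends up holding only the last segment matched (here, /Quake).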
So, there's something else going on with rule #2, and I cannot spot it, so it is likely outside the context of the code we're discussing.
The most likely question about the RewriteCond (I assume you've read the mod_rewrite documentation many times) is the form of the variable THE_REQUEST. This is the full request line sent by the client (e.g. browser), not including any of the other request headers, and looks something like this:
GET /somepage.php?var1=foo&var2=bar HTTP/1.1
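To tie that to the RewriteCond used earlier in this thread, here is a sketch of how the pattern lines up with such a request line. The request shown in the comments is a hypothetical example, not taken from a real log:

```apache
# Hypothetical request line, as held in %{THE_REQUEST}:
#   GET /cgi-bin/fm.pl?go=/games/Quake/ HTTP/1.1
#
#   ^[A-Z]{3,9}\      request method (GET, POST, ...) plus an escaped space
#   /cgi-bin/fm\.pl   the script path exactly as the client requested it
#   \?go=([^&]+)      literal "?go=", then the value, saved in %1
#   \ HTTP/           escaped space, then the start of the protocol version
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cgi-bin/fm\.pl\?go=([^&]+)\ HTTP/
# For the request above, %1 is "/games/Quake/", so the client is redirected to
# /browser/games/Quake/ and the trailing "?" drops the old query string.
RewriteRule ^cgi-bin/fm\.pl$ http://www.example.com/browser%1? [R=301,L]
```

The spaces are escaped because an unescaped space would end the pattern argument.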
[L] stops processing for *this pass* through the mod_rewrite code within the current HTTP request. However, if a rewrite is invoked, then the server will re-process all the mod_rewrite code, and [L] does not stop that. If an external redirect is invoked, then that ends the current HTTP transaction, and the client will (usually) begin another one. So in either case, [L] only ends processing for the current pass through the code. For the sake of efficiency, however, I use [L] on every rule where it is not implicit, unless I have a reason not to use it. Some flags, such as [G] and [F], imply [L] as well, so including it with them is redundant.
Because of the limited-scope function of [L], it is necessary to explicitly prevent rewrite/redirect loops in mod_rewrite code in .htaccess files.
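A minimal sketch of one common way to do that is to key the external redirect on THE_REQUEST, so that it fires only for URLs the client actually asked for. The names old-page.html and new-page are hypothetical, and the code assumes a .htaccess file in the DocumentRoot:

```apache
RewriteEngine On
#
# Redirect only when the client itself requested /old-page.html.
# THE_REQUEST is not changed by internal rewrites, so the rewrite
# in the second rule cannot re-trigger this redirect.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /old-page\.html\ HTTP/
RewriteRule ^old-page\.html$ http://www.example.com/new-page [R=301,L]
#
# Internally serve the "friendly" URL from the real file.
RewriteRule ^new-page$ /old-page.html [L]
```

Without the RewriteCond, the second rule's rewrite to /old-page.html would match the first rule on the next pass, redirect back to /new-page, and repeat indefinitely.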
Jim
I cannot explain why it doesn't work either. That's what's got me so distressed. There are no rewrite rules in my Apache conf either. The only thing I can assume is that it's because of some ScriptAlias directory oversight in the server code (for my version). The .htaccess file living at DocumentRoot is the only .htaccess file in the entire DocumentRoot directory tree, and there are none in any alias directory either.
It appears that literally no rule that refers to anything inside /cgi-bin/ will work as expected so far. I notice nothing in the apache docs CHANGES_2.0 file that addresses Rewrite, except the vulnerability corrected in 2.0.59.
Do you have any other ideas of what I could try?
Again, I really appreciate your wisdom and assistance, Jim. Thanks
If you wish to pursue this further in that light (security), then you can make a cgi directory --called say, cgi-local-- and either copy your scripts into that directory or symlink it to cgi-bin. Then refer to your scripts as if they reside in cgi-local, and mod_rewrite will work on them. Alternately, create a 'virtual' directory, refer to that when accessing scripts, and then rewrite that to cgi-bin. Because this directory doesn't actually exist, it won't be 'aliased-away' before mod_rewrite gets ahold of it. The downside to this approach is that it works only on HTTP accesses; Server-side includes of those scripts will still need to use the 'real' path.
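A sketch of the 'virtual' directory variant, assuming a made-up directory name of "scripts" that does not exist on disk and is not aliased anywhere:

```apache
RewriteEngine On
#
# /scripts/ is hypothetical: no such directory exists, so the request
# reaches the DocumentRoot .htaccess instead of being mapped elsewhere.
# The original query string is passed through unchanged because the
# substitution does not specify one.
RewriteRule ^scripts/(.+)$ /cgi-bin/$1 [L]
```

With this in place, links would point at /scripts/fm.pl?go=... and any further rules would match on the "scripts/" path rather than on "cgi-bin/".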
Jim