Forum Moderators: phranque
I'm trying to use a RewriteRule so that all requests for a file with certain extensions (.zip, .exe, .avi, etc.) go first to a script (a PHP script in this case), which then grants access to the file.
The purpose of the script is access control, statistics, etc.
The problem is that when the script redirects to the requested file, it goes into an endless loop.
I'm using this rule:
RewriteRule \.(avi|exe|pdf|zip)$ /scripts/script.php [NS,L]
I tried IS_SUBREQ [httpd.apache.org], but without success:
RewriteCond %{IS_SUBREQ} false
RewriteRule \.(avi|exe|pdf|zip)$ /scripts/script.php [NS,L]
This is the script redirect:
<?php
header ("Location: http://".$_SERVER["HTTP_HOST"].$_SERVER["REDIRECT_URL"]);
?>
So, given the current situation, there is no way to differentiate a user-invoked request from a scripted-redirect-invoked request. Subrequests don't even enter into the equation... The original request and the scripted redirect are two entirely separate and, as far as Apache is concerned, completely unrelated HTTP requests. HTTP is a "stateless" protocol: each request for a page, or for an image, script, or CSS stylesheet on that page, exists on its own, and Apache does not have any built-in way to relate them.
Another problem is that this approach gives the 'private URL' to the client by way of the redirect. So, any time they refresh their screen or click on a bookmark saved after the redirect, they bypass your script entirely.
A better approach might be to move all this protected content into a directory that is not accessible from the Web. That is to say, a place that is accessible as part of the server's filesystem, but cannot be reached via HTTP. You can move these files to a directory 'above Web root' or simply place them in a subdirectory that is password-protected or made HTTP-inaccessible using access-control directives in .htaccess.
Then modify the script so that instead of doing an HTTP redirect, it instead takes the requested HTTP URL, changes it into a filename in the protected filespace, and then opens the requested file, reads it, and outputs the file data to the requesting client -- this latter step being no more than a 'print' statement in PHP.
But another step is required, since all we've done so far is to make the HTTP-inaccessible filespace accessible via HTTP again. You need some way of checking that the user is authorized to access your content. I'd suggest making your script check for a cookie before opening the file in the protected space. Then modify the pages on your site that serve to 'authorize' the visitor so that they set the required cookie. An alternative is to use server-side user sessions to do the same thing.
So our example in step-wise form looks like this:
1. Visitor views 'authorization' page (maybe just your home page, or any page that includes protected content).
2. Page contains JavaScript or triggers server-side code to set a short-term session cookie on the visitor's machine.
3. Visitor requests .avi file.
4. Script checks for presence and validity of cookie, and if correct:
- Opens .avi file in protected area,
- Reads data from .avi file,
- Prints data,
- Exits.
5. Else, if the cookie is missing or incorrect, script prints an error message and exits.
There are many ways to do this, and the above is just an example. The key is to distinguish the difference between external HTTP URL requests and internal server filesystem requests.
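The steps above can be sketched in PHP roughly as follows. This is a minimal sketch, not a drop-in script: the protected directory /var/protected-downloads, the cookie name dl_auth, and all three function names are assumptions made for illustration, and the token comparison stands in for whatever cookie or session validation your site really uses.

```php
<?php
// Hypothetical directory outside the web root (assumption for this sketch).
define('PROTECTED_ROOT', '/var/protected-downloads');

// Turn the requested URL path into a filename under $root.
// Returns null if the path is empty or tries to climb out of $root.
function map_url_to_file($requestUri, $root)
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    $segments = array();
    foreach (explode('/', rawurldecode($path)) as $segment) {
        if ($segment === '') {
            continue; // empty parts from the leading "/" or from "//"
        }
        if ($segment === '.' || $segment === '..' || strpbrk($segment, "\\\0") !== false) {
            return null; // refuse directory traversal, backslashes and NUL bytes
        }
        $segments[] = $segment;
    }
    if (count($segments) === 0) {
        return null;
    }
    return rtrim($root, '/') . '/' . implode('/', $segments);
}

// Cookie check: $validToken is whatever the 'authorization' page stored
// server-side (for example in the session) when it set the cookie.
function is_authorized(array $cookies, $validToken)
{
    return is_string($validToken) && $validToken !== ''
        && isset($cookies['dl_auth']) && is_string($cookies['dl_auth'])
        && hash_equals($validToken, $cookies['dl_auth']);
}

// Check the cookie, then pipe the file to the client, or print an error.
function serve_protected_file($requestUri, array $cookies, $validToken)
{
    $file = map_url_to_file($requestUri, PROTECTED_ROOT);
    if (!is_authorized($cookies, $validToken) || $file === null || !is_file($file)) {
        header('HTTP/1.1 403 Forbidden');
        print('Access denied.');
        return false;
    }
    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . filesize($file));
    header('Content-Disposition: attachment; filename="' . basename($file) . '"');
    readfile($file); // streams the file contents to the client
    return true;
}
```

The script that the RewriteRule points at would then call something like serve_protected_file($_SERVER['REQUEST_URI'], $_COOKIE, $tokenFromSession), where $tokenFromSession is again an assumed name for the value your authorization page saved.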
Jim
Here I'll explain the entire situation, and you'll see why I ended up using this solution.
The content of the website (HTML and PHP files) is stored in one location, and all downloadable files (pdf, swf, etc.) are stored in another location.
At first, all the protected files were password-protected using access-control directives in .htaccess. Because the users are stored in a MySQL DB, I'm using mod_auth_mysql.
All was working fine until I wanted to protect the index page where the protected files are listed. Because this file is in another location and, besides that, the entire website is processed through the main index.php file, I can't use another .htaccess where the listing page is located.
For example, here is the page where the files are listed: www.mysite.com/products/index.php. In the backend, this file is included in www.mysite.com/index.php.
The solution that I found was to enable authentication using PHP (using sessions).
With this solution, I have to get rid of the .htaccess file in the protected folder, because I don't want the user to authenticate twice.
To enable auth with PHP, I have 2 options:
1. Keep the .htaccess file in the protected folder and emulate .htaccess auth with PHP using header('WWW-Authenticate: '). This way, a user who logged in when viewing the file listing will not need to authenticate a second time (if I use the same realm in the PHP auth and the .htaccess file);
2. Remove the .htaccess file, write a script, and execute this script before any protected file is downloaded. In this script I can verify whether the user is logged in.
The 1st option should be the best for me, but I want to create statistics for downloaded files. Also, in that MySQL table some users are blocked, so I don't want to grant them access for downloading. I can block them with PHP auth, but I can't in the .htaccess file (I could use conditions in the .htaccess file with mod_auth_mysql 3.0, but it's buggy). So it seems that I need that script.
I tried opening a downloadable file using PHP and reading it, but I have this problem: the file starts to download, but if I then want to use the browser window to navigate away from that page, I can't. Until the file has finished downloading, the browser window is stuck.
I tried this on 3 different environments:
W2k3 - Apache/2.0.55 (Win32) PHP/5.0.5
Win2000Server - Apache/2.0.54 (Win32) PHP/5.0.4
Linux Fedora - Apache/2.0.52, PHP 4.3.10
The problem is the same on all three.
I also tested it on a Linux environment with Apache 1.3.31 and PHP 4.3.8, and there it works!
So I suppose there is a problem with Apache 2.x. I think it has to do with Apache 2.x's multithreaded mode.
That's the reason I ended up using header('Location ...');
The files are pretty big (200-300MB), so I don't think reading them with PHP is such a good idea.
The best solution seems to be the redirect. Isn't there any way to run a script before a file is downloaded? (This should be transparent to the user, and I can't change all the links on the website.)
Here is the PHP script with the dw function, in case you need it:
<?php
$file = 'full/path/to/file.exe';

// Streams $file to the client as a download.
// Returns true if the connection stayed open until the end.
function dwfile($file)
{
    if (isset($_SERVER['HTTP_USER_AGENT']) && preg_match('/MSIE/', $_SERVER['HTTP_USER_AGENT']))
    {
        // IE bug in download name workaround
        ini_set('zlib.output_compression', 'Off');
    }
    $size = filesize($file);
    $last = $size - 1;
    $start = 0;
    header('Content-Type: ' . mime_content_type($file));
    header('Content-Disposition: attachment; filename="' . basename($file) . '"');
    header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 2 * 3600) . ' GMT');
    header('Accept-Ranges: bytes');
    header('Cache-Control: no-cache, must-revalidate');
    header('Pragma: private');
    // Resume support: only the open-ended form "bytes=N-" is handled
    if (isset($_SERVER['HTTP_RANGE'])
        && preg_match('/^bytes=(\d+)-$/', $_SERVER['HTTP_RANGE'], $m)
        && (int)$m[1] < $size)
    {
        // if yes, download missing part
        $start = (int)$m[1];
        header('HTTP/1.1 206 Partial Content');
        header("Content-Range: bytes $start-$last/$size");
    }
    header('Content-Length: ' . ($size - $start));
    $status = false;
    if ($fp = fopen($file, 'rb'))
    {
        fseek($fp, $start); // skip the part the client already has
        while (!feof($fp) && connection_status() == 0)
        {
            print(fread($fp, 1024 * 8));
            flush();
        }
        $status = (connection_status() == 0);
        fclose($fp);
    }
    return $status;
}
dwfile($file);
?>
Thank You for your time!
Razvan
Add something you can use to distinguish the redirected URL, and which doesn't harm the operation.
I'd suggest adding a query string.
<?php
header ("Location: http://".$_SERVER["HTTP_HOST"].$_SERVER["REDIRECT_URL"]."?abcde");
?>
Then you can skip the RewriteRule using that.
RewriteCond %{QUERY_STRING} !^abcde
RewriteRule \.(avi|exe|pdf|zip)$ /scripts/script.php [NS,L]
But you should understand that people can see how it works, access these items with the query string, and bypass your PHP script.
There are different ways to do this sort of thing, too.
The script just serves to 'pipe' the content to authorized users only.
Jim
GET /folder/index.php?foo=bar&quux=foo HTTP/1.1
Jim
Jim, I guess it's the same thing whether the files are HTTP-accessible or not. I still have that problem with the script.
If I had all the files in a DB, and all the links were like "download?id=xx", it would be very simple. But I don't :(
I don't know if this would work:
RewriteCond %{SCRIPT_FILENAME} !^/scripts/script\.php$
RewriteRule \.(avi|exe|pdf|zip)$ /scripts/script.php [NS,L]
<?php
header ("Location: http://".$_SERVER["HTTP_HOST"].$_SERVER["REDIRECT_URL"]);
?>
I need to find a variable [httpd.apache.org] that changes its value after the header("Location...") call, and make a RewriteCond with it.
I really need to find a solution with mod_rewrite instead of a PHP script.
Thanks all,
Razvan
p.s. Jim, I can't see how %{THE_REQUEST} or %{REQUEST_URI} [The resource requested in the HTTP request line] could help me. Can you please explain? Using header("Location...") I'm not making a rewritten request, right?
As HTTP doesn't have the notion of 'session' itself,
you have to use a cookie or a script or something.
Anything that comes in the request headers can be used by RewriteRule.
But only QUERY_STRING and Cookie are the ones usually used.
If you use HTTPS, then it's a different story.
You can use the SSL session ID to distinguish the request.
Other than that, we can use the IP address.
Although this is better than a cookie, IMO, it may still fail with
users coming from a revolving proxy like AOL.
And as you may know already, serving the download with a script isn't a good option if the file is big or the site is very busy.
In short, it's not always easy to control downloads on SHARED hosting.
On a VPS or dedicated machine, you have lots of options.
If you just want to control the bandwidth, you can monitor downloads with a cronjob and restrict access when things get hot.
In this case you only need the raw log and crontab.
You don't need to use the initial script (unless you still want to do something with it).
I have a script I can copy&paste, if you want.
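For what it's worth, the cronjob-plus-raw-log idea could look something like the sketch below (this is not the script offered above, just an illustration). It assumes the access log is in Apache's common or combined log format; the function name and the default extension list are made up for the example.

```php
<?php
// Sum the bytes sent per requested path, counting only successful
// (200) and partial (206) responses for the given file extensions.
// Assumes Apache common/combined log format lines (assumption of this sketch).
function sum_download_bytes(array $logLines, array $extensions = array('avi', 'exe', 'pdf', 'zip'))
{
    $ext = implode('|', array_map('preg_quote', $extensions));
    // Matches: "GET /path/file.ext[?query] HTTP/1.x" <status> <bytes>
    $pattern = '~"(?:GET|POST) ([^\s?]+\.(?:' . $ext . '))(?:\?\S*)? HTTP/[\d.]+" (\d{3}) (\d+)~i';
    $totals = array();
    foreach ($logLines as $line) {
        if (!preg_match($pattern, $line, $m)) {
            continue; // not a download request, or no byte count logged
        }
        $status = (int)$m[2];
        if ($status !== 200 && $status !== 206) {
            continue;
        }
        $path = $m[1];
        $totals[$path] = (isset($totals[$path]) ? $totals[$path] : 0) + (int)$m[3];
    }
    return $totals;
}
```

A cronjob could feed it the lines of the raw log (for instance via file() on the log path) and compare each total against whatever limit you choose before restricting access.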
If you want to have precise control over to whom you allow the download,
you have to use password protection, or session control of some sort.
Example1.
You use something similar to what you have, and in the initial script, you create a file using the IP as the name.
Then redirect the user to a URL in the download directory.
In the .htaccess of the download directory, you put something like this:
RewriteCond /path-to/ip-dir/%{REMOTE_ADDR} !-f
RewriteRule ^ - [F]
And you can remove the IP file using a cronjob.
The frequency of the cronjob and how you delete the file
(often using find with an mtime check) will determine the session life.
If you prefer, you can add any other part of the request header to the name of the control file, to prevent PCs behind the same IP from cheating.
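A rough PHP sketch of the two moving parts in Example1, the marker file created by the initial script and the cleanup done from cron, might look like this. The function names are invented for the illustration, and the cleanup is written in PHP only to keep everything in one language; find with an mtime check does the same job.

```php
<?php
// Build the marker filename for a client IP inside $dir.
// Returns null if $ip is not a valid IP address.
function ip_marker_path($dir, $ip)
{
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        return null;
    }
    return rtrim($dir, '/') . '/' . $ip;
}

// Called by the initial script: create (or refresh) the marker for this client.
function grant_access($dir, $ip)
{
    $path = ip_marker_path($dir, $ip);
    return $path !== null && touch($path);
}

// Called from cron: delete markers older than $maxAge seconds.
// Returns the number of files removed.
function purge_markers($dir, $maxAge, $now = null)
{
    $now = ($now === null) ? time() : $now;
    $removed = 0;
    $paths = glob(rtrim($dir, '/') . '/*');
    foreach (($paths ? $paths : array()) as $path) {
        if (is_file($path) && $now - filemtime($path) > $maxAge && unlink($path)) {
            $removed++;
        }
    }
    return $removed;
}
```

The initial script would call grant_access('/path-to/ip-dir', $_SERVER['REMOTE_ADDR']) just before its redirect, so that the RewriteCond above finds the file; the $maxAge passed to purge_markers() is then the session life.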
Example2.
When someone wants to download a file, you issue a one-time username+password, and add it to Apache's password file.
Then show the link for the download location (protected with the password).
This is relatively easy and simple.
Remove the username:password from the password file with a watchdog cronjob that checks for the completed download by that username.
(The raw log normally records the REMOTE_USER.)
You can combine this with Example1 to increase the security and control.
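Issuing the one-time login in Example2 could be sketched like this. Two assumptions to flag: it needs PHP 7 or later for random_bytes(), and it writes bcrypt hashes, which as far as I know only Apache 2.4 and later accept in password files, so older servers would need a different hash format. The function names and the dl_ prefix are invented for the example.

```php
<?php
// Generate a random one-time username and password.
function make_one_time_credentials()
{
    return array(
        'user'     => 'dl_' . bin2hex(random_bytes(4)),
        'password' => bin2hex(random_bytes(8)),
    );
}

// Format one "user:hash" line for the password file, using a bcrypt hash.
function htpasswd_line($user, $password)
{
    return $user . ':' . password_hash($password, PASSWORD_BCRYPT);
}

// Append the entry to the password file. LOCK_EX keeps two
// simultaneous requests from interleaving their writes.
function add_htpasswd_user($passwordFile, $user, $password)
{
    $line = htpasswd_line($user, $password) . "\n";
    return file_put_contents($passwordFile, $line, FILE_APPEND | LOCK_EX) !== false;
}
```

The page that hands out the download link would call make_one_time_credentials(), pass the result to add_htpasswd_user() with the path of your password file, and show the plain username and password to the visitor once.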
Example3.
Use HTTPS. It's not good for huge files, but very easy and simple for tight control.
It's similar to Example1, but uses SSL_SESSION_ID instead of the IP.
You can use SSL-related directives such as SSLRequire as a bonus.
SSLRequire is quite powerful, and it can read files.
Of course, you can combine this with other methods.
Example4.
Use a download script, but not in heavy, unreliable PHP.
A small shell script has less chance of failing with larger files, and uses far fewer resources.
On the server I use, PHP uses 9MB of memory just to start up.
A shell script uses only 340KB.
You can imagine the impact when there are many downloads of a big file going on.
I have a sample download script if you want to go in this direction.
Well, I guess there are many other methods yet. :)
It depends on the type of downloads (file size, frequency, server setup), and the type of control you want to exercise.