I posted a similar plea for help many months ago but didn't receive any responses, so I let the issue go. Now I've got a client who requires that Word and Excel documents MUST be able to be downloaded and/or opened directly from his site.
I've been using a force-download function with happy success for both .pdf and .txt documents: If a user wants to open them directly from the link or download them and THEN open them, no problem... looks and works great.
Then there are Word and Excel documents. God. Whether I open them directly from the link, or download them and then open them, they're corrupted: most (and sometimes all) of the text in either of these file types is replaced by a mad smattering of weird code characters.
Now that I MUST address/fix this issue, I've searched this site, as well as the web at large, for an answer to this problem with (unfortunately) no success. I hope someone here can see what's going wrong.
**********
Unlike many force-download scripts, I prepare all data to be sent to the function first - File Name, Full Path, Extension, and File Size. When these vars are filled, they're passed to my forceDownload function.
As a real-world example of what is being fed to this function (for a Word Document) here is the output of the variables being passed to the function:
$fileName contains: test_document.doc
$filePath contains: _assets/download/s.006/p.002/test_document.doc
$fileSize contains: 24576
$fileExt contains: doc
$ctype contains: application/msword
**********
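For anyone following along, here is a minimal sketch of how those four values might be gathered before calling the function. The helper name describeDownload() is hypothetical and not part of the original script; it assumes the path points at a readable file on disk.

```php
<?php
/**
 * Collect the values that forceDownload() expects.
 * describeDownload() is a hypothetical helper name, not from the original post.
 */
function describeDownload(string $filePath): array
{
    if (!is_file($filePath) || !is_readable($filePath)) {
        throw new InvalidArgumentException("Cannot read: $filePath");
    }

    // Drop PHP's cached stat data so filesize() reflects the file as it is now.
    clearstatcache(true, $filePath);

    return [
        'fileName' => basename($filePath),
        'filePath' => $filePath,
        'fileSize' => filesize($filePath),
        // Lower-case the extension so "DOC" and "doc" hit the same switch case.
        'fileExt'  => strtolower(pathinfo($filePath, PATHINFO_EXTENSION)),
    ];
}
```

Reading the size from disk at request time keeps Content-Length in step with the file that is actually sent.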
The function I'm using is shown here:
function forceDownload($fileName, $filePath, $fileSize, $fileExt)
{
switch($fileExt)
{
case "pdf": $ctype="application/pdf"; break;
case "exe": $ctype="application/octet-stream"; break;
case "zip": $ctype="application/zip"; break;
case "doc": $ctype="application/msword"; break;
case "xls": $ctype="application/vnd.ms-excel"; break;
case "ppt": $ctype="application/vnd.ms-powerpoint"; break;
case "gif": $ctype="image/gif"; break;
case "png": $ctype="image/png"; break;
case "jpe": case "jpeg":
case "jpg": $ctype="image/jpeg"; break;
case "txt": $ctype="text/plain"; break;
case "mp3": $ctype="audio/mpeg"; break;
case "wav": $ctype="audio/x-wav"; break;
case "mpg": case "mpeg":
case "mpe": $ctype="video/mpeg"; break;
case "mov": $ctype="video/quicktime"; break;
case "avi": $ctype="video/x-msvideo"; break;
default: $ctype="application/force-download";
}
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: private",false);
header("Content-Type: $ctype");
header('Content-Disposition: attachment; filename="'.$fileName.'"');
header("Content-Transfer-Encoding: binary");
header("Content-Length: ".$fileSize);
readfile("$filePath");
exit();
}
As mentioned above, .pdf and .txt formats are downloaded and opened without problem. Word and Excel formats, however, can be downloaded and opened, but they just contain junk characters (and sometimes a smattering of the original content).
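One way to narrow down this kind of corruption is to check whether anything has already been output before the download headers go out. The sketch below is only a diagnostic aid; describeStrayOutput() is a hypothetical helper name, and it relies on PHP's built-in headers_sent() and ob_get_length().

```php
<?php
/**
 * Report anything that could end up in front of the file bytes.
 * Hypothetical diagnostic helper, not part of the original script.
 */
function describeStrayOutput(): array
{
    $file = '';
    $line = 0;
    // headers_sent() fills $file and $line with the place where output first began.
    $sent = headers_sent($file, $line);

    // ob_get_length() returns false when no output buffer is active.
    $buffered = ob_get_length();

    return [
        'headers_sent'   => $sent,
        'output_started' => $sent ? "$file:$line" : null,
        'buffered_bytes' => $buffered === false ? 0 : $buffered,
    ];
}
```

Logging this array (with error_log(), say) just before the first header() call would show whether whitespace from an included file is already waiting in a buffer or has already been sent.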
Can someone please help me understand and fix this issue? Please? (Yes, I am begging).
Great appreciation to all in advance.
Neophyte
Thanks (as always) for your response.
Linked and tested both .doc and .xls directly - all went fine, so the files themselves weren't the problem.
The problem, I just found, is that these two file types don't behave well when headers (or anything else... maybe spaces, newlines, etc.) have already been sent. I don't know or understand why this doesn't affect other file types.
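A possible explanation, offered as a sketch rather than a confirmed diagnosis: legacy .doc and .xls files are OLE2 compound files that must begin with a fixed 8-byte signature, so even one stray newline in front shifts everything and the signature no longer sits at offset 0. A text file merely gains a blank line, and PDF readers are often more tolerant of leading bytes. The names below (OLE2_MAGIC, startsWithOle2Magic) are made up for illustration, and the sample data is a stand-in, not a real document.

```php
<?php
/** Signature that OLE2 compound files (legacy .doc/.xls/.ppt) start with. */
const OLE2_MAGIC = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

/** True when the byte string begins with the OLE2 signature. */
function startsWithOle2Magic(string $bytes): bool
{
    return strncmp($bytes, OLE2_MAGIC, strlen(OLE2_MAGIC)) === 0;
}

// Stand-in for the first 512-byte sector of a real document.
$clean = OLE2_MAGIC . str_repeat("\0", 504);

// The same bytes with one newline of stray page output in front.
$polluted = "\n" . $clean;

var_dump(startsWithOle2Magic($clean));
var_dump(startsWithOle2Magic($polluted));
```

If Content-Length is still set to the on-disk size, the extra leading byte also means the last byte of the real file may be cut off by the client.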
So, I've solved the problem with a snippet from another script I found that will clean the buffers before the additional headers from my force-download script are sent.
After implementing this "fix", every document type I have can either be opened directly from the link, or downloaded and then opened, without corruption.
Honestly, I don't know if this is the "right" way of doing it, but it works.
Working code posted below for anyone else's information/use:
*********************
function ob_clean_all ()
{
// Discard every active output buffer so no stray bytes precede the file data.
while(ob_get_level() > 0)
{
// Stop if a buffer cannot be removed, otherwise this would loop forever.
if(!ob_end_clean())
break;
}
return FALSE;
}
function forceDownload($fileName, $filePath, $fileSize, $fileExt)
{
//Required for IE, otherwise Content-disposition is ignored
if(ini_get('zlib.output_compression'))
ini_set('zlib.output_compression', 'Off');
switch($fileExt)
{
case "pdf": $ctype="application/pdf"; break;
case "exe": $ctype="application/octet-stream"; break;
case "zip": $ctype="application/zip"; break;
case "doc": $ctype="application/msword"; break;
case "xls": $ctype="application/vnd.ms-excel"; break;
case "ppt": $ctype="application/vnd.ms-powerpoint"; break;
case "gif": $ctype="image/gif"; break;
case "png": $ctype="image/png"; break;
case "jpe": case "jpeg":
case "jpg": $ctype="image/jpeg"; break;
case "txt": $ctype="text/plain"; break;
case "mp3": $ctype="audio/mpeg"; break;
case "wav": $ctype="audio/x-wav"; break;
case "mpg": case "mpeg":
case "mpe": $ctype="video/mpeg"; break;
case "mov": $ctype="video/quicktime"; break;
case "avi": $ctype="video/x-msvideo"; break;
default: $ctype="application/force-download";
}
ob_clean_all();
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: private",false);
header("Content-Type: $ctype");
header('Content-Disposition: attachment; filename="' . $fileName . '"');
header("Content-Transfer-Encoding: binary");
header("Content-Length: ". $fileSize);
readfile("$filePath");
exit();
}
*********************
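A related detail: file names containing spaces, quotes, or line breaks can trip up the Content-Disposition header. The following is a minimal sketch of a helper that builds the header line defensively. contentDispositionHeader() is a hypothetical name, and the sketch only handles plain ASCII names, not the extended filename* syntax.

```php
<?php
/**
 * Build a Content-Disposition header line with a quoted file name.
 * Hypothetical helper; assumes a plain ASCII file name.
 */
function contentDispositionHeader(string $fileName): string
{
    // Keep only the base name and drop characters that would break the
    // quoted string or allow a second header line to be injected.
    $safe = str_replace(['"', '\\', "\r", "\n"], '', basename($fileName));

    return 'Content-Disposition: attachment; filename="' . $safe . '"';
}
```

It could be used as header(contentDispositionHeader($fileName)); in place of the hand-built string.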
Neophyte
I think I finally figured out the problem. I am using the same "smart file download" script that prevents people from direct downloads. I had the same problem - PDF downloads were fine, however the MS Word documents seemed to be corrupted.
Well, the problem was caused by an extra blank line between PHP code snippets. Like this:
<?php
...Some lines of code to set up default variables...
?>

<?php
... file download code...
?>
The solution was simple... I removed the extra line, like this:
<?php
...Some lines of code to set up default variables...
?>
<?php
... file download code...
?>
Now it's obvious what was happening: that extra blank line was treated as output to the page, and that corrupted the Word document. So it makes sense that your code to flush the output buffers also solved the problem.
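The effect can be reproduced in isolation. The sketch below writes two tiny scripts to temp files and captures what each one outputs: PHP swallows a single newline directly after a closing tag, but a blank line leaves one more newline that goes out as page output. capturedOutput() is a hypothetical helper for this demonstration only.

```php
<?php
/**
 * Run a PHP source string from a temp file and return whatever it outputs.
 * Hypothetical helper for this demonstration.
 */
function capturedOutput(string $source): string
{
    $path = tempnam(sys_get_temp_dir(), 'gap');
    file_put_contents($path, $source);

    ob_start();
    include $path;
    $output = ob_get_clean();

    unlink($path);
    return $output;
}

$first  = '<?php $a = 1; ?>';
$second = '<?php $b = 2; ?>';

// A single newline right after the closing tag is swallowed by PHP.
$tight = capturedOutput($first . "\n" . $second);

// A blank line leaves a second newline, which becomes output.
$loose = capturedOutput($first . "\n\n" . $second);

var_dump(strlen($tight), strlen($loose)); // int(0) and int(1)
```

That one leftover byte is enough to land in front of the file data unless the buffers are cleaned first.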
Wow, thanks for that - I never considered the extra lines. I'll alter my code without the "flush" and see if I get the same desired results. If not, it's something else I'm doing wrong and will have to put the flush back in until I can sort everything out.
Thanks for the input Tim!
Neophyte