Browsers are unable to download large files from my web server

Are my server's file mime-type settings to blame?

         

SumGuy

1:19 am on Mar 24, 2025 (gmt 0)

5+ Year Member Top Contributors Of The Month



There are times when a client needs to download a large file from my web server. The file is a zip file, about 100 mb in size.

When I look at the web logs, I see the request for this file arrive in segments of 6 to 7 mb each: the first is logged with a 200 code and the rest with 206. The end result is that maybe half the file is eventually downloaded this way (if I add up all the bytes sent). The clients are usually using Chrome. I *think* I've seen better luck with Firefox, but I can't be sure.

In the logs, for the first or initial request for the file, I see this:

text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng

I've looked at my server's mime list, I don't exactly see a mime-type for "zip" files, but I did just create one today (application/zip), not sure if it will make a difference. When it comes to this 206 code, I usually see it when clients request PDF files, I'm guessing that as they view a pdf file in their browser, the browser will request more chunks of the file as the user reads into the document. But I don't know why the browser is downloading ZIP files in chunks, unless again it was my (missing?) mime-type specifically for zip files.

Any ideas or insight here?

not2easy

1:58 am on Mar 24, 2025 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Are you using mod_deflate with file types to offer zip for pdf files?

SumGuy

4:27 am on Mar 24, 2025 (gmt 0)

5+ Year Member Top Contributors Of The Month



I forget what this is called - encoding? In this instance the encoding is this: gzip, deflate, br, zstd (that's pretty common, I see that a lot). But in this case I see it just with the first request or "chunk". The remaining chunks (that generate code 206) I see "identity". Where does that come from? The browser I think yes?

And BTW, when the bots (google, bing, etc) grab my PDF files (which aren't huge, maybe a few mb and almost certainly under 10 mb) they do it in one shot.

lucy24

4:44 pm on Mar 24, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The 206 code is appropriate for this situation, since it means “partial content”. Frankly I wouldn’t want my server disgorging 100MB all in one go; some things should be broken into bite-size pieces. 10MB counts as bite-size; 100MB doesn’t.
https://www.rfc-editor.org/rfc/rfc9110.html#name-206-partial-content

It’s a longish article, but worth reading through. (If the fragment gets lost, it’s section 15.3.7. If you’re reading on a full-size device, there’s a table of contents along the right side.)

The .zip extension reflects what the user ends up with; it’s separate from any further stuffing that your server does before sending out a file--any file, not just 100MB monsters. On the same page, see also section 12.5.3. Accept-Encoding

SumGuy

11:35 pm on Mar 24, 2025 (gmt 0)

5+ Year Member Top Contributors Of The Month



The mime-type is sent in the response header from the server to the browser (it doesn't show up in the logs, at least not mine). It's somewhat tricky to actually see what this is using the standard tools available in the typical browser. With FF I was able to use the inspect-element and muddle my way to see what I think is the mime-type of the file in question. This was after I had manually created a mime-type for zip files. I'm assuming this was not previously in place. I did indeed see this as "application/zip".

I also searched for an "on-line" mime-displayer, where you can enter a URL (that includes a file) and the site will tell you what the file mime-type is (as reported by the server - ie my server). There does not appear to be such an animal. What there is are sites where you can point a submission window to a file on your system and it will tell you the mime-type - not useful to me in this context.
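
For what it's worth, a few lines of script can stand in for that kind of mime-displayer. The Python sketch below sends a HEAD request and reports the response headers that matter for downloads. The function names and the example URL are made up for illustration, and it assumes the server answers HEAD the same way it answers GET.

```python
import urllib.request

# Response headers that influence how a browser handles a download.
INTERESTING = ("Content-Type", "Content-Length", "Accept-Ranges", "Content-Encoding")


def pick_headers(headers):
    """Return only the download-related headers; missing ones map to None."""
    return {name: headers.get(name) for name in INTERESTING}


def fetch_headers(url, timeout=10):
    """Send a HEAD request (no body is transferred) and summarize the reply."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return pick_headers(response.headers)


# Example call (hypothetical URL, substitute the real location of the zip file):
#   print(fetch_headers("http://example.com/files/archive.zip"))
```

If Accept-Ranges comes back as "bytes", the server is advertising that it will honor byte-range requests for that file, which is what makes the 206 responses possible.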

I have the impression that it's the browser that decides whether or not to download an entire file in one request or in multiple requests (chunks), and it makes this decision based on the mime-type of the file. In the case of PDF files, if the correct mime-type is supplied, and the browser has a corresponding rule for it, then it will only download the file as-needed as the user scrolls through the file (if it is some sort of multi-page document that is). Or perhaps also a multi-media audio/video file. Obviously if the file is a static image (gif, jpeg, etc) it makes no sense to download it in chunks - it's only relevant or useful when it's totally downloaded in one piece. The same is true for compressed files (like zip, rar, etc).

I don't know what my server was serving up in terms of the mime-type for zip files before I added it. Possibly there was none, or maybe something generic? I see that locally when I request these files, FF and Edge request the entire 100 mb file in a single request, no chunking and no 206 codes. Just a single request with a 200 code.

thecoalman

4:48 am on Mar 25, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Possibly there was none, or maybe something generic?


I don't know if it's default but application/octet-stream is generic. You can use this to force the browser to download the file regardless of the extension or type of file. AFAIK you don't need any other mime type unless you expect the browser to do something other than download it. And just to be clear when I say download I mean save the file to downloads folder.
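
A minimal Python sketch of that fallback logic, assuming a server that maps extensions to types and uses application/octet-stream for anything it doesn't recognize. The extension table, the function name and the force_download flag are all hypothetical; a real server's list is much longer and its defaults may differ.

```python
import os

# Hypothetical extension table; a real server keeps a much longer list.
KNOWN_TYPES = {
    ".zip": "application/zip",
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
}

GENERIC_TYPE = "application/octet-stream"


def content_type_for(filename, force_download=False):
    """Pick the Content-Type header value for a file name."""
    if force_download:
        # The generic type gives the browser nothing to render, so it saves the file.
        return GENERIC_TYPE
    extension = os.path.splitext(filename)[1].lower()
    return KNOWN_TYPES.get(extension, GENERIC_TYPE)
```

With this table, a .zip with no entry of its own would simply go out as application/octet-stream, which is the "something generic" case.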

There is some weird behavior with this that I would assume is browser dependent. In Firefox, if it's an image it just downloads the file; for .pdf it downloads it but then opens the local file in a browser window.

SumGuy

2:51 pm on Mar 25, 2025 (gmt 0)

5+ Year Member Top Contributors Of The Month



Usually, the browser (any browser) will just download unknown / unhandled file types (what else can it do?). The question there being - does it just download them straight to the default download location or does it throw up a box asking the user to give an alternate file name and download location - a choice that is configurable for all the browsers I know.

But that's not the point here.

Why do I see this file-request chunking? Which ultimately ends in download failure?

Do some (many, most, ?) browsers chunk when they see mime-type application/octet-stream, but they don't when they see application/zip ?

I should remove the application/zip mime-type and then see what my server is actually giving for mime-type, just assuming here it would be application/octet-stream. But if this is true, then maybe the "stream" part is denoting (incorrectly) an open-ended file of no specific length?

tangor

6:31 pm on Mar 25, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is this your server, or is the site on a shared host?

SumGuy

2:04 am on Mar 26, 2025 (gmt 0)

5+ Year Member Top Contributors Of The Month



My server. It's running Abyss web server, running on a Win-NT4 box.

thecoalman

2:22 am on Mar 26, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I didn't try to answer your question about the chunking because I don't have one.

For Firefox and, in testing, Edge (presumably any Chromium browser), it will download to the default location. It doesn't even open a new page. Not sure why Firefox opens .pdf; this doesn't happen with Edge. To be clear, it downloads the file to the default location and then opens the local file in a new tab. This doesn't occur with images.

For context this is phpBB forum and files are served through php script. Out of the box phpBB serves all files except images using application/octet-stream but I've made some modifications for HTML5 A/V. I have separate link with parameter that will serve any file with application/octet-stream so the user has one click direct download.

SumGuy

2:28 am on Mar 26, 2025 (gmt 0)

5+ Year Member Top Contributors Of The Month



Try downloading zip files instead of pdf. The issue I'm having is with zip files. Like 100 mb zip files. But again, see what your server is using (or going to use) as the mime-type for zip files.

SumGuy

12:52 am on Mar 28, 2025 (gmt 0)

5+ Year Member Top Contributors Of The Month



I see today the first example of my zip-file download by an external user, where my server *should* be using the application/zip mime-type. The browser in this case was:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML,like Gecko) Chrome/134.0.0.0 Safari/537.36 Edg/134.0.0.0

I'm assuming that it's Edge.

The entire episode took 16 separate requests. The first request was logged with HTTP code 200, the rest as code 206. Each request transferred 6.x to 7.x mb. Total bytes transferred was 106.8 mb - likely the entire file. The Encoding (Accept-Encoding) for the first request was gzip, deflate, br, zstd. Encoding for the rest was identity.
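
The adding-up can be scripted. The Python sketch below totals the bytes sent per client and file across 200 and 206 responses. It assumes the log is in Common Log Format, which may not match this server's configuration, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Assumes Common Log Format; adjust the pattern if the server logs differently.
LOG_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize_downloads(lines):
    """Total the bytes sent per (client, path), counting 200 and 206 replies."""
    totals = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match["status"] not in ("200", "206"):
            continue
        entry = totals[(match["client"], match["path"])]
        entry["requests"] += 1
        if match["size"] != "-":  # "-" means no byte count was logged
            entry["bytes"] += int(match["size"])
    return dict(totals)
```

Comparing the byte total against the file's size on disk shows whether a given client ended up with the whole file or only part of it.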

Now, when I use a local computer on the same LAN as my web server to download this same zip file, I see this as a single request, code 200, 107 mb transferred. This is using Edge on a Win-10 machine (no updating is done on this machine so the Edge version is older).

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36 Edg/117.0.2045.36

In this case, when I submit the URL directly in the address bar and hit enter, Edge immediately downloads the file directly to its default download location without prompting me for anything. Why I'm not seeing this in the logs as multiple chunking requests (200, 206, 206, 206, etc) I have no idea.

lucy24

1:48 am on Mar 28, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you're not already logging request headers, this might be a good time to start. No use worrying about what your server is doing if in fact it's sending out just what the UA has asked for.

lucy24

5:32 pm on Apr 8, 2025 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



... and if the request comes in as HTTP/1.0, it is very unlikely the request is from someone or something you want to deal with. (I block 1.0 globally and poke holes as appropriate.)

SumGuy

3:22 am on Apr 10, 2025 (gmt 0)

5+ Year Member Top Contributors Of The Month



Looking at this "chunked" encoding, I read one description that "When the server needs to send large amount of data, chunked encoding is used by the server because it did not exactly know how big (length) the data is going to be."

Which does not make sense when serving a static file OF KNOWN SIZE. This still does not make any sense when I access the file from a PC on the local lan to the web server and it does not chunk in that case. This issue has sort of faded for now, I haven't done any more controlled tests from non-local web locations. I think (or, I thought) that a browser is initially told the file details early in the back and forth, and that the browser can request byte-ranges which would appear as chunking (and hence the 206 code). How or why the browser would do that, I have no idea.
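
A note on terminology that may help here: the description quoted above is about Transfer-Encoding: chunked, which is a way of framing a single response when its length isn't known up front. The 200/206 pattern in the logs is a different mechanism, byte-range requests, where the client asks for specific byte positions with a Range header. As a rough Python sketch (the function name and chunk size are made up for illustration, and this is not a claim about how any particular browser picks its ranges), the headers for a ranged download would look like this:

```python
def plan_ranges(total_size, chunk_size):
    """Return the Range header values needed to fetch total_size bytes in pieces."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    headers = []
    start = 0
    while start < total_size:
        # Byte positions in a Range header are inclusive on both ends.
        end = min(start + chunk_size, total_size) - 1
        headers.append(f"bytes={start}-{end}")
        start = end + 1
    return headers


# A 10-byte file fetched 4 bytes at a time needs three requests.
print(plan_ranges(10, 4))
```

Each such request that the server honors is answered with 206, so a file fetched this way shows up in the log as a series of partial responses rather than a single 200.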

I have said that when it comes to downloading PDF files - which are much smaller than the ZIP file that is the subject of this thread, I have seen chunking 206 requests (from what I am sure are human requestors) but when the bots (bing, google, apple) request those SAME files, it's all done in one shot, no chunking.
----------------

206 Partial Content:

This HTTP status code signifies that the server has successfully fulfilled a request for a portion of a resource, as indicated by the Range header in the request.
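
Seen from the server side, that decision can be sketched in Python as follows. This is a simplified illustration that handles a single byte range only and ignores anything else; the function names are hypothetical and it is not how any particular web server implements it.

```python
import re

RANGE_PATTERN = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(range_header, file_size):
    """Map a single-range Range header to (status, start, end) for a file.

    Returns (200, 0, file_size - 1) when there is no usable Range header,
    (206, start, end) for a satisfiable range, and (416, None, None) otherwise.
    """
    if not range_header:
        return 200, 0, file_size - 1
    match = RANGE_PATTERN.match(range_header.strip())
    if not match or match.groups() == ("", ""):
        return 200, 0, file_size - 1  # malformed or multi-range: ignore the header
    first, last = match.groups()
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the file.
        length = int(last)
        if length == 0 or file_size == 0:
            return 416, None, None
        return 206, max(file_size - length, 0), file_size - 1
    start = int(first)
    if start >= file_size:
        return 416, None, None
    end = min(int(last), file_size - 1) if last else file_size - 1
    if end < start:
        return 200, 0, file_size - 1  # invalid range: ignore the header
    return 206, start, end


def content_range(start, end, file_size):
    """Build the Content-Range value that accompanies a 206 response."""
    return f"bytes {start}-{end}/{file_size}"
```

A request with no Range header gets the whole file with a 200, which matches the first entry of each download in the logs; the later requests carry a Range header and are answered with 206.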