Forum Moderators: phranque

Massive repeat requests...

...and always on .pdf files!

mivox

6:07 pm on Sep 13, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I see this happening in my logs all the time: Someone visits our site, clicks on one (or more) technical documents (in PDF format), and generates 5-10+ almost simultaneous requests for that one file...

Is there something strange about the way PDF files are served on Apache that would cause this behavior? It's very consistent... I almost never see a visitor who accessed a PDF file with a single request.

(Today someone broke the record when they accessed our site through a Google search, looked at 3 regular pages, then generated a massive barrage of 3000+ requests for a single pdf document... but I think that's a different problem altogether!)

Brett_Tabke

9:52 am on Sep 17, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



What you could be seeing is a browser attempting to grab the file more than once. Part of that could be the way the browser interacts with the plugin. Some browsers request the PDF, and then look at the Content-Type header in the server response to confirm that it is indeed a PDF file. Then they call the PDF plugin or launch Adobe Acrobat. The plugin can then request the file again.

Sometimes in the process, people will have trouble with the download and request the file again - very common with alternative format files that require plugins.

zoidberg

7:52 am on Sep 23, 2001 (gmt 0)



I take care of a site that has 100's of pdf's on an Apache server, and I get that all the time.

It's irritating as it makes it difficult to gauge which files are the most popular, as the larger the file the more it is requested.
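One way around the popularity problem described above is to count distinct clients per PDF rather than raw hits. The following is only a rough sketch: it assumes the access log uses Apache's Common Log Format (or the combined format, which starts the same way), and the sample lines, paths, and IP addresses are invented for illustration. Treating one IP address as one visitor is also an assumption, since proxies and shared connections will blur that.

```python
import re
from collections import defaultdict

# Leading fields of Apache's Common Log Format:
# host ident user [time] "METHOD path PROTO" status bytes
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def count_pdf_hits(lines):
    """Return (raw_hits, unique_clients), both dicts keyed by PDF path."""
    raw = defaultdict(int)
    clients = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        path = m.group("path")
        if not path.lower().endswith(".pdf"):
            continue
        raw[path] += 1
        clients[path].add(m.group("host"))
    unique = {path: len(hosts) for path, hosts in clients.items()}
    return dict(raw), unique

# Hypothetical log lines, made up for this example.
SAMPLE_LINES = [
    '10.0.0.1 - - [13/Sep/2001:18:07:01 +0000] "GET /docs/manual.pdf HTTP/1.1" 200 4096',
    '10.0.0.1 - - [13/Sep/2001:18:07:02 +0000] "GET /docs/manual.pdf HTTP/1.1" 206 1024',
    '10.0.0.1 - - [13/Sep/2001:18:07:02 +0000] "GET /docs/manual.pdf HTTP/1.1" 206 1024',
    '10.0.0.2 - - [13/Sep/2001:18:09:00 +0000] "GET /docs/manual.pdf HTTP/1.1" 200 4096',
    '10.0.0.2 - - [13/Sep/2001:18:09:30 +0000] "GET /index.html HTTP/1.1" 200 512',
]

if __name__ == "__main__":
    raw_hits, unique_clients = count_pdf_hits(SAMPLE_LINES)
    print("raw hits:", raw_hits)
    print("unique clients:", unique_clients)
```

Ranking files by the second dictionary instead of the first should remove most of the bias toward larger files, because a burst of repeat requests from one client counts only once.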

mivox

11:59 pm on Sep 23, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



100's of pdf's on an Apache server

That's about the same as mine... at least I know I'm not the only one seeing it. ;)

bird

8:38 pm on Sep 24, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You guys know that the plugin can fetch single pages from within a pdf file, right? I have no idea if the file must be prepared in a special way for this to work, but I have come to consider it normal that when I switch to a new page in a largish pdf document, the content of that page is only then downloaded on the fly.

This is probably not the reason for your 3000+ incident, but if by "almost simultaneous" you mean "within a minute or two", then that might well be it.
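Page-at-a-time fetching like this is normally done with HTTP byte-range requests, which a server answers with status 206 (Partial Content) instead of 200. If that is what is happening, it should be visible in the status column of the access log. Below is a minimal sketch for checking, assuming Common or combined log format; the function name and the sample lines are hypothetical.

```python
from collections import Counter

def status_breakdown(lines, suffix=".pdf"):
    """Tally HTTP status codes per requested path ending in `suffix`."""
    counts = {}
    for line in lines:
        # Splitting on double quotes isolates the request line (parts[1])
        # and the fields right after it, which begin with the status code.
        parts = line.split('"')
        if len(parts) < 3:
            continue
        request = parts[1].split()
        after = parts[2].split()
        if len(request) < 2 or not after:
            continue
        path, status = request[1], after[0]
        if path.lower().endswith(suffix):
            counts.setdefault(path, Counter())[status] += 1
    return counts

# Hypothetical log lines, made up for this example.
RANGE_SAMPLE = [
    '10.0.0.1 - - [24/Sep/2001:20:38:01 +0000] "GET /docs/manual.pdf HTTP/1.1" 200 4096',
    '10.0.0.1 - - [24/Sep/2001:20:38:05 +0000] "GET /docs/manual.pdf HTTP/1.1" 206 1024',
    '10.0.0.1 - - [24/Sep/2001:20:38:40 +0000] "GET /docs/manual.pdf HTTP/1.1" 206 2048',
]

if __name__ == "__main__":
    for path, tally in status_breakdown(RANGE_SAMPLE).items():
        print(path, dict(tally))
```

A file whose requests are mostly 206 responses spread over a minute or two would fit the page-by-page explanation; a burst of plain 200 responses would point to something else.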

mivox

8:50 pm on Sep 24, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



fetch single pages from within a pdf file, right?

I never even thought of that! I'll have to start comparing the # of requests each file generates with the # of pages in the file. (Not that everyone will read every page in each file only once, but there ought to be *some* correlation...)

4eyes

8:58 pm on Sep 24, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Probably a download manager that splits the file into 5 or more parts and downloads them simultaneously.

e.g. NetAnts, FlashGet, etc.
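To illustrate why a segmented download shows up as several near-simultaneous requests for one file, here is a rough sketch of how a file could be split into byte ranges, one per connection. It is not taken from any particular download manager; the function name and the even-split strategy are assumptions made for the example.

```python
def split_ranges(total_size, parts):
    """Split a file of `total_size` bytes into `parts` contiguous byte ranges.

    Returns the values a client would send in the HTTP Range header,
    one per parallel connection. Byte offsets are inclusive.
    """
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    parts = min(parts, total_size)  # never create empty segments
    base, extra = divmod(total_size, parts)
    headers = []
    start = 0
    for i in range(parts):
        # The first `extra` segments take one additional byte each.
        length = base + (1 if i < extra else 0)
        end = start + length - 1
        headers.append(f"bytes={start}-{end}")
        start = end + 1
    return headers

if __name__ == "__main__":
    for header in split_ranges(1000, 5):
        print("Range:", header)
```

Each of those ranges becomes its own request in the server log, so a five-way split of one PDF would appear as five hits from the same client within the same second or two.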

Brett_Tabke

9:45 pm on Sep 24, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Very interesting 4eyes. I'd not considered that in a web setting. hmm. I wonder what the ramifications of it are. Do people use those at all for html files too?

4eyes

12:34 am on Sep 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not sure about a web setting, but I suspect that anyone 'right clicking and downloading' will trigger the download manager on a PDF file.

If it was a large PDF, that seems likely.