
Forum Moderators: DixonJones & mademetop


PDF file downloads

how many are bots?

11:28 pm on Oct 19, 2006 (gmt 0)

Full Member

10+ Year Member

joined:Mar 21, 2004
votes: 0

I am trying to figure out how to gauge how many actual HUMANS are downloading a particular 4-5 MB PDF file.

I publish a 32-page color magazine, and in addition to sending out "real" copies of it, I also released a web-friendly PDF version that was free to download.
The ONLY link to it was on one particular page of the website.

In my stats I noticed that it got downloaded 100-200 times a day, but the webpage that the link was on only got maybe 20-30 hits a day.

So one of these is happening:
A number of people are emailing the direct link to the file (which is fine), or
Bots are crawling the linked file, or
People are cancelling the download for whatever reason and then re-downloading it, over and over.

When my stats showed that it had been downloaded about 9000 times, I replaced the 32-page file with a single-page file of the same name that basically said "Sorry, the timeframe for the free download is over".
Overnight, the traffic plummeted to a couple of downloads and then nothing, even though the LINK on the webpage was still there.

So I don't know what to make of this.
Was the complete file downloaded 9000 times?
Was the link accessed 9000 times but not completely downloaded each time?

Or is there something odd about PDF files that skews the stats?
I once put up a PDF file in a protected directory (by protected, I mean it had a blank index.html file and no links to it) and then I emailed the link to 3 people.

The next day, my stats said that the file had been accessed 75 times, but each of the 3 people I sent the link to said they had not even read the email yet, much less accessed the file.

Are PDF files just problematic in general, or is it me?
Or is it my stats thingie?

4:18 pm on Oct 20, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 14, 2004
votes: 1

PDFs often get fetched in little pieces: the download manager (or the browser's PDF plugin) requests the file a chunk at a time, so in a basic stats package one download looks like many requests. Pay attention to the visits, not the raw hits.

If you open your logs you will see that the first hit is a 200 status code but the remaining pieces of the PDF are 206s (Partial Content, which is what the server returns for a byte-range request). If you want to get really precise, you can try to take this into account, but most analysts just look at the visits column of their stats.
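If you do want to try the more precise count, here is a rough Python sketch of the idea. It assumes your server writes the common Apache "combined" log format, and it treats each distinct IP plus user-agent pair as one client, which is only an approximation of a visit. The function name, the BOT_HINTS list and the file names in the usage comment are all made up for illustration; the bot check is a crude substring match, not a real bot database.

```python
import re

# Assumed format: Apache "combined" log lines, e.g.
# host ident user [time] "GET /path HTTP/1.1" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)

# Hypothetical, non-exhaustive hints that a user-agent is a crawler.
BOT_HINTS = ("bot", "crawl", "spider", "slurp")


def summarize_pdf_hits(lines, pdf_path):
    """Count raw 200/206 hits for pdf_path and distinct requesting clients."""
    status_200 = status_206 = other = 0
    human_clients = set()
    bot_clients = set()

    for line in lines:
        m = LOG_LINE.match(line)
        # Skip unparseable lines and requests for other files.
        # Note: an exact match, so "/file.pdf?x=1" would not be counted.
        if not m or m.group("path") != pdf_path:
            continue

        status = m.group("status")
        if status == "200":
            status_200 += 1
        elif status == "206":
            status_206 += 1
        else:
            other += 1

        agent = (m.group("agent") or "").lower()
        client = (m.group("host"), agent)
        if any(hint in agent for hint in BOT_HINTS):
            bot_clients.add(client)
        else:
            human_clients.add(client)

    return {
        "raw_requests": status_200 + status_206 + other,
        "status_200": status_200,
        "status_206": status_206,
        "other_status": other,
        "distinct_human_clients": len(human_clients),
        "distinct_bot_clients": len(bot_clients),
    }


# Usage (file name and path are placeholders):
# with open("access.log") as f:
#     print(summarize_pdf_hits(f, "/magazine.pdf"))
```

If raw_requests is far larger than distinct_human_clients and most of the difference is 206s, the inflated download count is probably just the same readers pulling the file in chunks.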

4:46 pm on Oct 20, 2006 (gmt 0)

Full Member

10+ Year Member

joined:Mar 21, 2004
votes: 0

thanks, that explains most everything for me!