wilderness - 12:55 am on Aug 4, 2004 (gmt 0)
With the above in mind, and with it understood that this is NOT a realistic approach to stats!
You have many references to the term "hits." If you have used the term hits "correctly," as most web log software uses it, then have you determined the quality of hits for each page of the website?
The file size of the PDF is insignificant, as it depends entirely on the many variable methods by which a PDF may be created.
The bots which crawl websites vary so much from website to website that it's impossible to measure any frequency. I have major bots (Google, MSN and all the Yahoo bots) crawling my sites constantly and daily. There is no beginning and no end. Even Jeeves, which may use the most consistent method, grabs the majority of pages in one crawl and then returns throughout the month with partial crawls. The major bots have NO predetermined date (such as the 3rd or 10th) of each month to begin their new crawls.
What of the unknown and malicious bots? They have no frequency, nor have you mentioned any method of preventing or identifying these bots.
In the end, for you, I believe the most accurate method of stats may be to take the periods before and after the one you're interested in, with "accurate and full logs," and make an adjustment for any variations in the differences between those two periods.
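One possible reading of the before-and-after adjustment, as a rough sketch only: average the daily counts from the period before and the period after, split the difference, and scale by the length of the gap. The function name, the simple midpoint averaging, and the sample figures are all assumptions for illustration, not a method taken from any log package.

```python
def estimate_missing_period(before_daily, after_daily, missing_days):
    """Estimate total traffic for a gap using the periods on either side.

    before_daily / after_daily: lists of daily counts from full, accurate logs.
    missing_days: number of days in the period with no usable logs.
    """
    before_avg = sum(before_daily) / len(before_daily)
    after_avg = sum(after_daily) / len(after_daily)
    # Assumption: traffic drifted evenly between the two periods,
    # so the midpoint of the two averages stands in for the gap.
    daily_estimate = (before_avg + after_avg) / 2
    return daily_estimate * missing_days


# Hypothetical figures: two days of logs on each side of a ten-day gap.
estimate = estimate_missing_period([100, 120], [140, 160], 10)
```

This only smooths over the gap; it cannot account for a bot crawl or a traffic spike that happened to fall inside the missing period.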
Somehow I must come up with an answer and I am looking for the most realistic approach, which certainly will not and is not expected to be perfect.
Hits is a beginner's stat: the hits on a page are the page itself plus the total number of images on that page.
EX:
If the page includes four images, then that page will return five hits.
Or have you just mistakenly misused the term hits and are referring to actual visitors?
Do the PDF logs contain lines for images?
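To make the difference between hits and page requests concrete, here is a minimal sketch that counts both from a few log lines. It assumes the logs are in Common Log Format and that images can be recognized by file extension; the sample lines, the extension list, and the function name are made up for illustration.

```python
import re

# Hypothetical sample: one page request followed by its four images.
SAMPLE_LOG = """\
10.0.0.1 - - [04/Aug/2004:00:55:00 +0000] "GET /index.html HTTP/1.0" 200 5120
10.0.0.1 - - [04/Aug/2004:00:55:01 +0000] "GET /img/logo.gif HTTP/1.0" 200 900
10.0.0.1 - - [04/Aug/2004:00:55:01 +0000] "GET /img/a.jpg HTTP/1.0" 200 2048
10.0.0.1 - - [04/Aug/2004:00:55:01 +0000] "GET /img/b.jpg HTTP/1.0" 200 2048
10.0.0.1 - - [04/Aug/2004:00:55:02 +0000] "GET /img/c.png HTTP/1.0" 200 1024
"""

# Pulls the requested path out of the quoted request field.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) [^"]*"')

# Assumption: these extensions are what counts as an image on the site.
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")


def count_hits_and_pages(log_text):
    """Return (hits, pages): every request is a hit, non-image requests are pages."""
    hits = 0
    pages = 0
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines that do not look like a request
        hits += 1
        # Drop any query string before checking the extension.
        path = match.group(1).split("?", 1)[0].lower()
        if not path.endswith(IMAGE_EXTENSIONS):
            pages += 1
    return hits, pages


hits, pages = count_hits_and_pages(SAMPLE_LOG)
```

With the sample above, the one page with four images comes out as five hits and one page request, which is why hits alone say little about visitors.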