File type statistics?

Need a util to generate stats on a website

         

solistus

6:57 pm on Mar 28, 2006 (gmt 0)



I work for the ITS dept. at my university, and we want to generate statistics about our site that tell us how many page requests we're getting for different file types (.html, .pdf, etc.). We've been trying to find a web crawler of some sort to do this, but the ones that are out there seem geared toward search engine technology, and we can't find one that does what we need, let alone one that does it without having to index and copy the entire site. We can't just use log file analysers or other local solutions, since the content is divided between several servers. Does anyone know of a piece of software that can do this for us, or will we have to write our own?

Thanks,
Soli

trillianjedi

11:39 am on Mar 29, 2006 (gmt 0)




Hi Soli and welcome to WebmasterWorld!

Tricky question this one - I don't quite see how you could use a "spider" to help you achieve this. From your post, as I understand it, you need to know how many times a particular file type has been requested from a set of servers. A spider won't do that - the only thing that can tell you is the server itself, usually via its logfile.

I can see the issue with having multiple logfiles across multiple servers, but is there any reason you couldn't grab a copy of all of them, consolidate into one file and then run a logfile analyser over that?
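As a rough sketch of that consolidate-and-count idea, something like the Python below would do, assuming the servers write Common or Combined Log Format lines. The function names (`read_logs`, `count_requests_by_type`) and the regular expression are illustrative assumptions, not taken from any particular analyser, and real logs may need a different pattern.

```python
import re
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumption: each line holds a quoted request such as "GET /docs/a.pdf HTTP/1.1",
# as in Common/Combined Log Format. Adjust the pattern for other formats.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+)[^"]*"')


def extension_of(target):
    """Return the lowercased file extension of a request target, or "(none)"."""
    path = urlsplit(target).path  # drops any query string
    suffix = PurePosixPath(path).suffix.lower()
    return suffix or "(none)"


def count_requests_by_type(lines):
    """Count requests per file extension across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:  # lines that do not look like requests are skipped
            counts[extension_of(match.group(1))] += 1
    return counts


def read_logs(paths):
    """Yield lines from several logfiles in turn, as if they were one file."""
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            yield from handle
```

With copies of each server's log in one place, `count_requests_by_type(read_logs(["server1.log", "server2.log"]))` would give one combined tally (the filenames here are placeholders).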

What a spider could tell you is the relative proportions of different filetypes residing on the set of servers (you could crawl everything and count the filetypes as you go). But that doesn't seem to me to be the raw stat you require, which appears to be requests and not content?
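For comparison, the counting half of the spider approach might look roughly like this in Python. It assumes the crawl itself has already produced a list of URLs; `filetype_proportions` is a made-up name for illustration, and the result describes content on the servers, not requests.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def filetype_proportions(urls):
    """Return the share of each file extension among the distinct URLs given."""
    counts = Counter()
    for url in set(urls):  # count each crawled URL once
        suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
        counts[suffix or "(none)"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ext: n / total for ext, n in counts.items()}
```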

TJ