I have a rather newbie question, but it definitely needs to be answered before I go crazy. My site host is WebSite Source, and their control panel includes a site statistics package called "http-analyze 2.4".
My problem is that I want to download the raw log files to analyze them locally with "WebLog Expert", and no one at their support desk knows whether they offer this. Now I have managed to find a folder with files ending in .gz, but there are separate ones for each day. Do I have to combine these somehow to get a statistical look at the whole month? Furthermore, they only have them going back to the beginning of March. Is this normal?
Sorry for all the newbie questions on log files, but I don't know what a "raw log file" entails, and my host is clueless.
Thanks,
Josh
Now I have managed to find a folder with files ending in .gz, but there are separate ones for each day. Do I have to combine these somehow to get a statistical look at the whole month?
Those sound like the files you need. The .gz extension indicates gzip compression, a kind of zip format. Servers can be set to create a single file that just grows and grows until you interrupt the process -- or they can output one file per day, which I find much more manageable.
Your analysis needs to include the individual files for every day you want in your overall "look", but you don't need to append them all into one monster file. That would be unwieldy anyway. Analysis software will usually aggregate stats from all the files you point it at.
And if you really want to keep it simple: because log files are plain text, you can use a free tool like grep to examine piles of logs in one pass.
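If you're comfortable with a little scripting, here's a minimal Python sketch of that idea -- the log directory and the file-naming pattern are assumptions, so adjust them to whatever your host actually produces. It scans a whole month of daily .gz files without ever combining them:

```python
import glob
import gzip

# Assumed layout: one gzipped log file per day, named something like
# access_log.2003-03-01.gz -- change the pattern to match your host.
pattern = "logs/access_log.2003-03-*.gz"

needle = "Googlebot"  # any substring, just like a simple grep
matches = 0

for path in sorted(glob.glob(pattern)):
    # gzip.open reads the compressed file directly; no need to unzip first
    with gzip.open(path, "rt", errors="replace") as log:
        for line in log:
            if needle in line:
                matches += 1

print(f"{matches} lines matching {needle!r} across the month")
```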
If you've only got the native Notepad for working with text files, you'll probably want something a bit more heavy-duty, because log files get BIG. I use EditPad and find it excellent - and there is a free "Lite" version. A Google search for 'text editor' will turn up others, each of which has its fans.
Furthermore, they only have them going back to the beginning of March. Is this normal?
Given the disk space that log files take up, most hosts will not keep them very far back -- maybe one month, maybe three. If you want a historical record, you need to archive them locally. But to answer your question directly: yes, this is normal.
The main features I need are:
1) What keywords were used and which search engine they came from.
2) When the bots visit me.
3) Exit Pages
4) Tracking my CPC programs such as AdWords.
I suppose these are all pretty standard features of many programs, correct?
Thanks,
Josh
Yes, JavaScript on every page is a common approach (and a serious limitation, IMO).
Certainly your first two requirements are as basic as they come. Pulling keywords out of the search engine referrers is essential, as is identifying the search engine by name.
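To give a feel for what the software is doing under the hood, here's a rough Python sketch of referrer parsing. The q=/p= parameter names are the commonly known ones for Google and Yahoo; the engine list is illustrative, not complete:

```python
from urllib.parse import urlparse, parse_qs

# Query-string parameter carrying the search phrase, per engine.
# Illustrative only -- each engine has its own, and they change.
ENGINE_PARAMS = {
    "google": "q",
    "yahoo": "p",
    "msn": "q",
}

def keyword_from_referrer(referrer):
    """Return (engine, phrase) if the referrer looks like a search, else None."""
    parsed = urlparse(referrer)
    for engine, param in ENGINE_PARAMS.items():
        if engine in parsed.netloc:
            phrases = parse_qs(parsed.query).get(param)
            if phrases:
                return engine, phrases[0]
    return None

print(keyword_from_referrer(
    "http://www.google.com/search?q=raw+log+files&hl=en"))
# -> ('google', 'raw log files')
```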
And tracking PPC is usually as easy as putting a query string on the end of your URL when you place the ad (example.com/product.html?src=ov, for instance).
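Counting those tagged hits from a raw log is then a one-pass job. A minimal sketch, assuming Common Log Format and the src= tag from the example above:

```python
import gzip
from collections import Counter

# Tally requests by the src= tag placed on ad URLs, e.g. /product.html?src=ov
# Assumes Common Log Format, with the request line in double quotes.
sources = Counter()

with gzip.open("logs/access_log.2003-03-15.gz", "rt", errors="replace") as log:
    for line in log:
        try:
            request = line.split('"')[1]   # 'GET /product.html?src=ov HTTP/1.0'
            path = request.split()[1]
        except IndexError:
            continue                       # malformed line, skip it
        if "src=" in path:
            tag = path.split("src=")[1].split("&")[0]
            sources[tag] += 1

for tag, hits in sources.most_common():
    print(tag, hits)
```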
Exit pages are also a common feature - but you do need to pay attention to how an exit page is defined. No further click after how long?
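For illustration, one common (but not universal) definition: a page is an exit if the same visitor makes no further request within some cutoff, say 30 minutes. A sketch of that logic, assuming the log has already been parsed into (visitor, timestamp, page) tuples:

```python
from collections import Counter
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # the arbitrary part of the definition

def exit_pages(hits):
    """hits: iterable of (visitor, timestamp, page) tuples from a parsed log."""
    exits = Counter()
    last_hit = {}  # visitor -> (timestamp, page) of their most recent request
    for visitor, ts, page in sorted(hits, key=lambda h: h[1]):
        if visitor in last_hit and ts - last_hit[visitor][0] > SESSION_GAP:
            # the gap closed a session, so the previous page was an exit
            exits[last_hit[visitor][1]] += 1
        last_hit[visitor] = (ts, page)
    # whatever each visitor saw last is also an exit
    for ts, page in last_hit.values():
        exits[page] += 1
    return exits

hits = [
    ("1.2.3.4", datetime(2003, 3, 1, 9, 0), "/index.html"),
    ("1.2.3.4", datetime(2003, 3, 1, 9, 5), "/product.html"),
    ("1.2.3.4", datetime(2003, 3, 1, 11, 0), "/index.html"),  # new session
]
print(exit_pages(hits))  # Counter({'/product.html': 1, '/index.html': 1})
```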
Of course, almost all the stats require attention to definitions, and it's rare that any two packages come up with exactly the same numbers because of this. So the greatest value comes from regular analysis and from comparing like units of time to spot trends.
It's amazing how challenging it is to create a technical definition for a stat that lines up with our common-sense idea of what we want to measure. Server logs were not created with merchants' needs in mind; they were created with technicians' needs in mind.
The downside is that you need to be able to define and write your queries, in some form. You need to be pretty technically minded to do this for many e-metrics, as opposed to just hit or page counting.
The advantage is that you can be sure of what you are measuring and can do a lot of ad-hoc querying, instead of relying on a vendor's definition of a unique user, session, or repeat visit, which are sometimes very questionable.
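For instance, even "unique visitors per day" hinges on how you define a visitor. A sketch where a visitor is a distinct (IP, user-agent) pair -- one debatable choice among several, and assuming Combined Log Format:

```python
import gzip
from collections import defaultdict

# "Unique visitor" here = distinct (IP, user-agent) pair per day.
# One debatable definition among many; proxies and shared IPs will skew it.
uniques = defaultdict(set)

with gzip.open("logs/access_log.2003-03-15.gz", "rt", errors="replace") as log:
    for line in log:
        parts = line.split('"')
        if len(parts) < 6:
            continue                                  # not Combined Log Format
        ip = parts[0].split()[0]
        day = parts[0].split("[")[1].split(":")[0]    # e.g. 15/Mar/2003
        user_agent = parts[5]
        uniques[day].add((ip, user_agent))

for day in sorted(uniques):
    print(day, len(uniques[day]))
```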
Second, if the script calls out to a third-party server, then your page load times depend on the responses of two servers. Third-party servers are one of the most common causes of page loads hanging, from what I've seen.
Finally, I don't see how a JavaScript-based tracking system can show you stats about spider visits - spiders don't execute JavaScript.
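This is exactly where raw logs shine: spiders announce themselves in the user-agent field. A sketch with a few well-known bot signatures (the list is illustrative, not exhaustive):

```python
import gzip
from collections import Counter

# A few well-known spider user-agent substrings; real lists are much longer.
BOT_SIGNATURES = ["Googlebot", "Slurp", "msnbot", "ia_archiver"]

visits = Counter()

with gzip.open("logs/access_log.2003-03-15.gz", "rt", errors="replace") as log:
    for line in log:
        for bot in BOT_SIGNATURES:
            if bot in line:
                visits[bot] += 1
                break

for bot, hits in visits.most_common():
    print(f"{bot}: {hits} requests")
```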
I actually downloaded all 18 zipped files for March and combined them into a 30 MB text file. WebLog Expert was then able to read this and give me my stats for the month. If I stay on top of it and add each day's stats, it should work fine.
Now, is there a better program than "WebLog Expert" for $75?
Thanks all.