Forum Moderators: DixonJones

access stats for _large_ sites

i feel like i'm in a cloud of fog

         

muesli

10:49 pm on Aug 7, 2002 (gmt 0)

10+ Year Member



hi,

i'm in the strange situation that i know virtually everything about user and spider access to my two private sites but _nothing_ about the large site i'm commercially responsible for.

according to my software and operations guys, logging and tracking would cost truckloads of money. they have turned off logfile writing on the webservers for most things.

my site is ~180 million PIs a month, mostly dynamic, runs on 35 alpha webservers and a database machine (simplified..).

is it true that the additional hardware needed (storage, CPU, webserver performance) and the licence for logfile analysis software (unlimited edition) would cost half a fortune? we're quite tight on budget these days..

what i miss most is SE traffic and keyword statistics. what would you suggest?

muesli

mnorton

10:30 am on Aug 8, 2002 (gmt 0)

10+ Year Member



I currently manage the stats for a company that gets ~15 million page impressions a month (500-600 MB of logs a day), which is not a lot compared to yours. It costs quite a bit initially to buy a new server and possibly a licence for some analysis software, but once purchased it should last you a while. There are also many free products out there, such as Analog. I currently use NetTracker, which I find very useful; however, the version that would suit you would cost in the region of $25,000, which is not a small sum, and then you have the support and server costs on top. It may be worth investing in a half-decent server and one of the free analysis programs, as this will save you a bundle.

Also, you could look into page tagging methods, which mean that you do not need logs at all; the company providing the service stores all the data for you. Two products that come to mind are WebTrends Live and SiteMeter. I'm not too sure about the costs of these, but you will have to pay a monthly fee for the service.
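For a rough idea of the storage involved, the figures above can be scaled up. This is only a back-of-envelope sketch: it assumes log volume grows linearly with page impressions, which may not hold if the two sites differ in hits per page or log format. The function name is made up for illustration.

```python
def scale_log_volume(mb_per_day, known_pis, target_pis):
    """Scale a known daily log volume to another traffic level.

    Assumes log size is proportional to page impressions (an assumption,
    not something measured on the larger site).
    """
    return mb_per_day * target_pis / known_pis


# 500-600 MB/day at ~15 million PIs a month, scaled to ~180 million PIs a month
low = scale_log_volume(500, 15_000_000, 180_000_000)
high = scale_log_volume(600, 15_000_000, 180_000_000)
print(f"roughly {low / 1000:.1f} to {high / 1000:.1f} GB of logs a day")
```

Under that assumption, full logging on the larger site would be somewhere around 6 to 7 GB a day before compression.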

Thanks

Mike

Grumpus

12:09 pm on Aug 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not getting nearly the page hits you are getting, so I've still got regular logging turned on, but even that can be too much to sift through. What I've done on my site pares down the logs considerably and gives me a good look at the unique visitors, as well as letting me see where the spiders are.

This was done in ASP using the Global.asa file. I don't know about PHP, so I can't say how to do it there, but I'm certain there's some way to do something similar.

In essence, what I do is write a line every time a new session starts on my site (i.e. the first hit anyone makes). This line consists of the referer (good for seeing where traffic's coming from, finding search terms that are hitting, etc), the IP address, the entry page (to see what people hit first - on my site, the front page only makes up about 10% of the total entry points), and the user agent.
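For anyone not on ASP, here is a minimal sketch of the same idea in Python. It is not the Global.asa code, just an illustration: the function names, the in-memory set of seen sessions, and the leading timestamp field are all assumptions. A hit with no session id (e.g. a client that refuses the cookie) is treated as a new session every time.

```python
from datetime import datetime, timezone


def format_entry(referer, ip, entry_page, user_agent, now=None):
    """Build one pipe-delimited line: timestamp|referer|ip|entry_page|user_agent.

    The timestamp field is an addition for illustration; the other four
    fields are the ones described in the post.
    """
    now = now or datetime.now(timezone.utc)
    values = (now.strftime("%Y-%m-%d %H:%M:%S"), referer, ip, entry_page, user_agent)
    cleaned = []
    for value in values:
        text = str(value or "-")
        # Escape pipes and flatten line breaks so a field cannot break the format.
        text = text.replace("|", "%7C").replace("\r", " ").replace("\n", " ")
        cleaned.append(text)
    return "|".join(cleaned)


def log_session_start(path, seen_sessions, session_id, referer, ip, entry_page, user_agent):
    """Append a line only for the first hit of a session.

    Returns True if a line was written. session_id is None when the client
    sent no session cookie, so every such hit is logged.
    """
    if session_id is not None:
        if session_id in seen_sessions:
            return False
        seen_sessions.add(session_id)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(format_entry(referer, ip, entry_page, user_agent) + "\n")
    return True
```

In a real setup the web framework's own session handling would replace the seen_sessions set; it is only there to keep the sketch self-contained.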

I like this because it lets me know the MAIN things I'm looking for - who, what, where, when, and how. If I want to see people's habits once they're on my site, I've got to go to the main logs - and to be honest, I rarely care what someone does once they get there until they get to the last step - buying something (which I see in my sales logs). I DO like to track spiders, though, and since they don't take a cookie, it makes a log entry for each page they hit.

Every couple of days, I just delete this file (or, if there's something interesting, I zip it up and it gets nice and small).

As far as doing fun stuff with the file, I've written a little program in VB that imports the file (it's pipe delimited) and I can sort by all types of stuff. The program took about 1 hour to write, so if you set something like that up, but don't have VB, you could easily hire someone to put it together for $100-$150.
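If you don't have VB, something along these lines in Python would do a similar job. It is a sketch rather than a copy of my program: the field order (with a timestamp first) and the function names are assumptions about how the file is laid out.

```python
import csv
from collections import Counter

# Assumed column order of the pipe-delimited session file.
FIELDS = ["timestamp", "referer", "ip", "entry_page", "user_agent"]


def read_entries(lines):
    """Parse pipe-delimited lines into dicts, skipping lines with missing fields."""
    reader = csv.DictReader(lines, fieldnames=FIELDS, delimiter="|",
                            quoting=csv.QUOTE_NONE)
    # DictReader fills missing trailing fields with None, so a None
    # user_agent marks a short (malformed) line.
    return [row for row in reader if row["user_agent"] is not None]


def top(entries, field, n=10):
    """Return the n most common values of one field with their counts."""
    return Counter(entry[field] for entry in entries).most_common(n)


if __name__ == "__main__":
    import sys
    # Usage: python sessions.py sessions.log
    with open(sys.argv[1], encoding="utf-8", newline="") as fh:
        entries = read_entries(fh)
    for field in ("entry_page", "referer", "user_agent"):
        print(field)
        for value, count in top(entries, field):
            print(f"  {count:6d}  {value}")
```

Sorting by any column is then just sorted(entries, key=lambda e: e["ip"]) or similar.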

A lot of it depends on what you really need to know (and what you want to know). I determine my "search terms" visually as writing a script would add an extra hour to the development and once you know what each referrer looks like, you can tell at a glance how they found you.
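If you did want to script the search-term part, a rough sketch in Python could look like the one below. The table of engines and their query parameters is an illustrative assumption and would need checking against the referrers you actually see; the function name is made up.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed mapping of engine name (as it appears in the hostname) to the
# query-string parameter that carries the search terms.
QUERY_PARAMS = {"google": "q", "yahoo": "p", "altavista": "q"}


def search_terms(referer):
    """Return (engine, terms) for a recognised search referrer, else None."""
    parts = urlsplit(referer)
    host_labels = (parts.hostname or "").lower().split(".")
    params = parse_qs(parts.query)  # decodes '+' and %xx escapes
    for engine, param in QUERY_PARAMS.items():
        if engine in host_labels:
            values = params.get(param)
            if values:
                return engine, values[0]
    return None


print(search_terms("http://www.google.com/search?q=access+stats&hl=en"))
```

Anything that returns None is either not a search engine or one that still needs adding to the table.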

G.