Forum Moderators: DixonJones

Real Time Tracking

daveking

7:38 am on Apr 15, 2003 (gmt 0)

10+ Year Member



Newbie here, so if this is a daft question please tell me! In the recent Google update, I saw folks commenting that web crawlers had just visited their sites. How do you get what appeared to be almost live statistics?

sugarkane

9:40 am on Apr 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld, daveking.

There are two general styles of tracking techniques.

Raw Log File Analysis

With this, you download your raw server logs and run them through a statistics package of some sort. You can get an idea of the most popular software for this here [webmasterworld.com]. This sort of tracking is not real time, but it can provide detailed statistics, including trends and comprehensive referral data. Many hosts will also provide reports generated by this kind of software on a daily or weekly basis.
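To give a feel for what a statistics package does with a downloaded log, here is a minimal sketch in Python. It assumes your server writes the Apache "combined" log format (your host may use something different), and the function name summarize is just made up for illustration. Real packages do far more than this.

```python
import re
from collections import Counter

# Assumed: Apache "combined" log format. Adjust the pattern if your
# server logs in a different layout.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count requested paths and referrers in an iterable of log lines."""
    pages, referrers = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        pages[match["path"]] += 1
        # "-" is what the server logs when no referrer was sent
        if match["referrer"] not in ("", "-"):
            referrers[match["referrer"]] += 1
    return pages, referrers
```

You would open the downloaded log file, pass the file object to summarize, and then call pages.most_common(10) or referrers.most_common(10) to see the top entries.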

Real Time Tracking

There are various options for doing this kind of tracking. There are free tracking services [directory.google.com] that are generally run on an external server. Usually, you link to their tracking system using a 1x1 image and pass the visitor IP address, referrer etc. via JavaScript. They are useful for an overview of current site activity, but the stats are usually limited to some extent, and you can't track most spiders with this method (spiders won't request the image to trigger the tracking, and won't run the necessary JavaScript in any case).

There are also similar programs that you run on your own server and call using SSI [httpd.apache.org]. These programs have the advantage of live stats plus the ability to track spiders.
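As a rough idea of what such a program looks like, here is a minimal sketch of a CGI script in Python that a page could pull in with an SSI directive along the lines of <!--#include virtual="/cgi-bin/track.py" -->. The script name, the HITS_FILE path and the function format_hit are all hypothetical, and I am assuming your server passes the usual CGI variables (REMOTE_ADDR, HTTP_REFERER, HTTP_USER_AGENT) to the included script. Whether DOCUMENT_URI is available depends on your server setup, so the sketch falls back to REQUEST_URI.

```python
#!/usr/bin/env python3
import os
import time

HITS_FILE = "/var/log/site/hits.log"  # hypothetical path, pick your own


def format_hit(environ, when=None):
    """Build one tab-separated line describing the current request."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(when))
    # Assumption: the page address arrives as DOCUMENT_URI or REQUEST_URI
    page = environ.get("DOCUMENT_URI") or environ.get("REQUEST_URI", "-")
    fields = [
        stamp,
        environ.get("REMOTE_ADDR", "-"),
        page,
        environ.get("HTTP_REFERER", "-"),
        environ.get("HTTP_USER_AGENT", "-"),
    ]
    return "\t".join(fields)


# Only log when actually invoked by the web server as a CGI program
if __name__ == "__main__" and "GATEWAY_INTERFACE" in os.environ:
    with open(HITS_FILE, "a", encoding="utf-8") as hits:
        hits.write(format_hit(os.environ) + "\n")
    # Emit a header and an empty body so nothing shows up in the page
    print("Content-Type: text/html\n")
```

Because the script runs on your own server for every page that includes it, it sees spider requests too, and the hits file can be read as it grows.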

There is also another way of using your raw log files, although it is something of a minority sport. If you have access to them as they are generated, rather than only being able to download them daily or weekly, you can load them into a text editor, or use *nix tools such as grep, to search them directly for whatever information you need, e.g. the latest pages requested by Googlebot.
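If you would rather script the search, here is a minimal Python sketch of roughly what grep -i googlebot access.log | tail -n 10 does. The function name latest_requests and the file name access.log are only examples, and matching on the string "Googlebot" assumes the spider identifies itself that way in the user agent field of your log.

```python
from collections import deque


def latest_requests(lines, needle="Googlebot", count=10):
    """Return the last `count` lines that contain `needle`, ignoring case."""
    needle = needle.lower()
    matching = (line for line in lines if needle in line.lower())
    # A deque with maxlen keeps only the most recent matches
    return list(deque(matching, maxlen=count))
```

To use it, open the live log file and pass the file object in, then print each returned line. Since log lines are written in time order, the last matches are the most recent visits.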