
Website Analytics - Tracking and Logging Forum

pageviews using urchin
urchin stats

 9:16 pm on Jul 26, 2003 (gmt 0)

I'm trying to figure out my website pageview stats using Urchin. Do Googlebot and Scooter spidering inflate the number of pageviews, or do they get separated out?




 9:04 pm on Jul 27, 2003 (gmt 0)

Urchin just counts them as ordinary visitors and includes them in hits and visits, at least in the versions ISPs give you. Some programs, such as WebTrends, have filters you can use to include or exclude traffic like this, which lets you exclude a host ID such as googlebot. If you do that, you'd want a separate profile without the exclude filter so you can still see the total traffic.



 6:12 am on Jul 28, 2003 (gmt 0)

So how do I create a separate profile? Is it some kind of configuration file? I'm using the ISP version of Urchin.



 2:00 pm on Jul 28, 2003 (gmt 0)

Urchin lets you set exclude filters too, though you need an admin account to do it. Maybe you could ask your host to set one up for you?


 3:07 am on Aug 2, 2003 (gmt 0)

I don't have administrative privileges. Can't I just download the raw logs and run Analog on my client machine? Does Analog separate the wheat from the chaff (the spider bloat)?



 3:21 am on Aug 2, 2003 (gmt 0)

Sure, as long as you can download your log files, you can run Analog on your local machine. Analog can definitely separate out the spiders, either by user-agent or by IP address. It does take a bit of fiddling with the config file to get it set up, though, plus some regular minor maintenance to keep the spider list up to date.
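A minimal local Analog setup might look something like this (filenames and hostname are just placeholders, not anything from this thread):

```
# minimal Analog config for running against a downloaded log
LOGFILE access.log        # raw log file fetched from the host
OUTFILE report.html       # where Analog writes its report
HOSTNAME "example.com"    # site name shown in the report header
```

Run analog from the directory containing the config and log, and it produces the HTML report locally with no server access needed.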


 8:36 am on Aug 3, 2003 (gmt 0)

Thanks, Peter. I assume I'll need to add the HOSTEXCLUDE command to the Analog config file. Where do I get a complete spider list (IP addresses and/or host names)?



 5:51 pm on Aug 3, 2003 (gmt 0)

Yeah, I use both BROWEXCLUDE and HOSTEXCLUDE, but I rely more on BROWEXCLUDE. (It looks to me like spoofed UAs make up only a tiny fraction of total traffic, although there are occasionally obvious spiders with faked or empty UAs that you have to filter out by IP address or domain name.)
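In Analog config terms, the two approaches look like this (the patterns below are illustrative guesses at typical 2003-era spiders, not a definitive list; Analog accepts * wildcards):

```
# filter spiders by user-agent string
BROWEXCLUDE *Googlebot*
BROWEXCLUDE *Scooter*
BROWEXCLUDE *Slurp*

# filter spiders by resolved hostname or IP
HOSTEXCLUDE *.googlebot.com
HOSTEXCLUDE *.inktomi.com
```

BROWEXCLUDE catches anything that announces itself honestly in the User-Agent header; HOSTEXCLUDE is the fallback for bots with faked or empty UAs.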

I don't know where you can download a complete list--I have just built up my own over time. Usually I run Analog, look through the Browser Summary for UAs I don't like, copy them from the report and paste them right into the exclude list, then re-run Analog.

Over the last year, I've added about 200 entries to my exclude list (which, for convenience, is in a separate config file that I call from the main config file).
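Splitting the exclude list into its own file works via Analog's CONFIGFILE command; a sketch, with file names assumed for illustration:

```
# analog.cfg -- main configuration
LOGFILE access.log
OUTFILE report.html
CONFIGFILE spiders.cfg    # pull in the separately maintained exclude list
```

```
# spiders.cfg -- grows over time as new bots show up in the Browser Summary
BROWEXCLUDE *Googlebot*
BROWEXCLUDE *Scooter*
HOSTEXCLUDE *.inktomi.com
```

Keeping the exclusions separate means the main config stays stable while the spider list gets edited frequently.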

I'm sure I'm missing some low volume bots that grab one or two pages here and there, but the big ones make up most of the spider traffic anyway. The little ones are just statistical noise. Server log file analysis is an inherently imperfect exercise anyway!

Hope this helps,



 6:01 pm on Aug 3, 2003 (gmt 0)

Oops, forgot to mention--if you also do FILEEXCLUDE for things like


etc., you'll filter out all that virus and scanning junk, which doesn't give you any useful information.
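The specific examples were lost from this post, but requests of that era were typically Code Red / Nimda worm probes; a hedged guess at the kind of FILEEXCLUDE patterns meant:

```
# illustrative patterns only -- filter worm/scanner probe requests
FILEEXCLUDE /default.ida*
FILEEXCLUDE *cmd.exe*
FILEEXCLUDE *root.exe*
```

These requests never correspond to real pages, so excluding them keeps the failed-request and file reports readable.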


 7:41 pm on Aug 5, 2003 (gmt 0)

Thanks for all the info - Peter

