
Website Analytics - Tracking and Logging Forum

    
pageviews using urchin
urchin stats
eveningwalk




msg:896201
 9:16 pm on Jul 26, 2003 (gmt 0)


I'm trying to figure out my website's pageview stats using Urchin. Do Googlebot and Scooter spider visits inflate the pageview count, or are they reported separately?

TIA,
eveningwalk

 

khuntley




msg:896202
 9:04 pm on Jul 27, 2003 (gmt 0)

Urchin just treats them as surfers and counts them in hits and visits; at least the versions ISPs give you do. Some programs, such as WebTrends, have filters you can use to include or exclude particular traffic, which lets you exclude a host ID like googlebot. If you also want to keep seeing the spider traffic, you need a separate profile without the exclude filter.

Kevin

eveningwalk




msg:896203
 6:12 am on Jul 28, 2003 (gmt 0)


So how do I create a separate profile? Is it some kind of a configuration file? I'm using the ISP version of Urchin.

TIA,
eveningwalk

Deodato




msg:896204
 2:00 pm on Jul 28, 2003 (gmt 0)

Urchin lets you set exclude filters too, though you need an admin account to do it. Maybe you could ask your host to set one up for you?

eveningwalk




msg:896205
 3:07 am on Aug 2, 2003 (gmt 0)


I don't have administrative privileges. Can't I download the raw logs and run Analog on my client machine? Does Analog separate the wheat from the chaff (the spidering bloat)?

TIA,
eveningwalk

PeterD




msg:896206
 3:21 am on Aug 2, 2003 (gmt 0)

Sure, as long as you can download your log files, you can run Analog on your local machine. Analog can definitely separate out the spiders, either by user-agent or by IP address. It does take a bit of jigging with the config file to get set up, though, and some regular minor maintenance to keep the spider list up to date.
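
For readers who want to see the idea before setting up Analog, here is a minimal Python sketch of separating spider requests by user-agent or by IP address. It is not Analog's implementation. It assumes the downloaded logs are in Combined Log Format, and the hint lists (`SPIDER_AGENT_HINTS`, `SPIDER_HOST_PREFIXES`) are hypothetical starters that would need the same regular maintenance described above. It counts raw requests, not pageviews.

```python
import re

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical starter lists; real ones have to be built up over time.
SPIDER_AGENT_HINTS = ("googlebot", "scooter")
SPIDER_HOST_PREFIXES = ("192.0.2.",)  # documentation-only address range


def is_spider(host, agent):
    """True if the user-agent or the client address looks like a spider."""
    agent = agent.lower()
    return (any(hint in agent for hint in SPIDER_AGENT_HINTS)
            or host.startswith(SPIDER_HOST_PREFIXES))


def split_requests(lines):
    """Return (human_count, spider_count); unparseable lines are skipped."""
    humans = spiders = 0
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue
        if is_spider(m["host"], m["agent"]):
            spiders += 1
        else:
            humans += 1
    return humans, spiders
```

Calling `split_requests(open("access.log"))` on a downloaded log (the file name is only an example) gives a rough idea of how much of the traffic is spider traffic.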

eveningwalk




msg:896207
 8:36 am on Aug 3, 2003 (gmt 0)


Thanks, Peter. I assume I'll need to add the HOSTEXCLUDE command to the Analog config file. Where can I get a complete spider list (IP address and/or host name)?

TIA,
eveningwalk

PeterD




msg:896208
 5:51 pm on Aug 3, 2003 (gmt 0)

Yeah, I use both BROWEXCLUDE and HOSTEXCLUDE, but I rely more on BROWEXCLUDE. (It looks to me like spoofed UAs make up only a tiny fraction of total traffic, although there are occasionally obvious spiders with faked or empty UAs that you have to filter out by IP address or domain name.)

I don't know where you can download a complete list--I have just built up my own over time. Usually I run Analog, look through the Browser Summary for UAs I don't like, copy them from the report and paste them right into the exclude list, then re-run Analog.
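
The review-and-paste routine above can also be approximated outside Analog. The following is a rough Python sketch, not part of Analog: it tallies user-agent strings straight from the raw log so the busiest ones can be reviewed by hand, then formats the ones you pick as BROWEXCLUDE lines. It assumes Combined Log Format (user-agent as the last quoted field); the function names are made up for illustration. The values are quoted on the assumption that they contain spaces, and any `*` inside a user-agent would act as a wildcard, so check the output before using it.

```python
import re
from collections import Counter

# Assumption: the user-agent is the last quoted field on each log line.
AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def tally_agents(lines):
    """Count how many requests each user-agent string made."""
    counts = Counter()
    for line in lines:
        m = AGENT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


def exclude_lines(agents):
    """Format hand-picked user-agents as BROWEXCLUDE config lines."""
    return ['BROWEXCLUDE "%s"' % agent for agent in agents]
```

`tally_agents(lines).most_common(50)` lists the 50 busiest user-agents; deciding which of them are spiders is still a manual judgment.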

Over the last year, I've added about 200 entries to my exclude list (which, for convenience, is in a separate config file that I call from the main config file).

I'm sure I'm missing some low volume bots that grab one or two pages here and there, but the big ones make up most of the spider traffic anyway. The little ones are just statistical noise. Server log file analysis is an inherently imperfect exercise anyway!

Hope this helps,

Pete

PeterD




msg:896209
 6:01 pm on Aug 3, 2003 (gmt 0)

Oops, forgot to mention--if you also do FILEEXCLUDE for things like

/_vti_bin/*
/default.ida*
/scripts*
/sumthin*

etc., you'll filter out all that virus and scanning junk, which doesn't give you any useful information.
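
To preview what those patterns would catch before re-running Analog, a small Python sketch along these lines may help. It uses shell-style wildcard matching from the standard library as a stand-in for Analog's own matching, which is an assumption; the two may differ in details such as case sensitivity. The helper names are hypothetical.

```python
from fnmatch import fnmatchcase

# The patterns listed in the post above.
JUNK_PATTERNS = ["/_vti_bin/*", "/default.ida*", "/scripts*", "/sumthin*"]


def is_junk(path):
    """True if the requested path matches any of the junk patterns."""
    return any(fnmatchcase(path, pattern) for pattern in JUNK_PATTERNS)


def fileexclude_lines(patterns):
    """Format the patterns as FILEEXCLUDE config lines."""
    return ["FILEEXCLUDE " + pattern for pattern in patterns]
```

Running `is_junk` over the request paths in a log shows how many entries the excludes would drop.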

eveningwalk




msg:896210
 7:41 pm on Aug 5, 2003 (gmt 0)

Thanks for all the info, Peter.

eveningwalk


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved