
Forum Moderators: DixonJones & mademetop


pageviews using urchin

urchin stats

9:16 pm on Jul 26, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 15, 2002
posts:54
votes: 0



I'm trying to figure out my website's pageview stats using Urchin. Does spidering by Googlebot and Scooter bloat the number of pageviews, or do they get separated out?

TIA,
eveningwalk

9:04 pm on July 27, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 27, 2003
posts:107
votes: 0


Urchin just counts them as ordinary visitors and includes them in the hit and visit totals, at least in the versions ISPs give you. Some programs, such as WebTrends, have filters you can use to include or exclude traffic like this, so you can exclude a host ID like Googlebot. In that case you keep a separate profile without the exclude filter if you still want the unfiltered numbers.

Kevin

6:12 am on July 28, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 15, 2002
posts:54
votes: 0



So how do I create a separate profile? Is it some kind of configuration file? I'm using the ISP version of Urchin.

TIA,
eveningwalk

2:00 pm on July 28, 2003 (gmt 0)

New User

10+ Year Member

joined:Jun 25, 2002
posts:40
votes: 0


Urchin lets you set exclude filters too, though you need an admin account to do it. Maybe you could ask your host to set one up for you?
3:07 am on Aug 2, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 15, 2002
posts:54
votes: 0



I don't have administrative privileges. Can't I just download the raw logs and run Analog on my client machine? Does Analog separate the wheat from the chaff (the spidering bloat)?

TIA,
eveningwalk

3:21 am on Aug 2, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 29, 2002
posts:64
votes: 0


Sure, as long as you can download your log files, you can run Analog on your local machine. Analog can definitely separate out the spiders, either by user-agent or by IP address. It does take a bit of jigging with the config file to get set up, though, and some regular minor maintenance to keep the spider list up to date.
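For what it's worth, a minimal Analog config along those lines might look something like this. (The log file name and spider patterns here are just examples; adjust them to whatever actually shows up in your own logs.)

```
# analog.cfg -- analyse the downloaded log, minus the spiders
LOGFILE access.log

# exclude by user-agent (Analog matches these against the browser field)
BROWEXCLUDE Googlebot*
BROWEXCLUDE Scooter*

# exclude by hostname, for spiders with faked or empty user-agents
HOSTEXCLUDE *.googlebot.com
```

Analog reads analog.cfg from its own directory by default, or you can point it at a specific config file with `analog +gmyconfig.cfg`.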
8:36 am on Aug 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 15, 2002
posts:54
votes: 0



Thanks, Peter. I assume I'll need to add the HOSTEXCLUDE command to the Analog config file. Now, where do I get a complete spider list (IP address and/or host name)?

TIA,
eveningwalk

5:51 pm on Aug 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 29, 2002
posts:64
votes: 0


Yeah, I use both BROWEXCLUDE and HOSTEXCLUDE, but I rely more on BROWEXCLUDE. (It looks to me like spoofed UAs make up only a tiny fraction of total traffic, although there are occasionally obvious spiders with faked or empty UAs that you have to filter out by IP address or domain name.)

I don't know where you can download a complete list--I have just built up my own over time. Usually I run Analog, look through the Browser Summary for UAs I don't like, copy them from the report and paste them right into the exclude list, then re-run Analog.

Over the last year, I've added about 200 entries to my exclude list (which, for convenience, is in a separate config file that I call from the main config file).

I'm sure I'm missing some low-volume bots that grab a page or two here and there, but the big ones make up most of the spider traffic anyway; the little ones are just statistical noise. Server log analysis is an inherently imperfect exercise anyway!
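To make that split-config arrangement concrete, it might look something like this (the file names and patterns are illustrative, not my actual list):

```
# analog.cfg -- main config
LOGFILE access.log
CONFIGFILE spiders.cfg    # pull in the exclude list kept in its own file
```

```
# spiders.cfg -- exclude list, grown over time from the Browser Summary
BROWEXCLUDE Googlebot*
BROWEXCLUDE Scooter*
HOSTEXCLUDE *.crawler.example.com
```

Keeping the excludes in their own file means you can paste new entries in after each run without touching the main config.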

Hope this helps,

Pete

6:01 pm on Aug 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 29, 2002
posts:64
votes: 0


Oops, forgot to mention--if you also do FILEEXCLUDE for things like

/_vti_bin/*
/default.ida*
/scripts*
/sumthin*

etc., you'll filter out all that virus and scanning junk, which doesn't give you any useful information.
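In Analog config terms, that's just one FILEEXCLUDE line per pattern, e.g.:

```
# drop worm/scanner probe requests so they don't clutter the reports
FILEEXCLUDE /_vti_bin/*
FILEEXCLUDE /default.ida*
FILEEXCLUDE /scripts*
FILEEXCLUDE /sumthin*
```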

7:41 pm on Aug 5, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 15, 2002
posts:54
votes: 0


Thanks for all the info - Peter

eveningwalk