
Website Analytics - Tracking and Logging Forum

    
Does Web Trends exclude spiders by default?
lozmatic




msg:3477627
 8:49 am on Oct 15, 2007 (gmt 0)

In September we averaged 21 page views per visit, with an average visit duration of about 30 seconds.

Either our users have the ability to read whole pages in just a few seconds or something is wrong with our stats...

I suspect that spider hits are being counted as pageviews too. Is this a possibility?

 

cgrantski




msg:3477786
 1:53 pm on Oct 15, 2007 (gmt 0)

Yes. You have to turn on (or make and turn on) spider filters.

This assumes you are using server logs as your raw data. If you are getting data through page scripting, most spiders will not be in those kinds of logs.

lozmatic




msg:3479640
 10:13 am on Oct 17, 2007 (gmt 0)

Thanks.

I've also realised that there are other agents that 'scrape' our content and publish it on their own sites. We actually don't mind this happening, but it would be better to filter them out too.

The problem, though, is how to determine which ones these are. Would it be possible to filter out data from IP addresses that show odd behaviour (e.g. a very high pageviews-per-visit ratio)?

lozmatic




msg:3479652
 10:45 am on Oct 17, 2007 (gmt 0)

One more thing...

I've just noticed that in September 34% of visits (as reported in the Site Design > Browsers & Systems > Platforms screen) came from spiders.

Would this explain why the average page views per visit is 21 in a 31 second (average) timeframe?

Either that or our users are very keen on our content and read pages very quickly.

I'm also wondering whether all this spider traffic is costing us given we regularly exceed our page views quota.
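As a rough sanity check on those numbers, the overall average can be treated as a weighted mix of spider visits and human visits. The sketch below is only back-of-the-envelope arithmetic: the 34% spider share and the 21 pages per visit come from the figures above, while the human average of 3 pages per visit is a purely hypothetical value picked for illustration, and the function name is made up.

```python
def implied_spider_pages_per_visit(overall_avg, spider_share, human_avg):
    """Solve overall = spider_share * spider_avg + (1 - spider_share) * human_avg
    for spider_avg."""
    return (overall_avg - (1 - spider_share) * human_avg) / spider_share


# 21 pages/visit overall and a 34% spider share are from the reports;
# human_avg=3 is an assumed figure, not a measured one.
spider_avg = implied_spider_pages_per_visit(21, 0.34, 3)
print(round(spider_avg, 1))
```

Under that assumption the spiders would need to average roughly 56 pages per visit, which is plausible for crawlers and would account for the inflated overall figure.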

cgrantski




msg:3480416
 1:30 am on Oct 18, 2007 (gmt 0)

Yes, a high pageviews-per-visit ratio is a really good way to identify IP addresses that need to be filtered out. But if you make a WebTrends filter for those IPs, you'll still be charged for the act of filtering, because WT is, in a way, processing those lines (by applying filters).
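One way to spot those IPs outside WebTrends is to count log lines per client IP. This is a minimal sketch, assuming the server writes W3C extended logs with a `#Fields:` header that includes `c-ip`; the thread does not say which log format is in use, and the function names are hypothetical. It counts raw requests per IP rather than true pageviews per visit, so treat the result as a list of candidates to inspect, not a verdict.

```python
from collections import Counter


def requests_per_ip(lines):
    """Count log lines per client IP in a W3C extended log (assumed format)."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Header names the columns, e.g. "#Fields: date time c-ip ..."
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or "c-ip" not in fields:
            continue
        parts = line.split(" ")
        if len(parts) != len(fields):
            continue  # skip malformed lines
        counts[parts[fields.index("c-ip")]] += 1
    return counts


def suspects(counts, threshold):
    """Return IPs with more requests than the threshold, sorted."""
    return sorted(ip for ip, n in counts.items() if n > threshold)
```

The threshold is a judgment call; starting high and reviewing the top offenders by hand is probably safer than picking a number up front.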

If you can write a script that removes the log lines containing those IP addresses, or containing the spider strings in the User Agent field, you can save yourself a lot of pageview quota. There are a lot of ways to write such a script, and you can also use something like Log Parser to do it. The lines have to be removed before WT processes the logs. You may find that once you've identified the main culprits you won't have to keep modifying your script, at least not more than once or twice a year.
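A preprocessing script along those lines might look like the following. It is a sketch under assumptions: W3C extended logs with `c-ip` and `cs(User-Agent)` columns named in a `#Fields:` header, and a hypothetical `filter_log` helper. The IP list and user-agent substrings are things you would supply yourself.

```python
def filter_log(lines, bad_ips, bad_agent_substrings):
    """Yield log lines, dropping those from listed IPs or matching user agents.

    Assumes W3C extended format; directive lines (starting with '#') are kept.
    """
    fields = []
    lowered = [s.lower() for s in bad_agent_substrings]
    for line in lines:
        if line.startswith("#"):
            if line.startswith("#Fields:"):
                fields = line.split()[1:]
            yield line
            continue
        parts = line.split()
        # Map column names to values; leave malformed lines untouched.
        record = dict(zip(fields, parts)) if len(parts) == len(fields) else {}
        ip = record.get("c-ip", "")
        agent = record.get("cs(User-Agent)", "").lower()
        if ip in bad_ips or any(s in agent for s in lowered):
            continue  # drop before WebTrends ever sees the line
        yield line


# Example usage (file names are placeholders):
# with open("raw.log") as src, open("clean.log", "w") as dst:
#     dst.writelines(filter_log(src, {"10.0.0.1"}, ["bot", "spider"]))
```

Matching on lower-cased substrings keeps the list short, at the cost of occasionally catching a legitimate agent whose name happens to contain one of the strings.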

WebTrends has a file called browsers.ini that has a section containing the strings or IPs used in its existing spider filters, so that's a place to start for values to filter. But you'll definitely find others by looking at the Visitors table. The best way to use the Visitors table is in a profile that is set up to have IP/User Agent as its sessionizing method, because the Visitors table will then display the IP/UA (instead of the cookie value).

(I'm assuming you're using WT software and can actually preprocess the logs)

I am pretty sure this is accounting for the strange pattern you're seeing, and you'll probably be pretty happy with how many page views you save. I've seen it be as much as 50% for mid-size sites.

lozmatic




msg:3480939
 3:20 pm on Oct 18, 2007 (gmt 0)

cgrantski, thank you very much for your feedback. I can now go back to IT and suggest what you have mentioned.

I've just been to a presentation for WT's new Marketing Lab 2 suite and must say that both the Visitor Intelligence and Score modules look impressive. But we gotta get our core stats looking half decent before doing anything remotely more sophisticated.

cgrantski




msg:3481034
 4:37 pm on Oct 18, 2007 (gmt 0)

Yes, the new products are really a big jump up, very cool.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved