Forum Library, Charter, Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

Script for Scrubbing Log Files
Need script for scrubbing log files of spiders before WebTrends analysis

 3:29 pm on Apr 19, 2007 (gmt 0)

I need a script to scrub log files of spider user-agents. I don't have much scripting experience but have to write it myself or find one at no cost.

I am using WebTrends on an enormous dynamic site and the spider traffic is still unbearable despite efforts with robots.txt. Since WebTrends charges based on page views, the spider traffic is really costing us.

WebTrends support (understandably) doesn't want to give me the info for free since they provide the service at a cost, but I was able to get this: The script must evaluate each line to see if it contains a certain value and delete that line if it does, and then go on to the next line. It is supposedly a very simple small script, but I have no idea how to write that.

Can anybody advise me on this?
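The line-by-line filter described in the question can be sketched in a few lines of Python. This is only a rough illustration, not WebTrends' own method: the marker list, the function names and the file names in the usage note are all assumptions, and the markers would need to be adjusted to the spiders that actually appear in your logs.

```python
# Hypothetical substrings to look for; extend this to match your own logs.
SPIDER_MARKERS = ("googlebot", "slurp", "msnbot", "crawler", "spider")


def is_spider_line(line, markers=SPIDER_MARKERS):
    """Return True if the log line contains any marker (case-insensitive)."""
    lowered = line.lower()
    return any(marker in lowered for marker in markers)


def scrub(lines, markers=SPIDER_MARKERS):
    """Yield only the lines that do not look like spider traffic."""
    for line in lines:
        if not is_spider_line(line, markers):
            yield line


def scrub_file(src_path, dst_path, markers=SPIDER_MARKERS):
    """Copy src_path to dst_path, skipping lines that match a marker."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        # Lines are streamed one at a time, so large logs are not loaded
        # into memory all at once.
        dst.writelines(scrub(src))
```

Calling something like `scrub_file("raw.log", "clean.log")` would write the scrubbed copy. Note that this checks the whole line, so a page URL that happens to contain a marker such as "spider" would be dropped too.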



 3:32 pm on Apr 19, 2007 (gmt 0)

find a database of spider ips then

read line from logs
compare ip with db
if no match then write to file to send to webtrends
if match then read next log line

if this is a huge file then it could take quite a while to finish
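The steps above could look roughly like this in Python. It is a sketch under a few assumptions: the spider database is a plain text file with one IP address per line, the client IP is a whitespace-separated field in each log line, and all function and parameter names here are made up for illustration. Loading the IPs into a set keeps each comparison cheap, which helps with the run time on a huge file.

```python
def load_spider_ips(path):
    """Read one IP address per line, ignoring blank lines and '#' comments."""
    ips = set()
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            entry = raw.strip()
            if entry and not entry.startswith("#"):
                ips.add(entry)
    return ips


def client_ip(line, ip_field=0):
    """Return the whitespace-separated field at ip_field, or None if absent."""
    fields = line.split()
    return fields[ip_field] if len(fields) > ip_field else None


def filter_by_ip(lines, spider_ips, ip_field=0):
    """Yield log lines whose client IP is not in the spider set."""
    for line in lines:
        if client_ip(line, ip_field) not in spider_ips:
            yield line  # no match: keep the line for WebTrends
        # match: skip it and read the next log line
```

The default `ip_field=0` fits Apache-style logs where the IP comes first; for other layouts the index would have to be changed to wherever the client IP sits.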

we used to scrub image requests et al too (not sure if that matters for you or not), for an example of the apache conf we used see here



 4:10 pm on Apr 19, 2007 (gmt 0)

Thanks jatar_k. This gives me a better idea of the direction I need to go in. I will look around for that spider database.

I don't think images are a problem because they don't count as page views in WebTrends.

I'm just hoping there is a free script out there somewhere, or that someone knows where to look for something similar that I could use as an example. Please forgive my inexperience; I'm not sure how to write the code or what it would be written in, as I'm very new to this. I've written some VBScript and JavaScript for some ASP Web applications and that's about the extent of my experience. Even that was a few years ago, and I now mostly do front-end Web and analytics.


 9:14 am on Apr 24, 2007 (gmt 0)

There is a free tool that can do this.
"Logparser 2.2" can be downloaded from Microsoft. You can use a SQL-like language to select, filter and output data from IIS log files. It is really amazing.
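For comparison, the same select-and-filter idea can be sketched in Python for IIS logs in the W3C extended format. This is not Log Parser itself; it assumes the log carries a `#Fields:` header that lists `cs(User-Agent)`, and the marker list and function name are hypothetical. Because IIS writes spaces inside the user-agent as `+`, splitting each line on whitespace keeps the fields aligned.

```python
USER_AGENT_FIELD = "cs(User-Agent)"
# Hypothetical markers; adjust to the spiders seen in your own logs.
SPIDER_MARKERS = ("googlebot", "slurp", "msnbot")


def filter_w3c(lines, markers=SPIDER_MARKERS):
    """Yield W3C log lines whose user-agent field matches no marker."""
    ua_index = None
    for line in lines:
        if line.startswith("#"):
            # Header lines are kept; '#Fields:' tells us the column order.
            if line.startswith("#Fields:"):
                names = line.split()[1:]
                ua_index = (names.index(USER_AGENT_FIELD)
                            if USER_AGENT_FIELD in names else None)
            yield line
            continue
        fields = line.split()
        if ua_index is not None and ua_index < len(fields):
            agent = fields[ua_index].lower()
            if any(marker in agent for marker in markers):
                continue  # spider hit: drop the line
        yield line
```

Checking only the user-agent column avoids dropping ordinary visitors whose requested URL happens to contain one of the markers.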



 4:36 pm on Apr 24, 2007 (gmt 0)


Thank you! I will definitely check it out.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved