I need a script to scrub log files of spider user-agents. I don't have much scripting experience, but I have to either write it myself or find one at no cost.
I am using WebTrends on an enormous dynamic site and the spider traffic is still unbearable despite efforts with robots.txt. Since WebTrends charges based on page views, the spider traffic is really costing us.
WebTrends support (understandably) doesn't want to give me the info for free since they provide the service at a cost, but I was able to get this much: the script must evaluate each line to see if it contains a certain value, delete the line if it does, and then go on to the next line. It is supposedly a very simple, small script, but I have no idea how to write it.
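That line-by-line filter can be sketched in a few lines of Python. This is only a minimal example: the `SPIDER_SIGNATURES` list below is a placeholder with a few well-known bot names, and the file names are assumptions; you'd fill the list from a spider/user-agent database and point it at your actual log files.

```python
# Minimal sketch: copy a log file, dropping any line that contains
# a known spider user-agent substring.
# SPIDER_SIGNATURES is an example list -- extend it from a bot database.
SPIDER_SIGNATURES = ["Googlebot", "Slurp", "bingbot", "msnbot"]

def scrub_log(in_path, out_path):
    kept = removed = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            # Delete the line (skip writing it) if any signature matches
            if any(sig.lower() in line.lower() for sig in SPIDER_SIGNATURES):
                removed += 1
            else:
                dst.write(line)
                kept += 1
    return kept, removed

if __name__ == "__main__":
    # Hypothetical file names -- substitute your real log paths
    kept, removed = scrub_log("access.log", "access_clean.log")
    print("kept %d lines, removed %d" % (kept, removed))
```

Writing to a second file rather than editing in place keeps the original log intact in case the signature list is too aggressive.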
Thanks jatar_k. This gives me a better idea of the direction I need to go in. I will look around for that spider database.
I don't think images are a problem, because WebTrends doesn't count them toward page views.