I am using WebTrends on an enormous dynamic site and the spider traffic is still unbearable despite efforts with robots.txt. Since WebTrends charges based on page views, the spider traffic is really costing us.
WebTrends support (understandably) doesn't want to give me the details for free, since they provide that service at a cost, but I was able to get this much: the script must check each line of the log for a certain value, delete the line if it contains that value, and then move on to the next line. It is supposedly a very simple, small script, but I have no idea how to write it.
Can anybody advise me on this?
read a line from the logs
compare its ip with the db of spider ips
if no match, write the line to the file you send to webtrends
if match, skip it and read the next log line
if this is a huge file it could take quite a while to finish
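A rough Python sketch of the steps above, in case it helps. It assumes the client IP is the first whitespace-separated field on each line (as in the Apache common/combined log format) and that the spider IPs are kept in a plain text file, one per line. The names load_blocked_ips and filter_log and the file arguments are made up for illustration, so adjust them to your own log layout.

```python
import sys


def load_blocked_ips(path):
    """Read one IP per line into a set; blank lines and # comments are skipped."""
    ips = set()
    with open(path) as fh:
        for line in fh:
            entry = line.strip()
            if entry and not entry.startswith("#"):
                ips.add(entry)
    return ips


def filter_log(lines, blocked_ips):
    """Yield only the log lines whose first field is not a blocked IP."""
    for line in lines:
        fields = line.split(None, 1)
        # Assumption: the client IP is the first field on the line.
        if fields and fields[0] in blocked_ips:
            continue  # spider hit: drop it and go on to the next line
        yield line


def main(log_path, ip_path, out_path):
    blocked = load_blocked_ips(ip_path)
    # The log is streamed line by line, so a huge file is never held in memory.
    with open(log_path, errors="replace") as src, open(out_path, "w") as dst:
        dst.writelines(filter_log(src, blocked))


# Hypothetical usage: python scrub.py access.log spider_ips.txt cleaned.log
if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv[1], sys.argv[2], sys.argv[3])
```

Keeping the IPs in a set makes each lookup roughly constant time, so the run time grows with the size of the log rather than with the size of the IP list. On a very large log it will still take a while, but it is a single pass.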
we used to scrub image requests etc. too (not sure if that matters for you or not), for an example of the apache conf we used see here
I don't think images are a problem, because they don't count toward page views in WebTrends.