svrart - 12:08 am on May 1, 2013 (gmt 0)
Not sure if this is the right place, but here goes:
I am looking at my raw access logs and seeing that most of the "visits" are by bots. I want to filter such activity out to get a true sense of human visitors.
Filters for the biggies like Googlebot, Bing, etc. are easy. However, there are numerous other small ones that come and go. One thing I have noticed is that these small bots download only the landing page text and none of the related graphics.
So, I want to write a small PHP script on my pages that looks at the GET request and, if it requests only text, excludes it from my stats. How does one identify such text-only requests? I know of variables such as $_SERVER['REMOTE_ADDR'] but don't know how to get the text-only part.
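For what it's worth, the script that serves a page only sees the request for that page. Each graphic is fetched by its own separate GET request, so "text only" can't be read off a single request. Below is a minimal sketch of the same idea applied to the raw access log instead: tally page hits and asset hits per IP and flag the IPs that never fetched an asset. The function name textOnlyClients, the list of asset extensions, and the Apache-style common/combined log format are all assumptions for illustration, not anything from my actual setup.

```php
<?php
// Hypothetical helper: returns the client IPs that requested pages but
// never requested a static asset (image, CSS, JS).
// Assumes Apache-style "common" or "combined" log lines.
function textOnlyClients(array $logLines): array
{
    // Assumed list of asset extensions; adjust to the site's own files.
    $assetPattern = '/\.(?:gif|jpe?g|png|ico|css|js)(?:\?.*)?$/i';
    $counts = [];

    foreach ($logLines as $line) {
        // Capture the client IP and the requested path.
        if (!preg_match('/^(\S+) \S+ \S+ \[[^\]]+\] "[A-Z]+ (\S+)/', $line, $m)) {
            continue; // skip lines that do not match the assumed format
        }
        [, $ip, $path] = $m;
        $counts[$ip] ??= ['pages' => 0, 'assets' => 0];
        $kind = preg_match($assetPattern, $path) ? 'assets' : 'pages';
        $counts[$ip][$kind]++;
    }

    // Keep only the IPs with at least one page hit and no asset hits.
    $textOnly = [];
    foreach ($counts as $ip => $c) {
        if ($c['pages'] > 0 && $c['assets'] === 0) {
            $textOnly[] = $ip;
        }
    }
    return $textOnly;
}

// Made-up sample lines using documentation-range IPs.
$sample = [
    '203.0.113.5 - - [01/May/2013:00:01:00 +0000] "GET /index.php HTTP/1.1" 200 5120',
    '203.0.113.5 - - [01/May/2013:00:01:01 +0000] "GET /images/logo.png HTTP/1.1" 200 2048',
    '198.51.100.7 - - [01/May/2013:00:02:00 +0000] "GET /index.php HTTP/1.1" 200 5120',
];
print_r(textOnlyClients($sample));
```

This is only a heuristic: a returning visitor whose browser has the graphics cached, or someone browsing with images turned off, would be flagged the same way, so the result is better treated as a list of suspects than as a definitive bot list.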