Forum Moderators: DixonJones
How do you guys separate the spiders from the human users?
I really haven't got time to do a WHOIS on each IP to see where it originates. I assume the only way is to take an 'average spider entries per day' figure and subtract it from the overall visits. But is there really an average? I don't think so; it fluctuates too much.
How do you do it?
Cheers
Googly
However, if you are counting daily uniques and have fewer than 1,000 a day, you may find that even 50 of these are spiders that don't really read the page. It's not only SE bots, but email address scrapers and heaps of other automated bots, increasing all the time. Certainly you should count only those visits that read the whole page, not just the headers.
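One way to count only full-page reads is to check the request method, status, and byte count in the access log. Here's a minimal sketch, assuming the standard Apache/nginx "combined" log format (adjust the regex if your server logs differently):

```python
import re

# Apache/nginx "combined" log format -- an assumption; adjust the
# regex to match your own server's log format.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def is_full_page_read(line):
    """True only for GET requests that returned a 200 with a body,
    i.e. the whole page was actually sent. HEAD requests and
    zero-byte responses count as header-only hits."""
    m = LOG_RE.match(line)
    if not m:
        return False
    if m.group("method") != "GET":
        return False
    if m.group("status") != "200":
        return False
    return m.group("bytes") not in ("-", "0")

lines = [
    '1.2.3.4 - - [01/Jan/2004:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '5.6.7.8 - - [01/Jan/2004:10:00:05 +0000] "HEAD /index.html HTTP/1.1" 200 0',
]
full_reads = [l for l in lines if is_full_page_read(l)]
# Only the first line survives: the HEAD request never read the page body.
```

This won't catch bots that issue normal GET requests, but it does strip out the header-only hits.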
This is one reason why page views can be such a rubbish statistic: our daily crawls from Googlebot, news services accessing our RSS files and scraping content, and the like sometimes account for 10% or more of our daily "page views", so we do filter them out.
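For the honest bots, the simplest filter is a substring match on the User-Agent header. A sketch along those lines (the signature list here is illustrative, not exhaustive; build yours from what actually shows up in your own logs):

```python
# Hypothetical list of user-agent substrings to treat as bots --
# extend it from your own logs; new bots appear all the time.
BOT_SIGNATURES = ("googlebot", "slurp", "msnbot", "rss", "crawler", "spider")

def looks_like_bot(user_agent):
    """Crude substring match on the User-Agent header. Bots that fake
    a browser UA will slip through; this only removes the honest ones."""
    ua = user_agent.lower()
    return any(sig in ua for sig in BOT_SIGNATURES)

hits = [
    ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", "/index.html"),
    ("Googlebot/2.1 (+http://www.google.com/bot.html)", "/index.html"),
    ("NewsRSSReader/1.0", "/rss.xml"),
]
human_page_views = [path for ua, path in hits if not looks_like_bot(ua)]
# Only the MSIE hit is kept; the crawler and RSS reader are dropped.
```

Scrapers that spoof a browser user agent will still get through, so this is a lower bound on bot traffic, not a complete answer.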