
Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

Human visits and robots visits
How to differentiate them?

 1:09 pm on Sep 20, 2004 (gmt 0)

Perhaps this is a very basic question, but how can I tell them apart without using JavaScript tracking GIFs or lists of known robots?

I don't know if there is any parameter, other than the user agent, that can be tied to a human visit. It is impossible to keep an up-to-date list of browser user agents or robot user agents (search engine bots, site downloaders, email harvesters, etc.).

Is there any trick to recognize human visits? Perhaps calculating the time spent on a page?

Any input will help.





 1:26 pm on Sep 20, 2004 (gmt 0)


What I do is three-fold: first I check for visitors that request 'robots.txt'; next I check the user agent (browser) for telltales such as 'bot' or 'spider' or 'crawler', etc; finally, I look for a very fast series of requests for pages.
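The first two checks could be sketched roughly as follows. This is only an illustration, not the poster's actual code: the function name `looks_like_robot`, the `(path, user_agent)` tuple layout and the keyword list are all assumptions made for the example.

```python
import re

# Hypothetical keyword list taken from the telltales mentioned above;
# extend it to suit your own logs.
BOT_KEYWORDS = re.compile(r"bot|spider|crawler", re.IGNORECASE)

def looks_like_robot(requests):
    """Return True if a visitor shows either of the two telltales.

    requests: list of (path, user_agent) tuples for one visitor
    (an assumed layout; adapt it to your log format).
    """
    for path, user_agent in requests:
        # Telltale 1: the visitor asked for robots.txt.
        if path == "/robots.txt":
            return True
        # Telltale 2: the user agent contains a robot-like keyword.
        if BOT_KEYWORDS.search(user_agent or ""):
            return True
    return False
```

A plain substring match like this will also catch any browser whose user agent happens to contain "bot", so it is best treated as a hint rather than proof.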

Hope that helps,



 2:09 pm on Sep 20, 2004 (gmt 0)

Thanks Larry,

Looking for a fast series of pages could be a good one.

Perhaps checking if an IP requested more than 30 or 40 pages in one session could be another way?

So far the ways of recognizing non-human visits are:

1) If the IP requests robots.txt
2) If the user agent contains certain keywords
3) The speed of requests
4) The quantity of requests in a single session
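Criteria 3 and 4 could be sketched like this, assuming the request times for one IP's session have already been pulled from the log. The name `session_flags` and the 2-second average gap are made up for the example; the 40-page limit is just the upper figure suggested above.

```python
def session_flags(timestamps, max_pages=40, min_avg_gap=2.0):
    """Flag a session that is suspiciously long or suspiciously fast.

    timestamps: sorted request times, in seconds, for one session.
    max_pages and min_avg_gap are illustrative thresholds, not tested values.
    """
    flags = set()
    # Criterion 4: too many pages in a single session.
    if len(timestamps) > max_pages:
        flags.add("too_many_pages")
    # Criterion 3: average gap between consecutive requests is too short.
    if len(timestamps) >= 2:
        avg_gap = (timestamps[-1] - timestamps[0]) / (len(timestamps) - 1)
        if avg_gap < min_avg_gap:
            flags.add("too_fast")
    return flags
```

Both thresholds would need tuning against real traffic, since a fast human reader with a small site can easily trip either one.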

Any others?



 2:15 pm on Sep 20, 2004 (gmt 0)

The regularity of requests. If it's the same length of time between requests, it's a bot, even if it didn't grab many pages.

Also look at whether it's grabbing images, includes, etc.
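One possible way to measure regularity is to compare the spread of the gaps between requests with their average. The sketch below uses the coefficient of variation for that; the function name `is_regular` and both thresholds are assumptions for illustration only.

```python
from statistics import mean, pstdev

def is_regular(timestamps, max_cv=0.1, min_requests=4):
    """Return True if the gaps between requests are nearly identical.

    timestamps: sorted request times, in seconds, for one visitor.
    max_cv: largest allowed ratio of standard deviation to mean gap
            (an illustrative value, not a tested one).
    """
    if len(timestamps) < min_requests:
        return False  # too few requests to judge
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    avg = mean(gaps)
    if avg <= 0:
        return True  # every request arrived at the same instant
    return pstdev(gaps) / avg <= max_cv
```

A robot that adds a random delay between requests would slip past this check, so it complements the other criteria rather than replacing them.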


 2:30 pm on Sep 20, 2004 (gmt 0)

I recently stumbled across a nice little program by the name of WhosOn. It is meant to give you a real-time snapshot of who is online, and it can also issue warnings for 404s and other events.

This program seems to use exactly the techniques mentioned above. It identifies spider visits first by a list of known spider user agents, but then it has some heuristics which try to detect previously unknown spiders. A dead giveaway is a request for robots.txt (even though I myself sometimes request it). Not accepting cookies also seems to be typical bot behaviour.


 4:45 pm on Sep 20, 2004 (gmt 0)

Refusing cookies is another telltale sign of robots.
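A rough way to test for this, assuming the server sets a cookie on its first response and records whether each later request sent it back, might look like the following. The function name `refuses_cookies` and the list-of-booleans input are hypothetical.

```python
def refuses_cookies(cookie_seen):
    """Guess whether a visitor is refusing cookies.

    cookie_seen: ordered list of booleans, one per request, True if the
    request carried our cookie. Assumes the cookie is set on the first
    response, so the first request never has it.
    Returns None when there is not enough data to tell.
    """
    later_requests = cookie_seen[1:]
    if not later_requests:
        return None  # a single request tells us nothing
    # If no later request ever returned the cookie, it was refused.
    return not any(later_requests)
```

Human visitors who have cookies switched off will look the same as robots here, so this is one more signal to weigh rather than a verdict.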


 5:59 pm on Sep 20, 2004 (gmt 0)

That one is good, but is there any way to check if a robot is refusing cookies other than using cookies? I suppose not.

Do site downloaders (like teleportpro) refuse cookies?



 7:29 pm on Sep 20, 2004 (gmt 0)

Site downloaders accept cookies.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved