Forum Moderators: DixonJones


Differentiating human visits from robots'

Which is the most accurate way?

         

eflouret

6:58 pm on Sep 22, 2003 (gmt 0)

10+ Year Member



Hello,

Perhaps this is very basic and has been discussed before, but I couldn't find it here after several searches.

I have several database-generated sites with several thousand pages each. At this point, visits from web bots make up a very significant share of traffic, so I cannot get accurate page view stats.

I want to create a simple JavaScript and PHP or Perl counter that writes page views to a text file. It would record only hits to a page (just the date and time of each hit), so I can query that text file for near-accurate page views (as far as I understand, there is no such thing as totally accurate page views).

Am I going in the wrong direction? Is there any log analyzer that can differentiate most web bots (including the AdSense Googlebot)? I use Funnellweb and I like it a lot, but its bot list is extremely small.

Any input will be helpful.

Enrique

jeremy goodrich

12:09 am on Sep 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Anything that uses a JavaScript method to log a visit will only track visitors whose browsers execute the JavaScript, so this method will naturally miss some regular visitors, but it will not include bots.

Also, bots don't accept and send back cookies, so if you can set a cookie for each unique visitor and then read the cookie data on a later page view, you'll be able to tell 'people' traffic from bots more accurately that way, too.
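To make the cookie idea concrete, here is a minimal sketch in Python (the thread itself mentions PHP or Perl, so treat the language as a stand-in). It assumes you have the raw Cookie header from each request and that you set a cookie named visitor_id on an earlier page; that name and both function names are made up for illustration.

```python
from http.cookies import SimpleCookie

# Hypothetical cookie name; use whatever name your site sets.
COOKIE_NAME = "visitor_id"


def has_visitor_cookie(cookie_header: str) -> bool:
    """Return True if the request's Cookie header carries our visitor cookie."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)  # an empty header simply yields no cookies
    return COOKIE_NAME in cookie


def count_cookied_views(cookie_headers) -> int:
    """Count page views whose request sent the visitor cookie back.

    cookie_headers is an iterable of raw Cookie header strings,
    with an empty string for requests that sent no cookie.
    """
    return sum(1 for header in cookie_headers if has_visitor_cookie(header))
```

Note that a real visitor's first page view arrives before any cookie has been set, so a count like this leaves out landing pages as well as visitors who refuse cookies.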

cfx211

4:58 pm on Sep 23, 2003 (gmt 0)

10+ Year Member



Why don't you just exclude known spiders when you query the text file? All reputable spiders include their name in the user agent field, so if you log the user agent with each hit you can just add user_agent not in (list of all spider user agents). There are lists of known spider user agents out there that you can get hold of fairly easily.
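A rough sketch of that filter in Python, assuming the text file stores one hit per line as a timestamp, a tab, then the user agent. The file layout, the function names, and the very short marker list are all illustrative assumptions; a real spider list would be much longer.

```python
# Illustrative, deliberately short list of substrings found in spider
# user agents. A real list would be much longer and kept up to date.
KNOWN_SPIDER_MARKERS = ("googlebot", "mediapartners-google", "slurp", "msnbot")


def is_known_spider(user_agent: str) -> bool:
    """Return True if the user agent contains any known spider marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in KNOWN_SPIDER_MARKERS)


def count_human_hits(log_lines) -> int:
    """Count hits whose user agent matches no known spider marker.

    Assumes each line looks like 'timestamp<TAB>user agent'.
    """
    count = 0
    for line in log_lines:
        parts = line.rstrip("\n").split("\t", 1)
        if len(parts) != 2:
            continue  # skip malformed lines
        _timestamp, user_agent = parts
        if not is_known_spider(user_agent):
            count += 1
    return count
```

Substring matching is used here instead of an exact "not in" comparison because spider user agents usually carry version numbers and URLs that change over time. This only catches spiders that identify themselves.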

eflouret

6:21 pm on Sep 23, 2003 (gmt 0)

10+ Year Member



Thanks everybody for your replies.

The cookie method is interesting, but which is better?

Some users choose to disable JavaScript, but some also choose to disable cookies.

Which group is larger?

Also, do I have to put a privacy statement on my site if I place cookies on my visitors' machines?