detect bots

stevelibby

11:52 pm on Feb 19, 2006 (gmt 0)

10+ Year Member



Hi,
I have created a simple click tracker on my site. Among other info, it picks up Path_info, which is supposed to tell me which page the visitor came from. Most of the time it works fine. However, on an average of 50/50 it does come through as blank. I cannot understand why this happens, as the referral is not a search engine but a page on my own site.
Could it be crawling bots? Also, how can I get the bot info by collecting server variables?

ronburk

12:54 am on Feb 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



however on an average of 50/50 it does come through as blank

Assuming that by "Path_info" you mean the HTTP "Referer:" field, that field is set by client agents (e.g., browsers), if they feel like it. Non-IE browsers tend to give users the option of setting that field to nothing, to garbage, and to other things.

Also, a variety of different security packages may offer to strip that field for you, which will work even for the user that uses IE.

Finally, firewall or HTTP proxy software can likewise strip that field in the name of security. Some companies do this to enforce policies on all their employee browsing.

Like most web statistics, you should view the Referer field as a sampling of reality, not a 100% reflection of it.

also how can i get the bot info by collecting server variables?

Nice bots will set the user agent field (or the referer field, or some combination!) to something useful like "MyDumbBot: see [random.guy.com...] if you have questions." Nice bots will also hit "robots.txt" before traversing your site, and will obey its directives.
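
As a rough sketch of checking for self-identifying bots: in a CGI-style setup the user agent normally arrives in the HTTP_USER_AGENT server variable. The function name and the list of hint words below are my own assumptions, not a standard list, so build your own from what actually shows up in your logs.

```python
import re

# Hypothetical hint words that self-identifying crawlers often put in
# their User-Agent string; extend this from your own logs.
BOT_HINTS = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)

def looks_like_declared_bot(user_agent):
    """Return True if the User-Agent announces itself as a crawler."""
    if not user_agent:
        # Real browsers almost always send a User-Agent, so a missing
        # one is treated as suspicious here (an assumption of this sketch).
        return True
    return bool(BOT_HINTS.search(user_agent))
```

This only catches bots that choose to identify themselves; anything that copies a browser's user agent string will pass straight through.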

Not-so-nice bots are not so easy to detect. But usually they don't work very hard at being stealthy and don't have a lot of IP addresses to play with, so you can just watch for any IP address that fetches an unusually large number of pages and then eyeball it to see if it looks like a bot.
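
A minimal sketch of the "watch for busy IP addresses" idea, assuming you have already pulled the client IP of each request out of your log or tracker table. The function name and the threshold of 200 are made up for illustration; pick a number that fits your own traffic.

```python
from collections import Counter

def heavy_hitters(client_ips, threshold=200):
    """Return (ip, hits) pairs for addresses with more than
    `threshold` requests, busiest first."""
    counts = Counter(client_ips)
    return [(ip, n) for ip, n in counts.most_common() if n > threshold]
```

The output is only a shortlist to eyeball. A busy proxy serving many real users can look just like a bot by this measure.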

One can also detect most not-so-nice bots by putting in a "hidden" hotlink (i.e., one not very visible to human readers, like a hot-linked, 1-pixel white spot) to a special page that no humans would ordinarily find.
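
The trap page idea could look something like this on the server side. This is only a sketch: the trap URL and the function name are hypothetical, and a real tracker would keep the flagged addresses in a file or database rather than in memory.

```python
# Hypothetical URL that is linked only from the hidden hotlink.
TRAP_PATH = "/trap-page.html"

# IP addresses that have requested the trap page at least once.
flagged_ips = set()

def record_request(path, client_ip):
    """Flag the client if it fetched the trap page.

    Returns True if this IP is flagged (now or from an earlier hit)."""
    if path == TRAP_PATH:
        flagged_ips.add(client_ip)
    return client_ip in flagged_ips
```

Listing the trap page as disallowed in robots.txt should keep the nice bots out of it, so whatever still turns up there is either ignoring robots.txt or is a very curious human.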

neonpie

9:44 pm on Feb 25, 2006 (gmt 0)

10+ Year Member



As ronburk mentioned, if you are using the HTTP Referer in PHP and you are getting a blank result, this could often also be a direct page access, so the visitor did not come from anywhere else. I created a stats program before, and if the HTTP referer = '' then it would echo "direct page access".
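
Here is roughly the same check sketched in Python rather than PHP. The function name, the labels and the www.example.com host are placeholders of mine. Since a blank referer can mean either a direct visit or a field that was stripped along the way, the label says so.

```python
from urllib.parse import urlparse

def classify_referer(referer, own_host="www.example.com"):
    """Label a hit by its Referer value (own_host is a placeholder)."""
    if not referer or not referer.strip():
        # Blank: typed-in URL, bookmark, or a browser/proxy that stripped it.
        return "direct or stripped"
    host = urlparse(referer).hostname
    if host is None:
        # Present but not a usable URL (clients can send anything here).
        return "unrecognized"
    if host == own_host:
        return "internal"
    return "external"
```

In a CGI-style environment the value would usually come from the HTTP_REFERER server variable, e.g. os.environ.get("HTTP_REFERER", "").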

Hope that helps.