Please help with this puzzle:
I wish to create four classes of access:
- bot
- visitor, ie non-member
- member
- administrator
But visitors and bots don't log in, so my system can't differentiate between them. As a result, I am currently giving all visitors (including bots) full viewing access.
That's a problem for me.
Instead, I wish to differentiate between visitors (eg MSIE users) and bots when they request the first page, so that the system can then manage the range of pages and level of access each one gets.
How do I do it? How can I tell the difference between them?
(My web site <snip> is driven by PHP and Apache on Linux)
Look forward to your replies
Thanks
GrahamB
[edited by: volatilegx at 3:25 am (utc) on Jan. 22, 2006]
[edit reason] no URLs please [/edit]
Bad bots (those that do not obey your robots.txt directives), IP ranges, downloading tools and other user agents can be effectively controlled using mod_rewrite and mod_access in your .htaccess file. For more info, see the Apache Web Server forum [webmasterworld.com].
You say:
- limit bot access in your robots.txt file
- deny access to deep pages, image files, anything you don't want bots to get
- see the Robots.txt forum
- Bad bots . . can be effectively controlled using mod_rewrite and mod_access in your .htaccess file.
- see the Apache Web Server forum.
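As a rough illustration of the robots.txt items in that list, here is a minimal sketch in Python (used only for illustration; the site in question runs PHP) showing how a well-behaved bot would interpret a couple of Disallow rules. The rules, the paths /images/ and /members/, the bot name and the helper bot_may_fetch are all hypothetical placeholders, not taken from this thread. Bots that ignore robots.txt are not affected by any of this and would still need the .htaccess approach.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the paths are placeholders, not from the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /members/
"""


def bot_may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a robots.txt-obeying bot would be allowed to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.php", "/images/logo.png", "/members/list.php"):
        url = "http://www.example.com" + path
        print(path, bot_may_fetch("ExampleBot", url))
```

Checking the rules this way before uploading them can catch a Disallow line that blocks more, or less, than intended.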
Thanks
GrahamB
About 90% of the hits on my sites are from bots, so to save processing time I need a quick way to distinguish bots from humans. My main (and very quick) check is to look for a Referer (yes, one "r" in the middle) header.
No spider that I care about or that is responsible for much traffic for my sites sets it, and although some users turn off Referer for security, AND it won't be present on a type-in or bookmark hit, it IS present for me on almost all real human visits.
I simply make sure that the page served without a Referer is still useful and semantically identical (so it is not black-hat "cloaking"), just a little faster to load, which is good for a first page (eg a type-in) anyway. For example, I omit a background image and show a cheaper-to-compute set of related pages.
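A minimal sketch of that Referer check follows, written in Python purely for illustration (in PHP the same header would normally be read from $_SERVER['HTTP_REFERER']). The function names and the "full" / "light" labels are hypothetical, and the headers are assumed to arrive as a plain dictionary.

```python
def has_referer(headers):
    """Return True if the request carries a non-empty Referer header."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    return bool((normalised.get("referer") or "").strip())


def choose_page_variant(headers):
    """Pick which rendering of the same page to serve.

    "full"  - Referer present, so probably a human following a link.
    "light" - no Referer: a bot, a type-in, a bookmark, or a user who has
              switched Referer off. Same content, cheaper to build.
    """
    return "full" if has_referer(headers) else "light"


if __name__ == "__main__":
    print(choose_page_variant({"Referer": "http://www.example.com/"}))
    print(choose_page_variant({"User-Agent": "ExampleBot/1.0"}))
```

Because a missing Referer only suggests a bot and does not prove it, this is suited to choosing between two equivalent renderings, not to granting or denying access.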
Rgds
Damon
Spider Trap Msg#2
[webmasterworld.com...]
Updated PHP Bot script
[webmasterworld.com...]
Blocking Badly Behaved bots #3
[webmasterworld.com...]