Forum Moderators: DixonJones
87117
16 Mar, 07:40:53, 84-12-#*$!-184.dyn.example.co.uk, 282 pages
Windows 2003, Explorer 6.0
87119
16 Mar, 07:40:53, 84-12-#*$!-194.dyn.example.co.uk, 230 pages
Windows 2003, Explorer 6.0
87116
16 Mar, 07:40:52, host-84-9-XX-50.example.com, 262 pages
Windows 2003, Explorer 6.0
They use common user agents and come from the IP ranges that ISPs assign to personal (home) connections. The providers in question have either denied that the crawlers are using their service or said that there is nothing they can do about it.
Any ideas on how we can stop the offending crawlers?
[edited by: Receptional at 10:56 am (utc) on Mar. 16, 2007]
[edit reason] Anonymised /examplified the IPs [/edit]
I'm not sure how to go about it at present. I don't know whether something like what was suggested above can be done in Apache's httpd.conf, or whether it would have to be a PHP method.
The stats program that gave me the full details (bbclone) only records full details for the last 100 or so visitors. When I get in on Monday I'll take a full printout of which pages they accessed that day and for how long, so I can work out the pages-per-second block as Receptional suggested. :)
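For what it's worth, the pages-per-second figure could be worked out from a visitor's request timestamps roughly like this. It is only a sketch in Python for illustration (the same arithmetic would port to PHP); the function names, the sample data and the 1.0 pages/second threshold are all made up for the example and are not something bbclone provides.

```python
from datetime import datetime, timedelta


def pages_per_second(hit_times):
    """Average request rate for one visitor, given one datetime per page hit."""
    if len(hit_times) < 2:
        return 0.0  # a single hit says nothing about speed
    ordered = sorted(hit_times)
    span = (ordered[-1] - ordered[0]).total_seconds()
    # Treat anything under one second as one second to avoid dividing by zero.
    return len(ordered) / max(span, 1.0)


def looks_automated(hit_times, threshold=1.0):
    """Flag a visitor whose average rate exceeds the (assumed) threshold."""
    return pages_per_second(hit_times) > threshold


# Hypothetical data: 10 requests, one per second.
start = datetime(2007, 3, 16, 7, 40, 0)
burst = [start + timedelta(seconds=i) for i in range(10)]
print(pages_per_second(burst), looks_automated(burst))
```

The threshold would need tuning against real visitors, since a person opening several tabs at once can briefly look fast too.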
I just can't think of a way to determine that it's them, as they are using dynamic IPs.
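One possible way around the dynamic IPs, sketched below in Python, is to key visitors on the surrounding network range plus the user agent instead of the exact address. This assumes the crawler stays inside one provider range and keeps one user agent, which may not hold; the /24 prefix length, the function names and the sample addresses (taken from documentation-only ranges) are assumptions for illustration. Grouping this way will also lump in genuine visitors from the same range, so it is better suited to spotting patterns than to blocking outright.

```python
import ipaddress
from collections import defaultdict


def visitor_key(ip, user_agent, prefix_len=24):
    """Collapse an IPv4 address to its network range and pair it with the UA."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return (str(network), user_agent)


def group_hits(records, prefix_len=24):
    """Group (ip, user_agent, path) records by network range and user agent."""
    groups = defaultdict(list)
    for ip, user_agent, path in records:
        groups[visitor_key(ip, user_agent, prefix_len)].append(path)
    return dict(groups)


# Hypothetical records: two addresses in the same range, one elsewhere.
records = [
    ("192.0.2.184", "MSIE 6.0", "/page1"),
    ("192.0.2.194", "MSIE 6.0", "/page2"),
    ("198.51.100.50", "MSIE 6.0", "/page1"),
]
for key, paths in group_hits(records).items():
    print(key, len(paths))
```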
If I track their pages over a few days, I can see whether they are looking at the same pages daily. I'm not sure how to detect the frequency of page loads (with PHP or something in the Apache conf, preferably).
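Detecting page-load frequency usually comes down to a sliding-window counter per visitor. Here is a minimal sketch of that idea in Python, purely to show the logic; a PHP version would need to keep the timestamps somewhere shared between requests (a file, database or cache). The class name, the limits and the visitor key are hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_hits requests per visitor within window_seconds."""

    def __init__(self, max_hits=10, window_seconds=5.0):
        self.max_hits = max_hits
        self.window = window_seconds
        self.hits = defaultdict(deque)  # visitor key -> recent hit timestamps

    def allow(self, key, now):
        """Return True if this request is within the limit; now is in seconds."""
        recent = self.hits[key]
        # Discard timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False  # over the limit; the blocked request is not recorded
        recent.append(now)
        return True


# Hypothetical usage: 3 pages per second allowed for each visitor key.
limiter = SlidingWindowLimiter(max_hits=3, window_seconds=1.0)
for t in (0.0, 0.1, 0.2, 0.3, 1.5):
    print(t, limiter.allow("192.0.2.0/24", t))
```

In a live setup `now` would come from the server clock (for example `time.time()`), and a request that returns False could be answered with an error page or a delay rather than the real content.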