Date of Abuse       IP                                       Rate
------------------- ---------------------------------------- -----------------
2011-12-23 09:25:01 65.52.108.146  [forums.modem-help.co.uk] 12 pages / second
2011-12-23 01:05:57 157.55.16.219  [forums.modem-help.co.uk] 11 pages / second
2011-12-23 00:52:13 157.55.18.9    [forums.modem-help.co.uk] 11 pages / second
2011-12-22 18:18:59 207.46.13.212  [forums.modem-help.co.uk]  7 pages / second
2011-12-22 04:02:38 207.46.195.240 [forums.modem-help.co.uk]  7 pages / second
2011-12-22 00:18:28 65.52.110.200  [forums.modem-help.co.uk]  3 pages / second
2011-12-21 04:32:16 65.52.104.26   [forums.modem-help.co.uk]  9 pages / second
the odd Google IP & Yahoo! IP has occasionally got caught up in this net
2011-12-25 04:59:01 :: 66.249.71.26 [forums.modem-help.co.uk] :: max 3 / sec
Date of Abuse       IP                                       Rate
------------------- ---------------------------------------- -----------------
2011-12-25 16:12:09 65.52.109.194  [forums.modem-help.co.uk]  8 pages / second
2011-12-25 15:35:55 157.55.38.162  [forums.modem-help.co.uk]  4 pages / second
65.52.108.146 - - [25/Dec/2011:03:01:07 +0000] "GET /page.php HTTP/1.1" 403 546 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" In:833 Out:528:63pct. "-"
157.55.16.219 - - [23/Dec/2011:20:11:46 +0000] "GET /page.php HTTP/1.1" 403 546 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" In:833 Out:528:63pct. "-"
157.55.18.9 - - [25/Dec/2011:04:02:26 +0000] "GET /page.php HTTP/1.1" 403 544 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" In:829 Out:526:63pct. "-"
207.46.13.212 - - [25/Dec/2011:04:03:28 +0000] "GET /page.php HTTP/1.1" 403 546 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" In:833 Out:528:63pct. "-"
207.46.195.240 - - [22/Dec/2011:09:16:55 +0000] "GET /page.php HTTP/1.1" 403 547 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" In:835 Out:529:63pct. "-"
65.52.110.200 - - [22/Dec/2011:01:59:12 +0000] "GET /page.php HTTP/1.1" 403 1346 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" In:2749 Out:1328:48pct. "-"
65.52.104.26 - - [22/Dec/2011:16:04:29 +0000] "GET /page.php HTTP/1.1" 403 545 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" In:831 Out:527:63pct. "-"
65.52.109.194 - - [26/Dec/2011:21:25:23 +0000] "GET /page.php HTTP/1.1" 403 547 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" In:833 Out:529:63pct. "-"
157.55.38.162 - - [25/Dec/2011:15:35:55 +0000] "GET /page.php HTTP/1.1" 503 170 "-" "msnbot/0.01 (+http://search.msn.com/msnbot.htm)" In:- Out:-:-pct. "-"
egrep -c '^157.55.38.162' /var/log/httpd/access* | awk -F: '{SUM+=$2}END{print SUM}'
65.52.108.146 :: 8409
157.55.16.219 :: 17107
157.55.18.9 :: 5799
207.46.13.212 :: 9949
207.46.195.240 :: 5611
65.52.110.200 :: 2579
65.52.104.26 :: 6322
65.52.109.194 :: 2901
157.55.38.162 :: 125
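For anyone who would rather get the per-IP totals and the per-second peaks in one pass, here is a rough Python sketch. It assumes Apache combined-format lines like the ones quoted above (IP first, timestamp in square brackets); the function name `summarize` and the regex are my own illustration, not what produced the figures in this thread.

```python
import re
from collections import Counter

# Leading fields of a combined-format line: client IP, two ignored fields,
# then the [timestamp], which has one-second resolution.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')


def summarize(lines):
    """Return (total hits per IP, peak hits within any single second per IP)."""
    totals = Counter()
    per_second = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        ip, stamp = m.groups()
        totals[ip] += 1
        per_second[(ip, stamp)] += 1

    # Collapse the (ip, second) buckets down to the busiest second per IP.
    peaks = {}
    for (ip, _stamp), count in per_second.items():
        peaks[ip] = max(peaks.get(ip, 0), count)
    return totals, peaks
```

Pass it any iterable of log lines, for example an open file handle. Note that bucketing by timestamp string counts hits per calendar second, which can undercount a burst that straddles two seconds.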
2011-12-26 05:47:33 :: 207.46.13.144 [forums.modem-help.co.uk] :: max 12 pages / second
Have you used the Bing Webmaster Tools specifically designed to allow you to control the rate at which we crawl your website?
tangor, I do not think that you have thought this through. I've been stopping abuse from bots for several years now, and reporting it for 18 months. Here are some recent facts from the last year for you to consider:
2011-12-29 04:22:22 :: 65.52.109.152 [forums.modem-help.co.uk] :: max 5 pages / second
Most of my response was to your declaration of a '12 pages/sec' index being OK, and even apparently belittling the idea that to crawl at that rate may be abusive. I therefore thought that I should add a little more substance to my claims, and at the same time highlight an oncoming issue that I've seen little commented upon elsewhere.
The thought that all SEs will take your attitude as the green light to employ such behaviour worldwide makes me shudder.
you who are giving the SEs the green light by not having a 'crawl-rate' in your robots.txt
It [crawl-rate directive] was there for all my sites for many years. They ignored it. In the end, I gave up & removed it. No point in a directive that not a single SE--including MSN, who originated it--followed.
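For reference, the directive usually seen in robots.txt for this purpose is spelled `Crawl-delay`; I am assuming that is what 'crawl-rate' refers to here. It is a non-standard extension, so whether a given crawler honours it is up to the crawler. A small Python sketch of how a well-behaved client could read it, using the standard library's `urllib.robotparser` (the robots.txt content and the helper name `crawl_delay_for` are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask bingbot to wait 10 seconds between requests,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10

User-agent: *
Disallow:
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)
```

With the sample above, `crawl_delay_for(ROBOTS_TXT, "bingbot")` gives the 10-second delay, while a crawler that only matches the `*` entry gets `None`. Nothing here enforces the delay; it only shows what a compliant bot would read.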
if individual IPs are exceeding specified limits
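As a rough illustration of checking whether an individual IP is exceeding a specified limit, here is a sliding-window sketch in Python. This is not the blocking code behind the abuse reports in this thread, which is not shown; the class name `PerIpRateCheck` and its parameters are hypothetical.

```python
from collections import defaultdict, deque


class PerIpRateCheck:
    """Flag an IP once it makes more than max_requests within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def exceeds(self, ip, now):
        """Record a request from ip at time now; return True if over the limit."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)
        return len(hits) > self.max_requests
```

In use, the caller would pass something like `time.monotonic()` as `now` for each request and answer with a 403 or 503 when `exceeds` returns True. Keeping the clock as a parameter makes the check easy to replay against timestamps taken from old logs.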