MSNBot has become a constant Fast-Scraper


AlexK - 1:46 pm on Dec 27, 2011 (gmt 0)


Another day, and a 10th MSNBot IP is abusing my site:

2011-12-26 05:47:33 :: 207.46.13.144 [forums.modem-help.co.uk] :: max 12 pages / second


@Pfui:
Do not read too much into some IPs getting stopped with a 403 and others with a 503. In brief, it works like this:

There are two tests:
1) Fast scraper (>= 3 pages / sec)
2) Slow scraper (forums only; > 50 pages in 1 hour)

Only fast scrapers are reported.
Once spotted, a fast scraper is given a 403 block.
If a fast scrape continues for long enough, the slow-scrape routine catches it first and it gets a 503 instead.
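
For anyone who wants to try something similar, here is a rough Python sketch of the two tests described above. It is not the actual routine running on the site. The class name `ScraperCheck`, the `record` method, the sliding one-second and one-hour windows, and the order of the checks (slow test first) are all assumptions made for illustration; only the two thresholds and the 403/503 responses come from the description.

```python
from collections import deque

FAST_PAGES_PER_SEC = 3     # from the post: >= 3 pages / sec
SLOW_PAGES_PER_HOUR = 50   # from the post: > 50 pages in 1 hour (forums only)


class ScraperCheck:
    """Hypothetical per-IP tracker; an illustration, not the site's code."""

    def __init__(self):
        # Request timestamps (in seconds) seen during the last hour.
        self.hits = deque()

    def record(self, now, is_forum):
        """Record one page request; return 503, 403, or None (allow)."""
        self.hits.append(now)
        # Drop requests that are an hour old or older.
        while self.hits and now - self.hits[0] >= 3600:
            self.hits.popleft()
        # Slow-scrape test, forums only. Checked first (assumed order),
        # so a long-running fast scrape ends up with a 503.
        if is_forum and len(self.hits) > SLOW_PAGES_PER_HOUR:
            return 503
        # Fast-scrape test: count requests within the last second.
        recent = sum(1 for t in self.hits if now - t < 1)
        if recent >= FAST_PAGES_PER_SEC:
            return 403
        return None
```

One tracker would be kept per IP. Feeding it a forum scrape at 12 pages / second returns 403 from the third request onwards, then switches to 503 once more than 50 requests have landed inside the hour.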


Thread source:: http://www.webmasterworld.com/search_engine_spiders/4401159.htm