Given the low traffic, one "visitor" stands out as unusual.
This visitor identifies itself as:
"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)"
Each day it visits, starting with a "/" URI, then fetches another dozen or so pages at a rate of 3-4 per second, clearly faster than a legitimate visitor.
The IP address is different for each visit. The last three visits used the following IP addresses:
A check of the IP location etc. reveals the following:
22.214.171.124 - UNITED STATES - DATABILITY SOFTWARE SYSTEMS INC - DERU.NET
126.96.36.199 - FRANCE - OPEN WEB SOLUTIONS - SERVERS.OWS.FR
188.8.131.52 - KOREA TELECOM - KORNET.NET
Has anybody else seen this? It looks like the IP address and botname are false, so how can I block it?
Comments welcomed! If you need further information please ask.
I've seen spambots using MSIE's footprint but not your situation.
At 3-4 pages per second it is obviously a bot. Another way to tell, if they are pulling pages at a slower pace, is that they never fetch any of the graphics elements of your page, and in particular never grab your stylesheet. Bear in mind that Lynx and other text-only browsers won't fetch any of that stuff either.
If I didn't want to just blacklist their IPs I would write a script to detect '3rd page inside a second' *or* 'msie fetch 3rd page without fetching support files' and effectively 'greylist' the IP address of the visitor for about an hour.
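A rough sketch of that greylist idea, in Python rather than PHP just to show the logic (it should port easily). Everything named here is made up for illustration: the Greylist class, the ASSET_EXTENSIONS list, and the thresholds are assumptions, not a tested production script. It treats any request that isn't for a stylesheet, script, or image as a "page", and it only applies the no-support-files rule to user agents claiming to be MSIE, so Lynx is left alone. Be aware that a returning visitor with cached support files could trip the second rule, which is one reason to greylist for an hour rather than ban outright.

```python
import os
import time
from collections import defaultdict, deque

GREYLIST_SECONDS = 3600   # "about an hour"
RATE_WINDOW = 1.0         # seconds
PAGE_LIMIT = 3            # the 3rd page triggers either rule
# Assumed list of "support file" extensions; adjust for your site.
ASSET_EXTENSIONS = {".css", ".js", ".gif", ".jpg", ".jpeg", ".png", ".ico"}


class Greylist:
    def __init__(self):
        self.recent_pages = defaultdict(deque)        # ip -> recent page timestamps
        self.pages_without_assets = defaultdict(int)  # ip -> pages since last asset
        self.blocked_until = {}                       # ip -> greylist expiry time

    def is_blocked(self, ip, now):
        expiry = self.blocked_until.get(ip)
        if expiry is None:
            return False
        if now >= expiry:
            del self.blocked_until[ip]  # greylisting has expired
            return False
        return True

    def record(self, ip, path, user_agent, now=None):
        """Log one request; return True if it should be refused."""
        now = time.time() if now is None else now
        if self.is_blocked(ip, now):
            return True

        # Drop any query string, then look at the file extension.
        ext = os.path.splitext(path.split("?", 1)[0])[1].lower()
        if ext in ASSET_EXTENSIONS:
            self.pages_without_assets[ip] = 0  # behaves like a real browser
            return False

        hits = self.recent_pages[ip]
        hits.append(now)
        while hits and now - hits[0] >= RATE_WINDOW:
            hits.popleft()  # forget page hits older than the window
        self.pages_without_assets[ip] += 1

        too_fast = len(hits) >= PAGE_LIMIT
        no_assets = ("MSIE" in user_agent
                     and self.pages_without_assets[ip] >= PAGE_LIMIT)
        if too_fast or no_assets:
            self.blocked_until[ip] = now + GREYLIST_SECONDS
            hits.clear()
            self.pages_without_assets[ip] = 0
            return True
        return False
```

You would call record() once per request from whatever handles your logging, and send a 403 (or similar) whenever it returns True. State is kept in memory here, so a real script would need to store it in a file or database between requests.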
P.S. Just blacklist their IPs.
Another IP address used today:
184.108.40.206 - UNITED STATES - LIQUID WEB INC - LIQUIDWEB.COM
I hesitate to use blocking by IP, as it seems they have a range of IP addresses to use. Each day another one appears!
I liked your suggestion of time-based blocking - I'll look at writing a brief PHP script for it.
I'm still curious about who they are, and what they are doing. I'd be interested to know if anyone else has had a similar experience.
You expect referrals from other servers (and almost always by domain name, not IP address), but you don't expect page requests from other servers unless a site that links to yours is running a script to check the validity of their out-going links.
Other than that, you can safely block any IP address range that resolves back to "servers" or "hosting."
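As a sketch of how that check might be automated, here is a small Python example that does a reverse DNS lookup on the visiting IP and looks for those two words in the hostname. The function names and the HOSTING_KEYWORDS tuple are assumptions made up for this example, and a keyword match is only a hint: plenty of hosting ranges have no reverse DNS at all, or hostnames that give nothing away, so a whois lookup on the range is still worth doing before blocking.

```python
import socket

# Assumed keywords, taken from the two words quoted above.
HOSTING_KEYWORDS = ("server", "hosting")


def looks_like_hosting(hostname):
    """True if the hostname contains one of the hosting keywords."""
    name = hostname.lower()
    return any(keyword in name for keyword in HOSTING_KEYWORDS)


def reverse_lookup(ip):
    """Return the reverse-DNS hostname for ip, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def should_block(ip, lookup=reverse_lookup):
    """Block only when the IP resolves to a hosting-style hostname."""
    hostname = lookup(ip)
    return hostname is not None and looks_like_hosting(hostname)
```

For example, looks_like_hosting("servers.ows.fr") is true, while "kornet.net" would pass. The lookup argument is there so you can cache results or swap in your own resolver, since a live DNS lookup on every request would slow the site down.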
For a requests-per-second-based blocking script in PHP, see Blocking Badly Behaved Bots [webmasterworld.com] (third of three parts).
Searching Google for some of the addresses mentioned earlier (e.g. 220.127.116.11) turns up a post titled 'someone's scraping me'. That page lists numerous addresses that seem to have been taken over by malicious software that tries to inject URLs into people's logs/code.
It's possibly a game or an application that the same people have downloaded, which then infects their servers.
It seems quite a lot of people are seeing this but I haven't found an answer yet.
Blocking the IP addresses doesn't seem to be the answer as more and more PCs will become infected and the list would grow unmanageable.