| 8:45 pm on Dec 5, 2012 (gmt 0)|
Why do you think it appears legit? Anything Amazon should be blocked with prejudice! They host hundreds of nasty little me-too bots. There are at least a couple of threads hereabouts giving full IP ranges.
| 11:10 pm on Dec 5, 2012 (gmt 0)|
The Search utility at the top of the page is a great way to check whether a topic has been discussed before. Often, when members want to add more info to a previously discussed topic that has since been closed, they link to that specific discussion in their post.
Amazon EC2 and its ranges may well be the most discussed topic in the Search Engine Spider and User Agent Identification Forum.
| 1:13 am on Dec 6, 2012 (gmt 0)|
I've just this instant (really) come from updating my shared htaccess*, and would have been awfully annoyed if it turned out I'd missed one. Gosh, what a lot you can shave off the filesize by putting multiple Deny-froms in a single line, as in
Deny from 1.3 1.8 18.104.22.168/14 22.214.171.124/13 1.45 126.96.36.199/14
Deny from 188.8.131.52/14 184.108.40.206/13 220.127.116.11/14 18.104.22.168/12
Deny from 22.214.171.124/14 126.96.36.199/14 188.8.131.52/15 184.108.40.206/14 220.127.116.11/13
Deny from 18.104.22.168/15 22.214.171.124/14
(Not AWS, obviously, but same principle. Cut the size by 1/3, and I didn't even consolidate very much.)
* Multiple domains in the same userspace. You can't do everything in a shared htaccess-- and some things you wouldn't want to-- but consolidating the mod_auth_whatsit is a huge timesaver.
| 1:25 am on Dec 6, 2012 (gmt 0)|
When I consolidated mine, it went from 36k to 19k (but I also broadened many ranges to absorb smaller ones, mostly Chinese).
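For anyone who would rather not consolidate by hand, here is a minimal sketch in Python of the same idea: merge overlapping or adjacent CIDR ranges, then pack several onto each Deny from line. The function name consolidate_deny_lines and the per_line parameter are made up for illustration, and the addresses are documentation placeholders, not real bot ranges. It only handles full CIDR or single-address entries, not partial-octet entries such as "1.3".

```python
import ipaddress


def consolidate_deny_lines(ranges, per_line=5):
    """Merge overlapping/adjacent CIDR ranges and pack them into Deny lines.

    `ranges` is an iterable of CIDR strings, all IPv4 or all IPv6.
    `per_line` (hypothetical knob) is how many ranges go on each line.
    """
    # strict=False tolerates entries whose host bits are set, e.g. 192.0.2.5/24
    networks = [ipaddress.ip_network(r, strict=False) for r in ranges]

    # collapse_addresses drops ranges contained in larger ones and joins
    # neighbours that together form a single bigger block
    collapsed = list(ipaddress.collapse_addresses(networks))

    lines = []
    for start in range(0, len(collapsed), per_line):
        chunk = collapsed[start:start + per_line]
        lines.append("Deny from " + " ".join(str(net) for net in chunk))
    return lines


if __name__ == "__main__":
    # Placeholder ranges from the reserved documentation blocks
    sample = [
        "192.0.2.0/25",
        "192.0.2.128/25",    # adjacent to the line above, merges to a /24
        "198.51.100.0/24",
        "198.51.100.16/28",  # already covered by the /24 above
        "203.0.113.0/24",
    ]
    for line in consolidate_deny_lines(sample, per_line=2):
        print(line)
```

Note that this only merges ranges that are exactly adjacent or nested. Deliberately broadening a range to swallow addresses you never listed is a judgment call the script does not make for you.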
| 7:16 am on Dec 6, 2012 (gmt 0)|
When I said legit, I meant that they are trying to look like a yellow pages bot or something similar.
I don't mean to annoy anyone here with my posts, but running a site with very heavy traffic, I like to share the odd user agents that come along for those who block bots.
You never know when it will come through some proxy or something.
| 8:28 pm on Dec 6, 2012 (gmt 0)|
I have my system arranged so that it checks not only the actual IP but any proxy IPs as well. If a proxy IP is in a blocked range (server farm, high-abuse-rate IP, etc.) then the actual IP is blocked as well: it usually means the idiot using the real IP has a virus that has made the machine part of a botnet or some such.
An example of this occurred during the past few days - several different IPs came in with a variety of UAs etc but ALL with a single proxy IP that placed the requester at a UK server farm in the range 88.208.193.nnn. This is not the first time this server range has attacked via proxies: it happened several months ago as well.
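The check described above could be sketched roughly as follows in Python. This is not the poster's actual system: it assumes the proxy addresses arrive as a comma-separated string such as an X-Forwarded-For header value, the names BLOCKED_NETWORKS, is_blocked and should_block are hypothetical, and the ranges are documentation placeholders. Bear in mind that such headers are supplied by the client and can be forged, so treat a hit as a reason to block rather than a missing header as proof of innocence.

```python
import ipaddress

# Hypothetical blocklist; a real one would hold server-farm and
# high-abuse ranges. 192.0.2.0/24 is a reserved documentation block.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blocked(ip_text, blocked=BLOCKED_NETWORKS):
    """Return True if ip_text parses as an IP inside any blocked network."""
    try:
        ip = ipaddress.ip_address(ip_text.strip())
    except ValueError:
        # Unparseable entries ("unknown", junk) are ignored, not matched
        return False
    return any(ip in net for net in blocked)


def should_block(remote_addr, forwarded_for="", blocked=BLOCKED_NETWORKS):
    """Block the request if the connecting IP or any listed proxy IP is blocked.

    `forwarded_for` is assumed to be a comma-separated list of addresses,
    as in an X-Forwarded-For header value.
    """
    candidates = [remote_addr]
    candidates += [part for part in forwarded_for.split(",") if part.strip()]
    return any(is_blocked(candidate, blocked) for candidate in candidates)


if __name__ == "__main__":
    # Connecting IP looks clean, but one listed proxy IP is in a blocked range
    print(should_block("203.0.113.7", "198.51.100.1, 192.0.2.44"))
    # Nothing in the chain is in a blocked range
    print(should_block("203.0.113.7", "198.51.100.1"))
```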