|Hundreds of fake Googlebot hits from different IPs|
I run one of the hundreds of stupid "what is my IP" websites out there. For about the past month, I've been receiving about 500 hits per hour, from distinct IP addresses, using a miscapitalized Googlebot user-agent:
Mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)
Many of these seem to be from Russia, but enough have unique reverse DNS entries to suggest these sites may be hosting content. I suspect this is a botnet of some sort.
Any ideas on how to proceed?
Are you asking about the mechanics of blocking them? Or about deeper issues that involve direct contact with the offending sites?
There are two rules that almost all sites should have. Exact wording depends on your server type-- I assumed Apache, but you don't say --and then your chosen method. Apache, for example, generally does it in mod_rewrite but you could also do it in mod_setenvif.
Essential Rule 1:
UA is "Googlebot" (case-sensitive)
IP is not 66.249 or other legitimate G IPs.
Essential Rule 2:
IP is Bing (there are lots of them)
UA is not bingbot/msnbot (OR: UA is MSIE-anything)
Some of your offending robots can probably be blocked by IP alone. Others need a UA block. In this case it would be "googlebot". Don't know about IIS, but in Apache I believe everything is case-sensitive by default.
Be careful in wording your rule. "Googlebot" is Capitalized, but www dot google dot com is lower case-- and it's contained within the UA string.
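As a rough illustration of Essential Rule 1 plus the lower-case "googlebot" block, here is a minimal Python sketch rather than an actual Apache rule. The function names are made up for the example, and the single 66.249.0.0/16 network is only a stand-in for "66.249 or other legitimate G IPs"; a real list would need to be filled in from Google's published ranges. Note that the match is on "Googlebot" itself and not on "google", since the lower-case www.google.com URL sits inside the genuine UA string too.

```python
import ipaddress

# Assumption: one network standing in for "66.249 or other legitimate G IPs".
# Extend this list before relying on it.
GOOGLE_NETWORKS = [ipaddress.ip_network("66.249.0.0/16")]


def claims_googlebot(user_agent: str) -> bool:
    """True if the UA contains 'Googlebot' with the correct capitalization."""
    return "Googlebot" in user_agent


def is_miscapitalized_googlebot(user_agent: str) -> bool:
    """True if the UA only contains 'googlebot' in some other capitalization."""
    return "googlebot" in user_agent.lower() and "Googlebot" not in user_agent


def from_google(ip: str) -> bool:
    """True if the address falls inside one of the assumed Google networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in GOOGLE_NETWORKS)


def should_block(user_agent: str, ip: str) -> bool:
    """Block miscapitalized UAs outright, and correct UAs from non-Google IPs."""
    if is_miscapitalized_googlebot(user_agent):
        return True
    return claims_googlebot(user_agent) and not from_google(ip)
```

With this sketch, the UA quoted in the first post is blocked from any address, while a correctly capitalized Googlebot UA is blocked only when it arrives from outside the listed networks.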
I could trivially block based upon the mis-capitalized User-Agent... I'm more concerned with how to contact the sites using my service that may be compromised!
welcome to WebmasterWorld, Clinton!
|I'm more concerned with how to contact the sites using my service that may be compromised |
are these requests all referred from other sites or are they direct requests?
These are direct requests. Quickly looking at the headers sent in the requests, the Referer header is not set. Neither are most other common HTTP headers set by normal clients (Accept, Accept-Language, Keep-Alive, etc).
These are almost certainly not coming from live humans or normal browsers.
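The missing-header observation can be turned into a simple check. This is only a sketch: the header list takes Accept and Accept-Language from the post above and adds Accept-Encoding as an assumption, and the function names and the threshold of two missing headers are arbitrary choices for illustration.

```python
# Headers a normal browser is assumed to send on nearly every request.
# Accept-Encoding is an addition for this example, not from the thread.
EXPECTED_HEADERS = ("accept", "accept-language", "accept-encoding")


def missing_browser_headers(headers):
    """Return the expected headers (lower-cased) absent from the request."""
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_HEADERS if name not in present]


def looks_automated(headers, threshold=2):
    """Flag a request when at least `threshold` expected headers are missing."""
    return len(missing_browser_headers(headers)) >= threshold
```

Here `headers` is any mapping of header name to value, compared case-insensitively. A check like this is a hint to combine with the UA and IP tests, not proof on its own.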
in that case there isn't a "site" to contact.
you could do a whois on the IP address(es) and email the abuse contact for the ISP(s).
... but if the IP resolves to anything other than an established human ISP-- meaning that someone's running a bot off their home computer --don't hold your breath waiting for action. Just lock out the whole IP range. And hope you don't get too many from neighborhoods like 91. or 198.* where a /20 counts as a vast block.
* The latest trouble spot is 185. which had barely started being assigned when RIPE slapped down the /30 limit. Ugh.
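For working out which range to lock out around a single offending address, Python's standard `ipaddress` module can do the arithmetic. A small sketch follows; the helper name is made up, the address is from a documentation range rather than a real offender, and the right prefix length should come from the whois record.

```python
import ipaddress


def enclosing_block(ip: str, prefix_len: int) -> ipaddress.IPv4Network:
    """Return the network of the given prefix length that contains `ip`."""
    # strict=False lets us pass a host address instead of a network address.
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


block = enclosing_block("198.51.100.77", 20)
print(block)                # 198.51.96.0/20
print(block.num_addresses)  # 4096
```

A /20 covers 4096 addresses, which is why a neighborhood handed out in pieces that small takes a lot of separate rules to cover.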
Sounds like a botnet to me.
1. Contacting the 'abuse' address for the ISP is one route to take (may or may not get a response). May result in a few hosts getting fixed.
2. You could block their traffic. They would probably just find another service to do the same thing.
2b. They do use other services to do the same thing; you're just seeing some of the traffic.
3. You could return wrong addresses (127.0.0.1, 192.168.1.1, some internal DISA or MILNET reserved address, or cia.gov) -- that might cause some puzzlement for a bit, but same as #2.
Likely these IP responses (from your server) then get posted somewhere else (IRC, p2p network, some other compromised vps servers etc). No single ISP is going to care about it enough to do much more than take care of their hosts.
There may be some interest by Interpol or a military/intelligence anti-cyberterrorism organization (or they might be causing it). I would contact an organization like this... I remember some big botnet just got shut down in the past couple months, you could try whoever was involved with that.
Thanks for your insight.
|1. Contacting the 'abuse' address for the ISP is one route to take (may or may not get a response). May result in a few hosts getting fixed. |
1000 uniques over the past day. This won't scale!
|There may be some interest by Interpol or a military/intelligence anti-cyberterrorism organization (or they might be causing it). I would contact an organization like this... I remember some big botnet just got shut down in the past couple months, you could try whoever was involved with that. |
I've contacted https://www.team-cymru.org/ about this; they may be able to assist.