
Forum Moderators: Ocean10000 & incrediBILL


Thousands of Spambot IPs Hitting my Site

6:19 pm on Dec 11, 2011 (gmt 0)

5+ Year Member



Hey guys, I need someone's help here. Over the past week or so, the bandwidth usage of my site has increased from 1GB a month to 12GB a day!

Awstats indicates that a range of unique IPs are hitting my site, requesting thousands of pages per visit. Most of these IPs seem to originate from within the US, which is funny. I have blocked China, Brazil and some other countries in my htaccess, but the hits continue.

Please take a look at these IPs; these are just a few of the thousands that hit my site almost every second, and each one requests around 1000 pages. My site has only around 100 pages, so perhaps they request the same pages over and over:

24.181.178.3
216.6.134.27
70.182.254.242
99.98.188.110
75.134.95.208
65.35.111.110
71.197.69.88
124.123.51.38
173.198.98.134
198.138.135.123
24.159.55.211
50.40.131.171
174.16.100.103
74.131.129.17
75.94.108.222
98.218.136.190
69.244.107.77
68.185.252.101

The funny thing is that all of them look unique and all of them have verified DNS. How can this be?

Can anyone please explain what is happening? And what can I possibly do to stop this? If this continues, my site will go offline within a week or so.
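(For reference, the country blocks mentioned above are typically just stacks of deny lines in .htaccess; the CIDR ranges below are placeholders, and real country lists run to hundreds of entries:)

# placeholder ranges, illustrative only
deny from 203.0.113.0/24
deny from 198.51.100.0/22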
9:34 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Botnets are very active at the moment; at least, on my server they are. Most botnet hits are unique: they switch IPs per page (in my experience).

And yes, a vast number of them come from USA, either from compromised "broadband" computers or from compromised server farms.

Also getting a lot of gootkit scans, which look for a number of specific PHP URLs (blocked a full /16 Egyptian range yesterday!). Although gootkit is a botnet "device", it seems to stay on a single IP until it gets fed up or is firewalled.

Look through other recent postings in this forum for other comments on these phenomena.
10:00 pm on Dec 11, 2011 (gmt 0)

5+ Year Member



I just realized that most requests are direct and have no referrer; the referrer is blank. I have put the following in my htaccess for now, and the spam seems to have completely stopped:

RewriteCond %{HTTP:Accept-Language} ^$ [OR]
RewriteCond %{HTTP_REFERER} ^$
RewriteRule .* - [F,L]

I know this will block quite a few legit users as well, but is there any other solution to this? Also, will this hinder search engine bots from crawling my site?

I also noticed that all these bots are requesting only one page on my site. This page is the largest page my site has. Is it possible to apply this htaccess rule to this single page alone and not the whole site?

Some of the most consistent UAs are as follows:

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; FunWebProducts)

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0) w:PACBHO60

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; FunWebProducts; GTB7.0; SLCC2; .NET CLR 2.0.50727; .NET CL

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; BTRS28059; SearchToolbar 1.2; GTB7.2; SLCC2; .NET CLR 2.0.50727;

I am not exactly sure if referring pages and requested pages are the same, but I think they are.

PS: the last three UAs are truncated. I am not able to copy the full text for some reason.
10:11 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



You have located a solution to stop the bleeding; now you should spend some time learning to understand htaccess and how to implement methods that will prevent other types of abuse of your site(s) in the future.

As for the innocents caught by the referer rule?
Generally speaking it's a bad idea to block blank referers.
Blank UAs, on the other hand, should always be blocked.

1) If you wish to make exceptions for IPs in your referer rule, add negated IP conditions (the prefixes below are placeholders):

RewriteCond %{HTTP:Accept-Language} ^$ [OR]
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.
RewriteCond %{REMOTE_ADDR} !^198\.51\.100\.
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.
RewriteRule .* - [F]

2) The negated IP conditions are ANDed, so do NOT put an [OR] on them; the rule fires only when none of the listed prefixes match.
3) Once you have this band-aid in place, you'll need to go back and learn how to use regex for IP ranges as opposed to specific IPs.
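As a sketch of where that leads (the prefixes are again placeholders), a single character class can cover a run of adjacent /24s, e.g. 198.51.100.0 through 198.51.103.255:

# illustrative range blocks, not real offenders
RewriteCond %{REMOTE_ADDR} ^203\.0\.113\. [OR]
RewriteCond %{REMOTE_ADDR} ^198\.51\.10[0-3]\.
RewriteRule .* - [F]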
11:03 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Do a quick check to see how many are from Amazon... look at this thread: [webmasterworld.com...] and this thread: [webmasterworld.com...]
11:23 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Can you explain about the "Accept-Language" element? I'm used to seeing it in discussions of redirecting visitors to language-specific pages. What's its role in robot blocking?

Incidentally, I recently added a "Files" exemption for favicon.ico. I did it to make it easier to identify humans in log-processing. (Robots who ask for the favicon do exist, but they're rare, so it alerts me to places where I may have blocked too enthusiastically.) It has the side effect of letting in g###'s faviconbot even though they refuse to put clothes on it :)
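A sketch of that exemption, assuming the blocks in question are plain mod_access denies (Apache 2.2 syntax):

# let anyone fetch favicon.ico even if their IP is denied elsewhere
<Files "favicon.ico">
Order Allow,Deny
Allow from all
</Files>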
11:40 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Can you explain about the "Accept-Language"


lucy,
Don't believe it's relevant; rather, it's just something that was included in the lines (i.e., the thread) he located and copied-and-pasted from.
11:44 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Do a quick check to see how many are from Amazon...


tangor,
Think you're biting off more than you're able to chew here.
Please see his initial thread in the Apache forum.

Your inquiry might result in enough IPs to keep you and everybody else (here at SSID) busy with IP recognition for quite a while.
12:06 am on Dec 12, 2011 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



tangor,
Think you're biting off more than you're able to chew here.
Please see his initial thread in the Apache forum.


Dumping AmazonAWS into .htaccess takes one line... little chewing involved. :)

Pick and choose battles, of course, but if one is facing real money costs in exceeding bandwidth, one responds as necessary.
12:10 am on Dec 12, 2011 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



tangor,
I was referring to his huge list of accumulated IP's.
12:27 am on Dec 12, 2011 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



wilderness... so was I! :) Some server farms are just icky, and it is far easier to deal with a group ban than fiddle-pharting with narrow ranges. Most of my IP bans are either nnn or nnn.nnn; I think I might have two which are nnn.nnn.nnn.

Should traffic (human) fall off, I can always undo any of those ranges or fine-tune them, but for the most part I haven't had to do that. Again, pick and choose which battles. What works for me may not work for anyone else.
5:12 pm on Dec 12, 2011 (gmt 0)

5+ Year Member



I did check for the Amazon thing, but I am not able to find any Amazon IPs. Nor can I find any repetition in the IPs, or in the user agents for that matter; all of them seem unique. This has made it very difficult for me to block anything with certainty, apart from blocking empty referrers.

Here are a few user agents that hit me recently, when I removed my 'blank referrer' block for a few seconds:



Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; GTB7.2; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; InfoPath.2; .NET CLR 3.5.30729; .NET CLR 3.0.30618; WinNT-PAI 26.08.2009; AskTbLMW2/5.13.2.19379)

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.24) Gecko/20111103 Firefox/3.6.24 GTB7.1

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; BO1IE8_v1;ENUS; InfoPath.1)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E; MS-RTC EA 2)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SearchToolbar 1.2; GTB7.2; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C; AskTbPF/5.13.1.18107)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; WinNT-PAI 13.07.2009; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E; BRI/2)
5:25 pm on Dec 12, 2011 (gmt 0)

5+ Year Member



One thing I noticed is that this bot is requesting only one page on my site. Would it be possible to apply the 'block referrer' rule to this one page alone? This page is located inside a sub-folder.

Will something like this work, provided that I want to apply this rule to the page:

http://www.example.com/foldername/page.php

<IfModule mod_rewrite.c>
#Options +FollowSymlinks
RewriteEngine On
RewriteBase /
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^$
RewriteRule ^foldername/page\.php$ - [F]
</IfModule>
5:43 pm on Dec 12, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



You can apply any RewriteRule to a single page. In fact it's good for your site, because then mod_rewrite doesn't have to slow down and evaluate the rule plus conditions for every single request it receives.

And you can throw out the <IfModule> container. You're dealing with a specific site, right? So either you've got mod_rewrite or (shudder) you haven't. And, whoops, you only need to turn the RewriteEngine on once. And / is the default base, so you don't need RewriteBase at all.

Express the Referer as ^-?$ because null referers (also blank UAs) generally come through as a single "-".
5:58 pm on Dec 12, 2011 (gmt 0)

5+ Year Member



Thanks Lucy24, so will this be it:

#Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^-?$
RewriteRule ^foldername/page\.php$ - [F,L]

By the way, will blocking blank referrers also block genuine bots like Googlebot? I tried fetching my site under Google Webmaster Central, but Googlebot could not fetch the page; it was 403 Forbidden. Is that an indication that Google cannot crawl my site?
6:55 pm on Dec 12, 2011 (gmt 0)



It will indeed; Googlebot very seldom presents a referrer string.
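If the major crawlers must get through, one workaround (only a sketch, since user-agent strings are easily spoofed) is to exempt them by UA ahead of the blank-referer test:

# hypothetical: skip the blank-referer block for major crawler UAs (spoofable!)
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|bingbot|Slurp) [NC]
RewriteCond %{HTTP_REFERER} ^-?$
RewriteRule ^foldername/page\.php$ - [F]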
7:07 pm on Dec 12, 2011 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



One thing I noticed is that this bot is requesting only one page on my site.


If you're on Linux and have SSH access, this quick command will generate a nice temporary IP block list in seconds that you can drop in .htaccess until they go away.

grep "/attacked_page.php" -i access_log | egrep -o "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]" | sort | uniq | awk '{print "deny "$1}'

That'll spit out a nice deny list you can drop into .htaccess and shut down the known bad IPs while you try to figure out the problem and get it sorted.
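With hypothetical IPs, the output looks like this, ready to paste straight into .htaccess (Apache 2.2 allow/deny syntax):

deny from 192.0.2.15
deny from 198.51.100.7
deny from 203.0.113.42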
9:35 pm on Dec 12, 2011 (gmt 0)

WebmasterWorld Senior Member henry0 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



@incrediBill
Do you refer to spiritualseo's problem, or are you speaking generally?
Thanks.
9:42 pm on Dec 12, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



incrediBILL,

Won't that command also assemble IPs of users that navigate to some other page on the site from "/attacked_page.php", assuming the HTTP_REFERER is being sent/tracked?

Perhaps: grep "GET /attacked_page.php" ?
12:50 am on Dec 13, 2011 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Perhaps: grep "GET /attacked_page.php" ?


You're right, that's safer!

I just hammered out that line on my way out the door, forgive me ;)

Do you ref to spiritualseo problem, or are you generally speaking?


I use that approach any time I need a quick list of IPs out of access_log.

In this case, he has an immediate need to stop a group hitting a certain page, and it works like a charm. If they were crawling all over the site, it wouldn't work so well.
2:15 am on Dec 13, 2011 (gmt 0)

5+ Year Member



Maybe you should hide your site behind CloudFlare.com, because they block spammers. You do have to tinker with your logs if you want to track visitor IPs (they have a WordPress plugin for that).
2:38 am on Dec 13, 2011 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



you should hide your site behind CloudFlare.com

Exactly why many of us block them. Stop hiding.

ns1.cloudflare.com
173.245.48.0 - 173.245.63.255
173.245.48.0/20
2:48 am on Dec 13, 2011 (gmt 0)

5+ Year Member



you should hide your site behind CloudFlare.com

Exactly why many of us block them. Stop hiding.


What good does it do to block a service which blocks incoming spammers? Your server isn't likely to be bothered by CloudFlare, since its servers only fetch content from its own clients' web servers. Unless one of their clients is stealing your images.
2:53 am on Dec 13, 2011 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



What good does it do to block a service which blocks incoming spammers? Your server isn't likely to be bothered by CloudFlare, since its servers only fetch content from its own clients' web servers. Unless one of their clients is stealing your images.

It's got nothing to do with the "blocking spammers" feature of CloudFlare. It's the unaccountability of cloud computing in general: it attracts lots of nefarious agents who wish to hide behind the cloud as they cause mayhem on our servers. Do some reading; there is plenty of documentation in this forum and others at WW.

Note: This is getting off topic. Please start a new thread to continue.
2:56 am on Dec 13, 2011 (gmt 0)

5+ Year Member



How does one use the CloudFlare CDN for cloud computing? Maybe I need to learn how to do arithmetic with DNS servers.
7:53 pm on Dec 13, 2011 (gmt 0)

5+ Year Member



OK, back to the original topic.

Maybe you should hide your site behind CloudFlare.com, because they block spammers. You do have to tinker with your logs if you want to track visitor IPs (they have a WordPress plugin for that).


The free CloudFlare CDN service hides web servers from spambots, so it is another possible solution to the original poster's bandwidth problem.
10:33 am on Dec 14, 2011 (gmt 0)

5+ Year Member



I had this last year on an IIS server: 1000s of requests for the same page every second, from exactly 10 random IP addresses at a time.

Getting the server not to send a page helps. If you can get the server not to reply at all (at the TCP/IP level), it makes quite a difference to the server load.

The bot I had, I discovered, would action a 301 redirect, so I redirected it to its own IP, which reduced the server load to practically zero.

These sorts of bots are getting very common; any large website needs some sort of load protection against this kind of activity.

Since then, I slow or block any IP which has gone over a threshold well above that of a very active user. If a single user is making your site unresponsive, don't let all your other users suffer.
11:05 am on Dec 14, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



The bot I had, I discovered, would action a 301 redirect, so I redirected it to its own IP, which reduced the server load to practically zero.

Ooh, must try that. Generally I just send 'em to 127.0.0.1 if they put me in a bad mood.

Quick look at htaccess suggests that a lot of bots have been making me grumpy in recent months.
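On Apache, that trick is a couple of mod_rewrite lines (the IP is a placeholder, and it only pays off if the bot actually follows redirects):

# hypothetical: bounce one abusive IP back at itself
RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.99$
RewriteRule .* http://%{REMOTE_ADDR}/ [R=301,L]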
2:43 pm on Dec 14, 2011 (gmt 0)

5+ Year Member



I don't have SSH access, but thanks for the suggestion. I am wondering: is there any way to automatically block IPs (on Apache/Linux servers) that access a large number of files within seconds, which clearly indicates they are not regular users, or legit bots for that matter?
4:20 pm on Dec 14, 2011 (gmt 0)

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Attempting to trap and block offenders is an ongoing game in which you are forever catching up. Get proactive about limiting potential damage. The best way to do that is to minimize the impact a botnet can have, by reducing the overall size of your pages and serving properly cached versions as required.

A side benefit is that your site will load faster, which is always a good thing.
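As a sketch of the caching side (assuming mod_expires is available; the lifetimes are illustrative):

# serve static assets with long lifetimes so repeat hits cost less bandwidth
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType text/css "access plus 1 week"
ExpiresByType application/javascript "access plus 1 week"
</IfModule>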