Search Engine Spider and User Agent Identification Forum

This 63 message thread spans 3 pages: page 2 of 3
Thousands of Spambot IPs Hitting my Site
spiritualseo




msg:4396671
 6:19 pm on Dec 11, 2011 (gmt 0)

Hey guys, need someone's help here. For the past week or so, the bandwidth usage of my site has increased from 1GB a month to 12GB a day!

Awstats indicates that there is a range of unique IPs hitting my site and requesting thousands of pages with every visit. Most of these IPs seem to originate from within the US, which is funny. I have blocked China, Brazil and some other countries in my .htaccess, but the hits continue.

Please take a look at these IPs; these are just a few of the thousands that hit my site almost every second. Each one requests around 1000 pages. My site has only around 100 pages, so perhaps they request a page over and over again:

24.181.178.3
216.6.134.27
70.182.254.242
99.98.188.110
75.134.95.208
65.35.111.110
71.197.69.88
124.123.51.38
173.198.98.134
198.138.135.123
24.159.55.211
50.40.131.171
174.16.100.103
74.131.129.17
75.94.108.222
98.218.136.190
69.244.107.77
68.185.252.101

The funny thing is that all of them look unique and all of them have a verified DNS. How can this be?

Can anyone please explain what is happening? And what can I possibly do to stop this? If this continues, my site will go offline within a week or so.

 

Hope_Fowl




msg:4397943
 7:16 pm on Dec 14, 2011 (gmt 0)

I do wish that I could redirect them to a honeypot that their ISP would offer.

spiritualseo




msg:4398544
 10:09 am on Dec 16, 2011 (gmt 0)

I just found that Google has removed from its organic listings the page on which I blocked 'empty referrals'. And this page was the highest traffic generator to my site.

So blocking 'empty referrals' is definitely not a search engine friendly method. But what else am I supposed to do? There doesn't seem to be any other method to stop the attack.

keyplyr




msg:4398560
 10:44 am on Dec 16, 2011 (gmt 0)

IMO if Awstats is your only means of watching traffic, it does not give you enough raw data to effectively manage a site on today's internet. Can you get your server access logs? If not, I'd consider switching to a more obliging hosting company.

spiritualseo




msg:4398619
 1:12 pm on Dec 16, 2011 (gmt 0)

I do have raw logs, but they do not give me any new data except the IP and the related user agent. Not sure how that can help, or am I missing something here?

Here's a portion of today's logs, if I can post them here. I have replaced the page name with pagename.php. This is the page that I blocked for empty referrals which is why all the 403. I also see some IPs trying to access crossdomain.xml. Not sure what that is:

83.3.53.90 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.0" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; InfoPath.2)"

67.162.64.140 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FBSMTWB; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C)"

67.162.64.140 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FBSMTWB; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C)"

67.162.64.140 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FBSMTWB; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C)"

67.162.64.140 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FBSMTWB; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C)"

67.162.64.140 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FBSMTWB; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C)"

67.162.64.140 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FBSMTWB; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C)"

76.218.8.75 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.2; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

174.26.224.147 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

174.131.18.94 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

65.34.88.56 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"


71.176.233.232 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

75.2.251.130 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

173.2.191.74 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

99.6.211.63 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

71.56.133.69 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

98.175.1.1 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.2; Tablet PC 2.0; .NET CLR 1.1.4322)"

107.8.125.43 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

68.49.164.173 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

76.245.208.80 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET4.0C; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET4.0E)"

24.22.29.38 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

69.181.239.252 - - [16/Dec/2011:05:57:25 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

76.24.67.153 - - [16/Dec/2011:05:57:25 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

24.208.14.142 - - [16/Dec/2011:05:57:25 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

71.61.126.174 - - [16/Dec/2011:05:57:25 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727)"

193.64.22.70 - - [16/Dec/2011:05:57:26 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 1.1.4322; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; AskTB5.5)"

76.170.17.150 - - [16/Dec/2011:05:57:26 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

98.116.166.2 - - [16/Dec/2011:05:57:27 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

74.243.97.188 - - [16/Dec/2011:05:57:27 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FunWebProducts; SearchToolbar 1.2; GTB7.2; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; BRI/2)"

69.180.106.58 - - [16/Dec/2011:05:57:27 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

109.207.48.93 - - [16/Dec/2011:05:57:28 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

68.101.97.149 - - [16/Dec/2011:05:57:28 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

66.94.198.82 - - [16/Dec/2011:05:57:28 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618)"

75.31.24.95 - - [16/Dec/2011:05:57:29 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

71.56.133.69 - - [16/Dec/2011:05:57:29 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

91.13.59.238 - - [16/Dec/2011:05:57:31 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

90.202.62.16 - - [16/Dec/2011:05:57:33 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

68.186.164.14 - - [16/Dec/2011:05:57:33 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

117.207.21.87 - - [16/Dec/2011:05:57:34 -0600] "GET /pagename.php HTTP/1.0" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.2)"

124.123.51.38 - - [16/Dec/2011:05:57:35 -0600] "GET /pagename.php HTTP/1.1" 403 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

71.79.2.197 - - [16/Dec/2011:05:57:36 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"

98.192.157.239 - - [16/Dec/2011:05:57:37 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

98.167.229.15 - - [16/Dec/2011:05:57:37 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; InfoPath.1; .NET4.0C; OfficeLiveConnector.1.3; OfficeLivePatch.0.0; BRI/1; .NET CLR 3.0.30729; BRI/2)"

189.58.190.106 - - [16/Dec/2011:05:57:37 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

75.44.38.4 - - [16/Dec/2011:05:57:37 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

68.1.70.175 - - [16/Dec/2011:05:57:38 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

76.218.8.75 - - [16/Dec/2011:05:57:39 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.2; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

68.101.64.217 - - [16/Dec/2011:05:57:40 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152)"

99.100.241.129 - - [16/Dec/2011:05:57:41 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; FunWebProducts)"

67.142.177.20 - - [16/Dec/2011:05:57:41 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

71.89.17.137 - - [16/Dec/2011:05:57:43 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

75.142.55.120 - - [16/Dec/2011:05:57:43 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

108.89.141.158 - - [16/Dec/2011:05:57:44 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

173.55.94.98 - - [16/Dec/2011:05:57:45 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

67.214.15.112 - - [16/Dec/2011:05:57:46 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

24.129.47.114 - - [16/Dec/2011:05:57:46 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

98.197.101.204 - - [16/Dec/2011:05:57:48 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

98.88.171.231 - - [16/Dec/2011:05:57:49 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; AntivirXP08; GTB7.2; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; AskTbOVO2/5.13.1.18107)"

70.250.176.35 - - [16/Dec/2011:05:57:52 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

117.207.21.87 - - [16/Dec/2011:05:57:53 -0600] "GET /pagename.php HTTP/1.0" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.2)"

71.181.44.222 - - [16/Dec/2011:07:24:41 -0600] "GET /crossdomain.xml HTTP/1.1" 404 7216 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

97.125.34.49 - - [16/Dec/2011:05:58:51 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:51 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:51 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:51 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:51 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

184.153.234.84 - - [16/Dec/2011:05:58:52 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6.3; FunWebProducts; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

174.59.235.130 - - [16/Dec/2011:05:58:52 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

97.125.34.49 - - [16/Dec/2011:05:58:52 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:52 -0600] "GET /style.css HTTP/1.1" 304 - "http://www.examplesite.com/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:53 -0600] "GET /style.css HTTP/1.1" 304 - "http://www.examplesite.com/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:53 -0600] "GET /style.css HTTP/1.1" 304 - "http://www.examplesite.com/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:53 -0600] "GET /style.css HTTP/1.1" 304 - "http://www.examplesite.com/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:53 -0600] "GET /style.css HTTP/1.1" 304 - "http://www.examplesite.com/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

97.125.34.49 - - [16/Dec/2011:05:58:53 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

77.255.246.133 - - [16/Dec/2011:05:58:53 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

97.125.34.49 - - [16/Dec/2011:05:58:53 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.2; .NET CLR 1.1.4322)"

24.205.12.119 - - [16/Dec/2011:05:58:55 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

68.2.194.191 - - [16/Dec/2011:05:58:56 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

93.181.187.50 - - [16/Dec/2011:05:58:57 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

174.49.155.61 - - [16/Dec/2011:05:58:57 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.3061

spiritualseo




msg:4398649
 2:36 pm on Dec 16, 2011 (gmt 0)

Also, if I want to block all of these IPs manually, is there a way I can copy IPs from the above server access logs? I opened these logs using Wordpad, but copying these IPs one by one is hard. If there was a way one could copy just the IPs and remove the text, that would be great.

spiritualseo




msg:4398650
 2:43 pm on Dec 16, 2011 (gmt 0)

Okay, I found a way to do that. Copy and paste the entire file to excel and then keeping the data selected go to Data > Text to Columns > click next and then keep the space field ticked.
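The same extraction can be scripted rather than done in a spreadsheet. The following is a minimal Python sketch, assuming the log uses the Apache combined format shown above, with the client IPv4 address as the first field. The sample lines use made-up documentation addresses, and the file name in the comment is hypothetical.

```python
import re
from collections import Counter

# Matches an IPv4 address at the very start of a log line.
IP_AT_START = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s")

def count_ips(lines):
    """Return a Counter mapping each client IP to its number of requests."""
    counts = Counter()
    for line in lines:
        match = IP_AT_START.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Made-up sample lines in the same layout as the logs posted above.
sample = [
    '192.0.2.10 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0"',
    '192.0.2.10 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0"',
    '192.0.2.77 - - [16/Dec/2011:05:57:25 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0"',
]

# For a real file (hypothetical name): with open("access.log") as f: count_ips(f)
for ip, hits in count_ips(sample).most_common():
    print(ip, hits)
```

Sorting by hit count puts the heaviest requesters first, which is usually more useful than a flat list of addresses.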

tangor




msg:4398651
 2:46 pm on Dec 16, 2011 (gmt 0)

As all these are returned as either 403 (a server denied file) or 404 file not found, these are not that horrible, nor should they result in a huge increase in bandwidth, UNLESS, the ips involved above are also scraping the rest of your site or you are offering custom 404/403 pages. Difficult to say what bandwidth might be involved, especially for the 40n and 304 entries...

We usually obfuscate the last part of the ip address, eg 123.456.789.nnn, when posting log files or ips (unless the rules have changed and I overlooked the memo).
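Masking the last octet before posting can be automated. This is a small Python sketch, assuming each line starts with an IPv4 address; the function name and the "nnn" placeholder are illustrative choices, not a forum requirement.

```python
import re

# Captures the first three octets of an IPv4 address at the start of a line.
LEADING_IP = re.compile(r"^(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b")

def mask_ip(line, placeholder="nnn"):
    """Replace the last octet of the leading IP address with a placeholder."""
    return LEADING_IP.sub(r"\1." + placeholder, line)

# Made-up sample line using a documentation address.
line = '192.0.2.45 - - [16/Dec/2011:07:24:45 -0600] "GET /pagename.php HTTP/1.1" 404 7216'
print(mask_ip(line))
```

Lines that do not begin with an IP address are returned unchanged.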

spiritualseo




msg:4398687
 4:00 pm on Dec 16, 2011 (gmt 0)

Tangor, it's showing a 404 as I have blocked empty referrals to that page using htaccess. So the bot is not able to access that file.

I was wondering what that number next to the server response means. For instance, in the log below, there is a number '7216' next to 404. Is that of any significance?

74.73.99.zzz - - [16/Dec/2011:07:24:45 -0600] "GET /pagename.php HTTP/1.1" 404 7216 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0;"

PS: Was trying to edit IPs to add the zzz to the ClassD but guess my editing time for that post is over.

tangor




msg:4398719
 4:38 pm on Dec 16, 2011 (gmt 0)

That number is the filesize sent in response to a request... my log returns a filesize for any status code... my 403 is 287 bytes my 404 is 365 bytes... Yours is 7k?
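Since that field is the response size in bytes, summing it per status code gives a rough idea of how much bandwidth the 403 and 404 responses consume. The following is a minimal Python sketch, assuming the combined log format shown in this thread, where "-" means no body was sent. The sample lines are made up.

```python
import re
from collections import defaultdict

# Status code and response size follow the closing quote of the request line.
STATUS_AND_SIZE = re.compile(r'" (\d{3}) (\d+|-) ')

def bytes_by_status(lines):
    """Sum the logged response sizes for each HTTP status code."""
    totals = defaultdict(int)
    for line in lines:
        match = STATUS_AND_SIZE.search(line)
        if not match:
            continue
        status, size = match.groups()
        totals[status] += 0 if size == "-" else int(size)
    return dict(totals)

# Made-up sample lines; sizes mirror the figures quoted in the thread.
sample = [
    '192.0.2.10 - - [16/Dec/2011:07:24:45 -0600] "GET /pagename.php HTTP/1.1" 404 7216 "-" "Mozilla/4.0"',
    '192.0.2.11 - - [16/Dec/2011:07:24:46 -0600] "GET /pagename.php HTTP/1.1" 404 7216 "-" "Mozilla/4.0"',
    '192.0.2.12 - - [16/Dec/2011:05:58:51 -0600] "GET / HTTP/1.1" 200 16873 "-" "Mozilla/4.0"',
    '192.0.2.13 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0"',
]

for status, total in sorted(bytes_by_status(sample).items()):
    print(status, total)
```

A 7 KB error page multiplied by thousands of requests per hour adds up quickly, which is why a small error document matters here.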

Run a log extract for a single ip causing concern and see how many different UAs are returned. Might help in narrowing things down a bit.
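One way to run that kind of extract is sketched below in Python. It assumes the combined log format, in which the user agent is the last quoted field on the line; truncated lines without a closing quote are skipped. The sample lines and addresses are made up.

```python
import re
from collections import defaultdict

# First field is the client IP; the last quoted field is the user agent.
LINE = re.compile(r'^(\S+) .*"([^"]*)"\s*$')

def user_agents_by_ip(lines):
    """Map each client IP to the set of user-agent strings it sent."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            ip, agent = match.groups()
            seen[ip].add(agent)
    return dict(seen)

# Made-up sample lines using documentation addresses.
sample = [
    '192.0.2.10 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"',
    '192.0.2.10 - - [16/Dec/2011:05:57:29 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"',
    '192.0.2.20 - - [16/Dec/2011:05:57:31 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"',
]

for ip, agents in user_agents_by_ip(sample).items():
    print(ip, len(agents), "distinct user agent(s)")
```

A single address sending many different user agents would suggest a proxy or a bot rotating its identity, while one consistent user agent per address is more typical of individual infected machines.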

spiritualseo




msg:4398731
 5:03 pm on Dec 16, 2011 (gmt 0)

Thank you for that! Just realized that my 404 page was misconfigured, rectified now. And yes, I am running some tests now.

lucy24




msg:4398790
 8:08 pm on Dec 16, 2011 (gmt 0)

it's showing a 404 as I have blocked empty referrals to that page using htaccess. So the bot is not able to access that file.

404 isn't "blocked", it's "can't find". If a malign robot hits a 404 before it gets to the blocking stage, your error logs may show both: the original 404 followed by a 403 meaning that it wasn't allowed to see the 404 page. (No, it does not go into infinite redirect if it's not allowed to see the 403 page. It just goes to the Apache default.) But a custom page doesn't have to be very big. Just, ahem, 513K.

spiritualseo




msg:4398806
 9:31 pm on Dec 16, 2011 (gmt 0)

I found around 5000 offending IPs so far, but when I block them using .htaccess (Order Allow,Deny), it shows a 500 internal server error. Is there a limit to the number of IPs you can add to .htaccess? It worked fine when I was blocking 2500 IPs.

lucy24




msg:4398850
 10:54 pm on Dec 16, 2011 (gmt 0)

It's much more likely to be a tiny little typo involving punctuation. Apache is very, very unforgiving.

But 5000? You're blocking too narrowly. Lock out whole ranges. Heck, lock out whole countries if you feel like it.

tangor




msg:4398852
 10:55 pm on Dec 16, 2011 (gmt 0)

That 500 error might be an upload error... needs to be ascii not binary!

You can reduce your IP entries significantly if you are blocking the "pagename.php" request with 403 and ignore the 404s (filter both out when doing stats), but only if those ips are NOT ALSO hitting other pages on your site in a bot-like manner.

For popular vulnerability test requests I 403 those and move on, then weekly run a "403" report of those IPs and match them to the rest of the logfile... makes it easier to spot some bad actors, THEN I might issue an IP "deny from" to block a range or range sub-set. The rest of the time I use UA and sometimes REFERER (sic) to fine tune. A much smaller .htaccess file is the result (about 200 lines, including blanks).
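The weekly "403 report" described above could be approximated with a short script. This is a Python sketch under the assumption that the log is in combined format; it collects every IP that received a 403 and then lists whatever else those IPs requested. The function name and sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Captures client IP, requested path, and status code from a combined log line.
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:\S+) (\S+)[^"]*" (\d{3}) ')

def other_requests_from_denied_ips(lines):
    """For IPs that received at least one 403, list their non-403 requests."""
    parsed = []
    for line in lines:
        match = LINE.match(line)
        if match:
            parsed.append(match.groups())  # (ip, path, status)
    denied = {ip for ip, _, status in parsed if status == "403"}
    report = defaultdict(list)
    for ip, path, status in parsed:
        if ip in denied and status != "403":
            report[ip].append((path, status))
    return dict(report)

# Made-up sample lines using documentation addresses.
sample = [
    '192.0.2.10 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 403 - "-" "Mozilla/4.0"',
    '192.0.2.10 - - [16/Dec/2011:05:57:40 -0600] "GET /other.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
    '192.0.2.20 - - [16/Dec/2011:05:57:41 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0"',
]

for ip, requests in other_requests_from_denied_ips(sample).items():
    print(ip, requests)
```

Addresses that show up here are hitting other pages as well, which makes them candidates for a range block rather than being left to the per-page rule.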

On one site I don't use .php for anything so blocking that--alone--reduced the bad actors over 50%. Study your logs (excel's pivot report function is a handy tool!) and gather more data.

lucy24




msg:4398898
 2:49 am on Dec 17, 2011 (gmt 0)

On one site I don't use .php for anything so blocking that--alone--reduced the bad actors over 50%.

The rest of youse be careful: "I don't use php" doesn't simply mean "I don't have any pages with the .php extension". Notably auto-indexing uses php, for the rare case when it's appropriate to have it enabled at all.

Mine currently says
RewriteCond %{THE_REQUEST} \.(php|pl)
RewriteCond %{REQUEST_URI} !piwik
RewriteRule \.(php|pl)$ http://127.0.0.1 [R=301,L]

because I got riled and decided a simple [F] wasn't enough. (With some robots this alone has a wonderful effect, though these guys don't even seem to notice.) I have no idea what .pl is, and I'm sure I haven't got any, but our awstats friends seem to want it badly so let's lock 'em out on principle. Conversely, piwik's functionality includes a request for a php file-- it comes through in logs as an outside request even though there's no human involvement-- so I have to let them have it.

Technically it may not make any difference whether someone gets a 404 or a 403. It's more the "I don't like your face" principle. Especially when the same visit includes requests so outrageous, mod_security has to step in.

tangor




msg:4398899
 2:57 am on Dec 17, 2011 (gmt 0)

pl is Perl, and cgi is also Perl... add that to your extensions if you want to get 'em all :)

spiritualseo




msg:4398918
 4:40 am on Dec 17, 2011 (gmt 0)

What do you think would be the best way to block Funwebproducts? I found the following methods. I am using the 3rd method, but it doesn't seem to be working. What would you think is the best method?:

Method1:

SecFilterSelective HTTP_USER_AGENT "FunWebProducts" "deny,log"

Method2:
SetEnvIfNoCase User-Agent "FunWebProducts" bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>

Method3:
RewriteEngine on
RewriteCond %{HTTP_REFERER} FunWebProducts[NC]
RewriteRule .* - [F]

Pfui




msg:4398920
 4:59 am on Dec 17, 2011 (gmt 0)

I suspect there are errors galore in your htaccess/mod_rewrite code. At least here are two corrections in one line. (FunWebProducts is browser-specific and unrelated to referrers. Note, too, the space before [NC].)

RewriteCond %{HTTP_USER_AGENT} FunWebProducts [NC]

spiritualseo




msg:4398937
 7:00 am on Dec 17, 2011 (gmt 0)

Thank you, Pfui.

Seb7




msg:4398971
 11:00 am on Dec 17, 2011 (gmt 0)

I think funwebproducts is something that some users seem to get added to their useragent, never been a problem user for me.

keyplyr




msg:4398975
 11:33 am on Dec 17, 2011 (gmt 0)


I think funwebproducts is something that some users seem to get added to their useragent, never been a problem user for me.

I agree. Blocking "FunWebProducts" will lock out a hell of a lot of visitors. Bad idea IMO.

AlexK




msg:4399016
 4:40 pm on Dec 17, 2011 (gmt 0)

I am wondering if there is any way one can automatically block IPs (on Apache/Linux servers) that access a large number of files in seconds clearly indicating that they are not regular users or legit bots for that matter?


[webmasterworld.com...]

dstiles




msg:4399063
 9:47 pm on Dec 17, 2011 (gmt 0)

funwebproducts has been around for a long time. It's not in itself injurious to web sites, as far as I know, but it is adware and calls home frequently so shouldn't be installed on ANY computer. I think it mainly gets installed because it offers lots of funny icons as an inducement.

I report funwebproducts in one of my security logs, just in case it gets nasty, but I can't say it's ever caused a real problem beyond user stupidity in other ways so I do not block on it.

Pfui




msg:4399110
 3:11 am on Dec 18, 2011 (gmt 0)

Ditto. I ignore it.

And now, from the FWIW Department...

Dunno about y'all, but we're a week and 50-plus posts into this thread and I'm no clearer on what's going on with the OP's situation than I was on Day One, just increasingly frustrated. The prob could be some small tangle, maybe even a loop, in htaccess, in a page's code, somewhere. Or in an already tangled mess of an htaccess. Regardless, it feels like we're getting nowhere.

So perhaps it's time the OP starts over at square one, with NO htaccess for five minutes? The traffic from the 'usual' bad bots getting through would be nothing compared to what the OP's reporting: "thousands that hit my site almost every second"!

Then:

-- If traffic continues at that maniacal rate with NO htaccess, I'd seriously consider pulling my own plug (/public_html~) and opening a Trouble Ticket ASAP... before my ISP pulls the plug for me.

-- And if traffic abates? I'd trash my entire htaccess and start anew, baby step-assembling its parts by hand, slowly, and only with understanding. No more copy-pasting from hither and yon.

lucy24




msg:4399119
 4:08 am on Dec 18, 2011 (gmt 0)

Looking back over the thread:
One thing I noticed is that this bot is requesting only one page on my site.

and
I just found that google has removed the page which I blocked from serving 'empty referrals' from organic listings. And this page was the highest traffic generator to my site.

So blocking 'empty referrals' is definitely not a search engine friendly method. But what else am I supposed to do? There doesn't seem to be any other method to stop the attack.


Have I got this right? A full-spectrum botnet-- random IP, random UA, no referer-- is hammering the site with requests for one specific page. Shared hosting, so you can't bring out the heavy artillery that only operates in the config file. In particular, you can't keep them from asking. You can only keep them from receiving.

A redirect probably consumes fewer bytes and less bandwidth than a 403.

Did you try this at some point?

RewriteCond %{HTTP_REFERER} ^-?$
RewriteCond %{HTTP_USER_AGENT} !google
RewriteRule delectablefile.html http://127.0.0.1 [R=301,L]

Where it says google, add the names of any other search engines that you absolutely can't afford to be without. One line, pipe-separated: !(google|bing|nanivara). Don't forget the ! exclamation mark. In a separate line, add yourself-- by IP, UA or whatever form is most likely to be unique. For the time being, you'll have to dispense with the humans who have got your site bookmarked.
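Before deploying a rule like this, it may help to dry-run the same conditions against existing log entries and see which requests would be caught. The Python sketch below mirrors the logic only; it is not how Apache evaluates the rule. The allowed-agent list, the case-insensitive match, and the function name are assumptions made for illustration.

```python
import re

# A referer that is empty or logged as "-" counts as blank.
BLANK_REFERER = re.compile(r"^-?$")
# Assumed allow-list; extend with whichever search engines matter to the site.
ALLOWED_AGENTS = re.compile(r"google|bing", re.IGNORECASE)

def would_redirect(path, referer, user_agent, target="delectablefile.html"):
    """Return True when a request matches the blank-referer rule sketched above."""
    return (
        target in path
        and BLANK_REFERER.match(referer) is not None
        and not ALLOWED_AGENTS.search(user_agent)
    )

botnet_ua = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"
crawler_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(would_redirect("/delectablefile.html", "-", botnet_ua))
print(would_redirect("/delectablefile.html", "-", crawler_ua))
print(would_redirect("/delectablefile.html", "http://www.example.com/", botnet_ua))
```

Running the check over a day of logs shows how many legitimate visitors with blank referers (bookmarks, typed URLs) would be caught along with the bots.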

wilderness




msg:4399125
 5:21 am on Dec 18, 2011 (gmt 0)

One thing I noticed is that this bot is requesting only one page on my site. Would it be possible to apply the 'block referrer' rule to this one page alone? This page is located inside a sub-folder.

Will something like this work, provided that I want to apply this rule to the page:

http://www.example.com/foldername/page.php

<IfModule mod_rewrite.c>
#Options +FollowSymlinks
RewriteEngine On
RewriteBase /
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^$
RewriteRule ^foldername/page\.php$ - [F]
</IfModule>


You were very close to solving your issue here and I'm disappointed that nobody picked up and responded.

Correct the above to the following:

#RewriteEngine on if NOT on previously; if on previously eliminate
RewriteEngine on
#require both blank referer and specific page request
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{REQUEST_URI} foldername/page\.php
RewriteRule .* - [F]

then test your botnet in action.

You may be required to change the following line:
RewriteCond %{HTTP_REFERER} ^$

to
RewriteCond %{HTTP_USER_AGENT} ^[-]?$

then retest.

FWIW, redirecting denied visitors to an alternative page is a very unsound practice, and more-so for a beginner to htaccess and denials procedures. It should NEVER be advised to a noob, as many other practices by experienced users should NEVER be suggested to noobs.
Noobs should be given basic steps until their learning and practices within this forum progress to the point that simple procedures (i. e., anchors and a few other things) are second nature.

I would also (once again) state that this is only a stop-the-bleeding-action, and that you need to spend the necessary time in your logs determining the culprit that is the origin of this botnet.

I would also once again state that you should be providing some full logs lines (obscuring the Class D ranges, the domain name with example.com, and the actual page names with MyPage) for more reliable analysis and feedback from the participants of this forum.

2nd FWIW, these aforementioned denials are known as "black listing"; an alternative is "white listing" (denying all and making exceptions), which nobody who has responded even suggested.

Don (signing off).

lucy24




msg:4399126
 5:52 am on Dec 18, 2011 (gmt 0)

You were very close to solving your issue here and I'm disappointed that nobody picked up and responded.

The OP himself did: see near the top of this page. He locked out the bad bots-- but also locked out google, with unhappy results.

What's the advantage of this
RewriteCond %{REQUEST_URI} foldername/page\.php
RewriteRule .* - [F]

over the OP's own version with
RewriteRule foldername/page\.php - [F]

?

Seems like the last thing you'd want to do is slow things down still further by forcing your server to evaluate every single request.

you need to spend the necessary time in your logs determining the culprit that is the origin of this botnet

Inquiring minds want to know: how?

redirecting denied visitors to an alternative page is a very unsound practice

But it's so satisfying ;)

wilderness




msg:4399130
 6:29 am on Dec 18, 2011 (gmt 0)

Lucy,
With all due respect, your little comments, meant to amuse me only confuse this thread and spiritualseo MORE.

spiritualseo needs to commit to an action, implement that action and stop accepting alternative methods (or even suggestions of same). The many alternatives are precisely what has made this thread so long.

Furthermore, another reason was this is my very last participation at Webmaster World.

spiritualseo, you have my email address; if you wish further help, then contact me.

Don

spiritualseo




msg:4399146
 9:46 am on Dec 18, 2011 (gmt 0)

I have currently blocked blank referral requests originating from Trident and Funwebproducts UA as suggested by Wilderness earlier, which seems to have hugely reduced the attacks while also being search engine friendly. Planning to block MSIE6 as well.

One issue I found is that the server returns a 500 server error instead of a 403 page for the blocked requests. Not sure why. But I am guessing that's not much of a problem.

I did spend time on the log files and accumulated over 60,000 IPs with 'blank referrals' that requested data in less than 15 hours. Eliminating legit IPs like ones from Google, and deleting duplicates, left me with 5000 unique IPs. But I could not find many redundant Class Bs or even Class Cs (except a few) in this list of 5000 IPs. So the IPs span a wide range, making it impossible to block ranges.

A look at the user agents reveals that 95% are legit UAs having Trident and MSIE6 and 7. But mostly from Trident. Trident I guess represents IE8 and 9. So I am guessing these are from infected Windows machines. Thanks for all your help!
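Checking a list for shared ranges can be done by grouping addresses by network prefix. This is a minimal Python sketch using the standard `ipaddress` module; a /16 prefix roughly corresponds to the old "Class B" grouping and a /24 to "Class C". The sample addresses are made-up documentation addresses.

```python
import ipaddress
from collections import Counter

def count_by_prefix(ips, prefix_len=24):
    """Count how many addresses fall into each network of the given prefix length."""
    counts = Counter()
    for ip in ips:
        # strict=False lets a host address stand in for its containing network.
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts

# Made-up sample addresses.
sample = ["198.51.100.7", "198.51.7.9", "203.0.113.5", "192.0.2.200"]

print(count_by_prefix(sample, 16).most_common(3))
print(count_by_prefix(sample, 24).most_common(3))
```

If no prefix collects more than a handful of addresses, the traffic is spread across unrelated networks and range blocks will not help much.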

Seb7




msg:4399171
 1:37 pm on Dec 18, 2011 (gmt 0)

spiritualseo, Blocking blank referrals and funwebproducts will block quite a number of real users. You could do with some dynamic blocking. Just block ips which are requesting too much.
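The idea of flagging IPs that request too much can be prototyped offline against the logs before choosing a server-side tool. The Python sketch below is only an illustration: it assumes combined-format lines in chronological order, and the threshold of 5 hits per second is an arbitrary example value, not a recommendation.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Captures the client IP and the bracketed timestamp.
LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def find_fast_clients(lines, max_hits=5, window_seconds=1):
    """Return the IPs that exceed max_hits requests within the sliding window."""
    window = timedelta(seconds=window_seconds)
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        ip, stamp = match.groups()
        when = datetime.strptime(stamp, TIME_FORMAT)
        hits = recent[ip]
        hits.append(when)
        # Drop timestamps that have fallen out of the window.
        while when - hits[0] > window:
            hits.popleft()
        if len(hits) > max_hits:
            flagged.add(ip)
    return flagged

# Made-up sample: one address fires six requests in the same second.
burst = ['192.0.2.10 - - [16/Dec/2011:05:57:22 -0600] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0"'] * 6
slow = [
    '192.0.2.20 - - [16/Dec/2011:05:57:22 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0"',
    '192.0.2.20 - - [16/Dec/2011:05:57:39 -0600] "GET /pagename.php HTTP/1.1" 404 - "-" "Mozilla/5.0"',
]

print(find_fast_clients(burst + slow))
```

Note that this approach only catches addresses that hammer the site individually; a botnet in which each address makes a few requests would stay under the threshold.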

shivachettri




msg:4414569
 11:29 pm on Feb 5, 2012 (gmt 0)

Hey there spiritualSEO, could you help me by dropping in the .htaccess code that you used for solving this? I too am having the very same problem and I am pretty new to this .htaccess stuff. It would really be very kind of you.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved