Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

This 75-message thread spans 3 pages.
What is this in my logs?
Repeated Odd HTTP GETs w/o CSS, Site Images, etc.
rrdega




msg:900018
 1:13 am on Aug 25, 2003 (gmt 0)

I'm curious, and gett'n a li'l irritated by this... Over last weekend, I moved my sites to a new (bigger, badder, 'n faster) server. Ever since then, I have been seeing this kinda stuff in my access log:

219.165.81.239 - - [22/Aug/2003:11:05:20 -0500] "GET / HTTP/1.1" 200 10476 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
168.58.181.22 - - [22/Aug/2003:11:09:27 -0500] "GET / HTTP/1.1" 200 10476 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
209.47.20.140 - - [22/Aug/2003:11:14:11 -0500] "GET / HTTP/1.1" 200 13311 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
4.60.72.164 - - [22/Aug/2003:11:19:40 -0500] "GET / HTTP/1.1" 200 13136 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
35.11.220.72 - - [22/Aug/2003:11:24:56 -0500] "GET / HTTP/1.1" 200 10898 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
219.167.81.202 - - [22/Aug/2003:11:25:24 -0500] "GET / HTTP/1.1" 200 10476 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
67.72.205.160 - - [22/Aug/2003:11:32:17 -0500] "GET / HTTP/1.1" 200 13318 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
67.35.113.36 - - [22/Aug/2003:11:34:21 -0500] "GET / HTTP/1.1" 200 10898 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"

It just goes on and on and on... Seemingly varying IPs, and no CSS or image GETs, so it's not a real browser. I figure it's something doing some sort of wget, right?

Except, when *I* do a wget, this is what is logged...
67.65.136.57 - - [22/Aug/2003:11:39:46 -0500] "GET / HTTP/1.0" 200 13103 "-" "Wget/1.8.2"

Comments? Explanations? Any ideas how to stop it? It's suck'n bandwidth needlessly in my book!

Thanx!
-Bob
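To put a number on "no CSS or image GETs", here is a minimal Python sketch that scans combined-format log lines and lists the IPs that sent the suspect user agent and never asked for anything except "/". The function name, the regular expression, and the 192.0.2.10 "real browser" lines are assumptions made up for illustration; the other sample lines are copied from the log excerpt above.

```python
import re
from collections import defaultdict

# Apache "combined" log format, matching the samples in the post above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

SUSPECT_AGENT = "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"


def root_only_clients(lines, agent=SUSPECT_AGENT):
    """Return sorted IPs that used `agent` and only ever requested "/"."""
    paths_by_ip = defaultdict(set)
    agents_by_ip = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip anything that is not in combined format
        paths_by_ip[m["ip"]].add(m["path"])
        agents_by_ip[m["ip"]].add(m["agent"])
    return sorted(
        ip for ip, paths in paths_by_ip.items()
        if paths == {"/"} and agent in agents_by_ip[ip]
    )


sample = [
    '219.165.81.239 - - [22/Aug/2003:11:05:20 -0500] "GET / HTTP/1.1" 200 10476 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '168.58.181.22 - - [22/Aug/2003:11:09:27 -0500] "GET / HTTP/1.1" 200 10476 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '67.65.136.57 - - [22/Aug/2003:11:39:46 -0500] "GET / HTTP/1.0" 200 13103 "-" "Wget/1.8.2"',
    # Hypothetical real browser: same UA, but it also fetches a stylesheet.
    '192.0.2.10 - - [22/Aug/2003:11:40:00 -0500] "GET / HTTP/1.1" 200 13103 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '192.0.2.10 - - [22/Aug/2003:11:40:01 -0500] "GET /style.css HTTP/1.1" 200 2048 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
]

print(root_only_clients(sample))
```

A browser with a warm cache can also show up as a root-only client, so this is a rough filter for sizing the problem rather than proof of a bot.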

 

marcs




msg:900019
 6:03 am on Aug 25, 2003 (gmt 0)

Since you moved your site, you have a new IP address.

Maybe that (new) IP address had a lot of bots running against it and it still does. Just an idea. As you mentioned moving the site, it could have something to do with your new host/IP address.

jdMorgan




msg:900020
 6:19 am on Aug 25, 2003 (gmt 0)

Block them by %{REMOTE_ADDR} if they are direct requests. If they are proxied requests, block them by %{HTTP:Client-IP}, %{HTTP:Forwarded-For}, or %{HTTP:X-Forwarded-For} as applicable.

These requests come from both Japan and the U.S. and from a variety of organizations. Look up the IP addresses using ARIN and APNIC, and see if they mean anything to you. Otherwise, they may just be open proxies that someone is using to monitor your site for some reason.

Jim

rrdega




msg:900021
 12:10 pm on Aug 25, 2003 (gmt 0)

> These requests come from both Japan and the U.S. and from a variety of organizations. Look up the IP addresses using ARIN and APNIC, and see if they mean anything to you. Otherwise, they may just be open proxies that someone is using to monitor your site for some reason.

Actually, the sample I provided is just a very small snippet of the log. I just looked this morning, and since midnight, I've had 95 more of these hits, virtually all from unique IPs. (I copied out the block and sorted it by IP; there are a couple of dups, but for the most part they are unique.) So the addresses definitely mean nothing to me! It's a relatively new company/site, with a local market focus. So certainly nothing to do with Japan!

Since they are probably either spoofed IPs, or coming through an open proxy, that pretty much rules out blocking by %{REMOTE_ADDR}, %{HTTP:Client-IP}, %{HTTP:Forwarded-For}, and %{HTTP:X-Forwarded-For}, doesn't it? If not, then how about some pointers on how to implement such blocks?

> Since you moved your site, you have a new IP address. Maybe that (new) IP address had a lot of bots running against it and it still does. Just an idea. As you mentioned moving the site, it could have something to do with your new host/IP address.

Yup! New IP... Same host, however, just new datacenter/server. Moved from DedicatedNow to Rackshack. I'm guessing these things are getting to me by IP Address, and not by my Domain Name. And yes, I'm also thinking they're bots of some sort... But what, and why? And how to stop it?!

Can the Log Format be changed to deduce if they're coming to me by way of IP directly, and not by Domain Name? Not sure what this might do for me, really... Perhaps have a new IP assigned, eh?

Still looking for ideas and directions to proceed to put a stop to this...

Thanx!
-Bob

jdMorgan




msg:900022
 3:03 pm on Aug 25, 2003 (gmt 0)

Bob,

> Since they are probably either spoofed IPs, or coming through an open proxy, that pretty much rules out blocking by %{REMOTE_ADDR}, %{HTTP:Client-IP}, %{HTTP:Forwarded-For}, and %{HTTP:X-Forwarded-For}, doesn't it?

If they are using an open proxy, you'll get the address of the open proxy in %{REMOTE_ADDR}, and you'll get the real IP address in %{HTTP:Client-IP}, %{HTTP:Forwarded-For}, or %{HTTP:X-Forwarded-For}, depending on the proxy itself. You can then block based on that variable.

If it was a spoofed IP, they would never see the response from your server, so doing a GET would be pointless.

> If not, then how about some pointers on how to implement such blocks?

You didn't say what server you're hosted on, and what capabilities it has. For .htaccess with mod_rewrite on Apache, you could call scripts based on code like the following:

This code passes anonymous proxy requests to a script that blocks proxies by IP number:

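# Ban anonymous proxy requests
# Conditions chained with [OR] form one group; the rest are ANDed, so this reads:
# (Via OR Forwarded OR X-Forwarded is present) AND no client-IP header is present.
RewriteCond %{HTTP:Via} !^$ [OR]
RewriteCond %{HTTP_FORWARDED} !^$ [OR]
RewriteCond %{HTTP:X-Forwarded} !^$
RewriteCond %{HTTP:Client-IP} ^$
RewriteCond %{HTTP:Forwarded-For} ^$
RewriteCond %{HTTP:X-Forwarded-For} ^$
RewriteRule .* /cgi-local/anon_proxy.pl [L]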

This code passes open proxy requests to a script that logs IP addresses of open proxies and the original requesting IP address. Having logged this information, you could then hard-code rules to block those client IP addresses:

# Ban open proxy requests
# (Via OR Forwarded OR X-Forwarded is present) AND
# (Client-IP OR Forwarded-For OR X-Forwarded-For holds a dotted-quad IP address).
RewriteCond %{HTTP:Via} !^$ [OR]
RewriteCond %{HTTP_FORWARDED} !^$ [OR]
RewriteCond %{HTTP:X-Forwarded} !^$
RewriteCond %{HTTP:Client-IP} ^([0-9]{1,3}\.){3}[0-9]{1,3}$ [OR]
RewriteCond %{HTTP:Forwarded-For} ^([0-9]{1,3}\.){3}[0-9]{1,3}$ [OR]
RewriteCond %{HTTP:X-Forwarded-For} ^([0-9]{1,3}\.){3}[0-9]{1,3}$
RewriteRule .* /cgi-local/log_proxy_req.pl [L]

The code above is untested and intended as an example only - use at your own risk.

Jim
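For anyone who wants to try the same header logic offline before touching .htaccess, here is a rough Python sketch of the two rule sets above. The function name, the category labels, and the 192.0.2.7 test address are made up for illustration, and it assumes the headers arrive as a plain dict; it is not what the Perl scripts named in the rules would do.

```python
import re

# Same dotted-quad pattern as the RewriteCond lines above.
IPV4_RE = re.compile(r"^([0-9]{1,3}\.){3}[0-9]{1,3}$")

# Headers whose presence suggests the request passed through a proxy.
PROXY_MARKERS = ("Via", "Forwarded", "X-Forwarded")
# Headers a proxy may use to pass along the original client address.
CLIENT_IP_HEADERS = ("Client-IP", "Forwarded-For", "X-Forwarded-For")


def classify_request(headers):
    """Return (category, client_ip) for a dict of request headers.

    category is 'direct', 'anonymous-proxy', 'open-proxy' or 'proxy-other';
    client_ip is only set for 'open-proxy'.
    """
    h = {k.lower(): v.strip() for k, v in headers.items()}
    if not any(h.get(name.lower()) for name in PROXY_MARKERS):
        return "direct", None
    client_values = [h.get(name.lower(), "") for name in CLIENT_IP_HEADERS]
    for value in client_values:
        if IPV4_RE.match(value):
            return "open-proxy", value  # proxy revealed the original address
    if not any(client_values):
        return "anonymous-proxy", None  # proxy headers, but no client address
    return "proxy-other", None  # client header present but not a plain IP


print(classify_request({"Via": "1.1 proxy", "X-Forwarded-For": "192.0.2.7"}))
```

Keep in mind that every one of these headers is supplied by the remote side, so a client or proxy can leave them out or fill them with anything it likes.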

moltar




msg:900023
 3:20 pm on Aug 25, 2003 (gmt 0)

The strangest thing is that all the UAs are identical (unless you changed them).

I have a theory that it's your host manually writing random stuff into your logs to "waste" your bandwidth, to make you pay more.

If you have a fully dedicated server, then watch which processes are running. Try changing the name of the log file. You can also put Apache into a separate group (user) and allow only that group (user) to write to the log file, while letting others read it.

ses4j




msg:900024
 7:09 am on Aug 26, 2003 (gmt 0)

I have the same problem! But I didn't move servers, my web site has been in the same place for 4 months. It just started around the 18th and has continued at about the same pace ever since. Here's a snippet of my logs:

220.209.79.183 - - [24/Aug/2003:04:30:50 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
207.40.108.234 - - [24/Aug/2003:04:34:47 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
210.144.176.44 - - [24/Aug/2003:04:35:54 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
61.214.27.145 - - [24/Aug/2003:04:52:58 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
66.61.124.28 - - [24/Aug/2003:05:02:30 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
64.158.237.171 - - [24/Aug/2003:05:06:06 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
195.10.104.217 - - [24/Aug/2003:05:17:48 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
128.59.65.156 - - [24/Aug/2003:05:27:03 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
24.27.109.132 - - [24/Aug/2003:05:30:59 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
66.171.41.213 - - [24/Aug/2003:05:31:51 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
199.111.64.206 - - [24/Aug/2003:05:41:07 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
213.58.11.250 - - [24/Aug/2003:05:46:57 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
220.99.116.174 - - [24/Aug/2003:05:47:16 -0400] "GET / HTTP/1.1" 400 307 "-" "-"
4.5.17.232 - - [24/Aug/2003:05:55:02 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
202.107.122.155 - - [24/Aug/2003:05:55:53 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
192.114.157.112 - - [24/Aug/2003:05:56:05 -0400] "GET / HTTP/1.0" 200 9952 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
195.35.183.160 - - [24/Aug/2003:05:56:32 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
219.167.234.155 - - [24/Aug/2003:05:58:30 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"

What can it be? It's already wrecked all my beautiful logs. Some kind of virus maybe?

Scott

rrdega




msg:900025
 10:15 am on Aug 26, 2003 (gmt 0)

Huh! The 18th is exactly when I first saw this... Bizarre! Are you on a Rackshack server (IP range) by chance? ...Looking for the common thread.

claus




msg:900026
 11:26 am on Aug 26, 2003 (gmt 0)

An unspecified number of GET requests for the front page with a UA of IE 5.5 and different IPs.

The only sign of this being unusual is that the only thing fetched for each visit is the (html of the) index page, and no other stuff. Right?

If so, these IPs behave like a distributed/p2p SE spider. If that is right, these are individual machines and not spoofed IPs, although their users might not know that they are fetching your page.

Did you get a request for robots.txt?

Grub comes to mind. It (normally) has the UA string of:

Mozilla/4.0 (compatible; grub-client-1.4.3; Crawl your own stuff with [grub.org)...]
Mozilla/4.0 (compatible; grub-client-1.3.7; Crawl your own stuff with [grub.org)...]

(and similar: 1.4.3 differ)

But, this bot also requests "robots.txt". At least it did last time it visited. And it also requests other pages than the index.

I am guessing that they might have (updated and) changed their UA-string. I've seen a decrease of visits from the above lately but haven't really thought about it until now. It's just a wild guess though and it might as well be another - the behavior does not seem identical.

Added: It might just as well not be a bot. Perhaps a popular link checker package uses this UA string by default?

/claus

[edited by: claus at 11:37 am (utc) on Aug. 26, 2003]

rrdega




msg:900027
 11:37 am on Aug 26, 2003 (gmt 0)

Thanx Claus!

Yes, it is still on-going... Upwards of 20 hits per hour, virtually all Unique IPs, fetching only the index page, and not the robots.txt (which I do have, and is retrieved/respected by legit bots), and always an identical UA...

I'm really very interested to see that someone else is getting this, and that it started at the same time as with me. And am now wondering if there may yet be more...

claus




msg:900028
 12:19 pm on Aug 26, 2003 (gmt 0)

In both cases we have recently established sites - four months is not the same as old, but not the same as brand new either. They will be appearing on lists somewhere, if nowhere else then in DNS records.

If we have found a new p2p spider here, the pattern is as expected. The clients would download lists to process and these lists will contain a number of duplicates. This is done to ensure that the database/index will get updates although some clients are off-line.

In the beginning the number of duplicates will be high and some sites will be hit hard as a result of this. This will (should) be tuned, as it's not in the interest of the index holder to get a large stream of duplicate feedback from the bots. It is not feasible to update the same listing 90 times a day, and webmasters will get mad at you. I would for sure, i've banned legit link-checking people for less. This one is hard to ban though.

Also, in the beginning, the index will be limited in scope. There's no need to spider the whole world until you have got a recently stable operation with not too much duplicate action taking place. That's a cost issue basically, so the economy will work to the benefit of webmasters eventually.

So, are there any other things in common apart from both sites being recent (four months or so?) - ".com" domain? both corporate sites? Same regional area?

I hope others will post as well, a sample of two is not enough. We can get a whole lot of wild guesses going in search of patterns that way but not really conclude anything.

/claus

rrdega




msg:900029
 12:58 pm on Aug 26, 2003 (gmt 0)

Thanx for the interest, Claus!

I was beginning to think that maybe I was making a mountain out of a molehill... But I really think this is an issue that needs to be looked at! If there is a wayward p2p bot running amok, it'd be nice to track it down, and get its code corrected, or to squash it.

In my case, the logging of these hits started the same day my new IP was set, and that was (I just looked back) on August 18th.

Note: My client accounts, moved to the same server but different IPs, are not experiencing this.

My latest log snippet for comparison is below...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
218.114.66.207 - - [26/Aug/2003:05:35:29 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
218.188.109.61 - - [26/Aug/2003:05:39:07 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
80.2.247.233 - - [26/Aug/2003:05:43:49 -0500] "GET / HTTP/1.1" 200 10972 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
209.107.68.193 - - [26/Aug/2003:05:51:46 -0500] "GET / HTTP/1.1" 200 10136 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
159.153.176.45 - - [26/Aug/2003:06:00:51 -0500] "GET / HTTP/1.1" 200 17659 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
194.204.41.254 - - [26/Aug/2003:06:02:50 -0500] "GET / HTTP/1.1" 200 10972 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
203.197.138.163 - - [26/Aug/2003:06:06:17 -0500] "GET / HTTP/1.1" 200 10972 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
168.103.118.46 - - [26/Aug/2003:06:13:45 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
198.53.63.209 - - [26/Aug/2003:06:13:50 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
24.200.58.35 - - [26/Aug/2003:06:21:40 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
220.145.197.110 - - [26/Aug/2003:06:29:11 -0500] "GET / HTTP/1.1" 200 10551 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
211.119.97.9 - - [26/Aug/2003:06:30:30 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
81.99.198.18 - - [26/Aug/2003:06:31:18 -0500] "GET / HTTP/1.1" 200 10972 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
24.210.80.61 - - [26/Aug/2003:06:33:52 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
24.123.252.36 - - [26/Aug/2003:06:35:10 -0500] "GET / HTTP/1.1" 200 10972 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
67.115.130.202 - - [26/Aug/2003:06:37:16 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
192.12.3.99 - - [26/Aug/2003:06:46:27 -0500] "GET / HTTP/1.0" 200 17388 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
151.196.21.234 - - [26/Aug/2003:06:56:42 -0500] "GET / HTTP/1.1" 200 10551 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
66.124.164.208 - - [26/Aug/2003:07:00:33 -0500] "GET / HTTP/1.1" 200 14642 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
198.53.89.12 - - [26/Aug/2003:07:01:32 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
67.123.86.242 - - [26/Aug/2003:07:01:35 -0500] "GET / HTTP/1.1" 200 14642 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
151.205.166.109 - - [26/Aug/2003:07:07:48 -0500] "GET / HTTP/1.1" 200 14642 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
67.66.158.184 - - [26/Aug/2003:07:12:43 -0500] "GET /robots.txt HTTP/1.1" 200 250 "-" "Opera/7.11 (Linux 2.4.18-14 i686; U) [en]"
68.111.139.149 - - [26/Aug/2003:07:21:59 -0500] "GET / HTTP/1.1" 200 10972 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
61.207.84.111 - - [26/Aug/2003:07:22:39 -0500] "GET / HTTP/1.1" 200 12137 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
208.138.31.170 - - [26/Aug/2003:07:22:44 -0500] "GET / HTTP/1.1" 200 14963 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
143.60.100.167 - - [26/Aug/2003:07:28:47 -0500] "GET / HTTP/1.1" 200 13388 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
220.73.165.15 - - [26/Aug/2003:07:29:31 -0500] "GET / HTTP/1.0" 200 14732 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"

rrdega




msg:900030
 1:08 pm on Aug 26, 2003 (gmt 0)

Ooops! That one hit for robots.txt was me... I was over on grub.org trying to force it to read my robots file. Their script for this does not appear to be working, and I pulled it myself to make sure it's accessible.

Other notes on Grub... It seems they have, indeed, had issues with adherence to the robots.txt file. However they do claim that their UA is properly listed, which in this case it does not seem to be. So I am not sure if it is Grub or not.

I still added the exclusion for the grub-client, however, though I do not believe it will help this situation....

I also tried to register for their forum (required in order to post) but to this moment have not received the confirming email needed to activate the account. Wonderful! What an excellent project! :o

[edit]Oh yeah... And for the Grub Forum, I cannot seem to locate an email link to the Admins! All they have is PM, and you have to log on to use that... Even more Wonderful![/edit]

claus




msg:900031
 2:40 pm on Aug 26, 2003 (gmt 0)

Well, this is what you've got then (post #12). I see no pattern apart from large IP blocks with lots of users, mostly ISPs:

[24.123.252.36] [ROADRUNNER-COMMERCIAL-MIDSOUTH] [US]
[24.200.58.35] [Le Groupe Videotron Ltee] [CA]
[24.210.80.61] [Road Runner] [US]
[61.207.84.111] [Open Computer Network] [JP]
[66.124.164.208] [Pac Bell Internet Services] [US]
[67.115.130.202] [Pac Bell Internet Services] [US]
[67.123.86.242] [Pac Bell Internet Services] [US]
[68.111.139.149] [Cox Communications Inc.] [US]
[80.2.247.233] [NTL Cambridge - CABLE HEADEND] [UK]
[81.99.198.18] [NTL Infrastructure - Hersham] [UK]
[143.60.100.167] [The University of Virginia's College at Wise] [US]
[151.196.21.234] [Verizon Internet Services] [US]
[159.153.176.45] [Electronic Arts, Inc.] [US]
[151.205.166.109] [Verizon Internet Services] [US]
[168.103.118.46] [U S WEST Communications Services] [US]
[192.12.3.99] [Deep Eddy Internet Consulting] [US]
[194.204.41.254] [Radiolinja Eesti AS] [EE]
[198.53.63.209] [TELUS Communications Inc.] [CA]
[198.53.89.12] [TELUS Communications Inc.] [CA]
[203.197.138.163] [Leased line -- Anna University, Chennai] [IN]
[208.138.31.170] [Cable and Wireless Jamaica] [US]
[209.107.68.193] [Verio, Inc.] [US]
[211.119.97.9] [Jungseok apt.] [KR]
[218.114.66.207] [SOFTBANK BB CORP] [JP]
[218.188.109.61] [Hutchison Global Crossing Ltd.] [HK]
[220.73.165.15] [KOREA TELECOM] [KR]
[220.145.197.110] [InfoWeb(Fujitsu Ltd.)] [JP]

If it's a new p2p bot it must have some distribution backend. A popular piece of software or whatever.

I don't think it's grub. I can't really think of a reason that they should switch to a stealth UA string, unless of course their usual UA is banned in too many places. I don't think so, as it does seem to behave reasonably well nowadays. And it's still spidering.

OTOH grub is not a very open operation in terms of information, i agree totally with that. Nobody really knows what they do apart from their own statement that the results are used by Wisenut, but the Wisenutbot still crawls anyway, and i personally have a really hard time finding grub-spidered pages in the wisenut SE.

/claus

rrdega




msg:900032
 4:01 pm on Aug 26, 2003 (gmt 0)

Well, this thing sure is a tzetze fly, in my opinion... No matter its intent!

And yes, if it is a new wayward p2p bot, it sure seems to have problems! Perhaps the reason it keeps coming back (from various IPs) is due to the fact that it never seems to get all of the index page. I just ran some 'curl -i -A' commands against my domain and IP, and it consistently pulls about 17K, where this "thing" only seems to get like 10-14K.

Weird!

ses4j




msg:900033
 4:52 pm on Aug 26, 2003 (gmt 0)

I looked in my logs for the exact first occurrence of a line like this, and it is, like I said before, on 18 Aug, immediately after midnight. Here's the first occurrence, along with 3 others before I got a 'real' hit at 00:52:58.

218.63.191.199 - - [18/Aug/2003:00:12:18 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
218.108.145.211 - - [18/Aug/2003:00:47:50 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
218.18.43.50 - - [18/Aug/2003:00:50:06 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
61.171.34.150 - - [18/Aug/2003:00:52:11 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"

and a few more after a couple users came by:

211.149.111.134 - - [18/Aug/2003:01:08:15 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
218.18.133.232 - - [18/Aug/2003:01:25:15 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
218.2.181.183 - - [18/Aug/2003:01:26:11 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"

I think it must be a worm of some kind that was distributed all over (especially in Asia) and just kicked on looking for vulnerabilities at midnight Monday morning, Aug 18. No?

ses4j




msg:900034
 12:46 am on Aug 27, 2003 (gmt 0)

I've looked all over the web for a lead on what this is. What I don't understand is, if half of Asia, not to mention the rest of the world, has some unruly bot or worm nagging my tiny little website, why can't I pick up a scent? It must be hitting many more sites than our two. Maybe people just haven't noticed - it doesn't use much bandwidth, just spikes views and visits...

Here's some stats:

Started Aug 18, first one at [18/Aug/2003:00:12:18 -0400]. Zero before then.

date - hits
-------------
Aug 18 - 194
Aug 19 - 363
Aug 20 - 382
Aug 21 - 363
Aug 22 - 381
Aug 23 - 125 (Saturday)
Aug 24 - 292 (Sunday)
Aug 25 - 421
Aug 26 - 401 as of 8 pm or so...

So it looks like it was gearing up Monday, took a half a break over the weekend, and started back up this week a bit stronger than before.

Scott
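A per-day tally like the one above can be pulled straight from the access log. Below is a minimal Python sketch, assuming combined-format lines and the exact UA string seen in this thread; the function name and the matching shortcuts are my own guesses at a reasonable approach, not how the numbers above were actually produced.

```python
import re
from collections import Counter

SUSPECT_AGENT = "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
# Captures the date part of a timestamp such as [18/Aug/2003:00:12:18 -0400].
DAY_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}):")


def hits_per_day(lines, agent=SUSPECT_AGENT):
    """Count root-page GETs carrying the suspect UA, grouped by log date."""
    counts = Counter()
    for line in lines:
        is_root_get = '"GET / HTTP/' in line
        has_agent = line.rstrip().endswith(f'"{agent}"')
        if not (is_root_get and has_agent):
            continue
        m = DAY_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Sample lines copied from the log excerpts earlier in the thread.
sample = [
    '218.63.191.199 - - [18/Aug/2003:00:12:18 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '218.108.145.211 - - [18/Aug/2003:00:47:50 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '211.149.111.134 - - [18/Aug/2003:01:08:15 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '220.209.79.183 - - [24/Aug/2003:04:30:50 -0400] "GET / HTTP/1.1" 200 9979 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '220.99.116.174 - - [24/Aug/2003:05:47:16 -0400] "GET / HTTP/1.1" 400 307 "-" "-"',
]

for day, count in hits_per_day(sample).items():
    print(day, "-", count)
```

Real MSIE 5.5 visitors who load only the front page get counted too, so the totals are an upper bound on the mystery traffic.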

wkitty42




msg:900035
 1:04 am on Aug 27, 2003 (gmt 0)

the worst thing about it is that useragent being spoofed... no one wants to block based on that agent because it would knock out a large portion of viewers :(

keep digging and looking... sooner or later, something will be found that will show the true face of this entity...

curious




msg:900036
 2:28 am on Aug 27, 2003 (gmt 0)
Over the last few months, I've been getting similar single-line hits to my site as exemplified in the sample below. In contrast to the examples previously discussed, the IP address in my logs is always the same, as is the UA. The insult here is not that the logs look ugly but that we were being charged for the clicks on Overture (until we alerted them) while serving up zero info. Is this something altogether different?

64.23.0.81 [14/Jul/2003:05:11:28 -0400] GET /?ppc=overture&phrase=[search_term] HTTP/1.0 200 15836 http://addresssearch.ca/exitpage/Advertising.php?term=[search_term] Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
64.23.0.81 [14/Jul/2003:23:59:32 -0400] GET /?ppc=overture&phrase=[search_term] HTTP/1.0 200 15836 http://adultchat.ca/exitpage/Advertising.php?term=[search_term] Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
64.23.0.81 [15/Jul/2003:09:42:48 -0400] GET /?ppc=overture&phrase=[search_term] HTTP/1.0 200 15836 http://adultpersonals.ca/exitpage/Advertising.php?term=[search_term] Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
64.23.0.81 [15/Jul/2003:21:47:42 -0400] GET /?ppc=overture&phrase=[search_term] HTTP/1.0 200 15836 http://advertiser.ca/exitpage/Advertising.php?term=[search_term] Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
64.23.0.81 [16/Jul/2003:09:50:13 -0400] GET /?ppc=overture&phrase=[search_term] HTTP/1.0 200 15836 http://aerobics.ca/exitpage/Advertising.php?term=[search_term] Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
64.23.0.81 [16/Jul/2003:16:37:19 -0400] GET /?ppc=overture&phrase=[search_term] HTTP/1.0 200 15836 http://afb.ca/exitpage/Advertising.php?term=[search_term] Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
64.23.0.81 [16/Jul/2003:22:20:33 -0400] GET /?ppc=overture&phrase=[search_term] HTTP/1.0 200 15836 http://afc.ca/exitpage/Advertising.php?term=[search_term] Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
64.23.0.81 [17/Jul/2003:11:58:29 -0400] GET /?ppc=overture&phrase=[search_term] HTTP/1.0 200 15836 http://affiliate.ca/exitpage/Advertising.php?term=[search_term] Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)

etc., etc., etc...

[edited by: heini at 10:40 pm (utc) on Aug. 27, 2003]

wkitty42




msg:900037
 3:03 am on Aug 27, 2003 (gmt 0)

curious,

that ip in your details is a russian site...

08/26/03 22:55:15 dns 64.23.0.81
nslookup 64.23.0.81
Canonical name: u2.co.spb.ru
Addresses:
64.23.0.81

well, it's hosted in the .ru domain, anyway...

the ip belongs to skynetweb...

08/26/03 22:59:03 whois!NET-64-23-0-0-2@whois.arin.net
whois -h whois.arin.net!net-64-23-0-0-2 ...

OrgName: SkyNetWEB, Ltd
OrgID: SKWB
Address: c/o SkyNetWeb -- 3500 Boston St. #231
City: Baltimore
StateProv: MD
PostalCode: 21224
Country: US

NetRange: 64.23.0.0 - 64.23.0.255
CIDR: 64.23.0.0/24
NetName: SKWB-UURID-554
NetHandle: NET-64-23-0-0-2
Parent: NET-64-23-0-0-1
NetType: Reassigned
Comment:
RegDate: 2000-04-25
Updated: 2000-04-25

TechHandle: HS1867-ARIN
TechName: Hostmaster, HOSTMASTER
TechPhone: +1-410-563-6484
TechEmail: sysadmin@skynetweb.com

don't know if this helps any or not...

wkitty42




msg:900038
 3:04 am on Aug 27, 2003 (gmt 0)

rrdega,

on your wget test, you can tell wget what useragent to use... when i use wget, i usually tell it to use a useragent that contains "turing machine" in it ;)
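The same point holds for any HTTP client: the User-Agent is just a header the sender fills in. With wget that is the --user-agent option; the Python sketch below shows the equivalent with the standard library. The helper name and the example.com URL are placeholders of mine, and nothing is actually sent over the network here.

```python
import urllib.request


def build_request(url, user_agent):
    """Build (but do not send) a GET request carrying an arbitrary User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder URL; the UA string is the one seen throughout this thread.
req = build_request(
    "http://example.com/",
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)",
)

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

So an identical UA on every hit says nothing about the software behind it, which fits with the identical strings showing up from so many different IPs.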

Wasa1234




msg:900039
 3:48 am on Aug 27, 2003 (gmt 0)

I'm getting the same stuff on my web site - started around 18th of Aug with 61 hits, ramped up on the 19th with 332 and has been hitting around 500+ a day for the last few days. The web site itself has been around for about 2 months prior to this.

All details are the same as what others have posted. The only page getting downloaded is my index.html (via the request for /)

All user agents are win98 / msie 5.5. They are coming from all over the globe.

Very weird, I think... If anyone comes up with anything please post it :) The machines doing this have one common factor: they are all running win2k or above, as they have LDAP port 389 open. From the few I have scanned, they also seem to have ports 21 and 80 open with nothing actually listening (the connection drops out after a little while).

claus




msg:900040
 7:35 am on Aug 27, 2003 (gmt 0)

Hi Wasa1234, welcome to WebmasterWorld :)

Thanks for the scan, i didn't even consider that option but now we have proof that it's a fake UA.

It might be a bot and it might be a virus/worm/trojan/whatever (on the requesting machines, that is. Don't panic yet ; ), nobody really can tell. If all they want is the server header information, they could get it less suspiciously with a HEAD request instead of GETting the whole index page. Then again, as they make a lot of requests, perhaps they hope these will just drown in the general traffic patterns.

Port 389 is LDAP (Lightweight Directory Access Protocol), as you said. In WinXP it seems to be used for the Internet Locator Service (ILS) over TCP instead (two ways of stating the same thing?). I found a few leads on Google that showed it was related to the NetMeeting app (it uses ports 389, 522, 1503, 1720, and 1731).

The port is open by default in WinXP, as the ICF (Internet Connection Firewall) that ships with XP apparently holds these ports open: 21, 389, 1002, and 1720.

If all requesting systems are WinXP, we have one extra common denominator apart from the fact that the target sites seem to be relatively recently established (two to four months or so).

/claus


added:

Found this, it's an old exploit from 1999 relating to MS Exchange Server 5.5 (email server that uses the same port): [packetstorm.icx.fr...]

That rings some bells. Recently there was a large amount of formmail-exploit-hunting going on, from several unrelated IP's, there's a few threads mentioning it.

Anyway, this exploit is unrelated. It does not explain any of the index GETting, as the mailserver is obviously not the webserver. And: it relates to the requesting machines, not the target websites.


added2:

Here's another exploit, this time taking charge of the target machine:
[inetsecurity.info...]

Still, the above examples only confirm that Win XP has flaws. It does not state anything about the sites that keep getting GET-requests.

I even found a recipe for a DOS exploiting port 389 on a few russian sites. I don't know russian and it's a C program, so i don't know the syntax (something seems to be missing anyway, although i'm not sure as i don't know that specific language), but here are two links to the same recipe. Neither has a date specified, so perhaps they're not new:

[netsecurity.r2.ru...]
[sector.h1.ru...]

I don't think it's this though, as this one does not do any UA spoofing as far as i can see, and it does not start slow either, it just launches. Besides, what's the point of launching against a few random relatively recent sites and pretending to be regular visitors... none afaik.

I still think it's some kind of badly configured script, although i'm open to other explanations.


edited a few parts
Wasa1234




msg:900041
 9:34 pm on Aug 27, 2003 (gmt 0)

I think I've found the cause - a virus known as welchi

URL below describes it -

[f-secure.com...]

excerpt -
"Another new RPC worm was found on August 18th 2003."
--- cut ---
"In addition, Welchi will attempt to infect IIS 5.0 web servers via WebDAV exploit. For more on this vulnerability found in March 2003, see:
[microsoft.com...] "

I guess what we are seeing is scans done on web servers trying to detect what they are running. As I'm an apache web server user all I'm seeing is the "GET /" and then the virus moves itself on further...

Opinions welcome :)

ses4j




msg:900042
 10:10 pm on Aug 27, 2003 (gmt 0)

Wasa1234:

I saw welchi when I was trying to investigate this too. But a couple things don't add up, and I can't find enough info on welchi to verify:

1) why are only a very few people noticing this? (maybe only we noticed because we have low-enough traffic that an extra 500 hits jumps out?)

2) why would welchi spoof its UA? (for this I have no answer, although maybe it was just easier to code that way. on the other hand, since welchi is a 'nice' anti-virus-virus shouldn't it have been nice enough to ID itself properly so people could block it if they wanted?)

scott

rrdega




msg:900043
 10:23 pm on Aug 27, 2003 (gmt 0)

1) why are only a very few people noticing this? (maybe only we noticed because we have low-enough traffic that an extra 500 hits jumps out?)

I can say with certainty that it's not universal. I have several clients whose site logs do not have any trace of this critter...

Wasa1234




msg:900044
 11:05 pm on Aug 27, 2003 (gmt 0)

I normally get about 5 hits a day so 500+ jumped out in front of me as serious :)

I picked on welchi for this one as the timing matched perfectly. It still could be anything but at least now I have something I can blame...

I don't think it's faking UAs in the normal sense of it - the virus is doing the scanning itself rather than using IE to do the scan, so it's probably hard-coded.

Still, anything else people bring up is welcome ...

natch




msg:900045
 2:02 am on Aug 28, 2003 (gmt 0)

I'm getting this too, on Hurricane Electric (he.net). It's a new account. My domain name is not new, but it has always been very low traffic, and has never been advertised. There is no content on the domain now that would attract anybody. The home page contains a single word, the name of the domain.

Here's an example of the traffic.

65.95.246.181 - - [27/Aug/2003:18:20:13 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
172.191.101.210 - - [27/Aug/2003:18:20:28 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
203.115.194.33 - - [27/Aug/2003:18:21:47 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
12.24.248.70 - - [27/Aug/2003:18:21:54 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
64.244.110.188 - - [27/Aug/2003:18:23:24 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
15.246.143.45 - - [27/Aug/2003:18:24:14 -0700] "GET / HTTP/1.0" 200 159 "-" "Mozilla/3.01 (compatible;)"
69.0.98.97 - - [27/Aug/2003:18:24:55 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
150.198.138.127 - - [27/Aug/2003:18:25:35 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
64.211.208.246 - - [27/Aug/2003:18:28:19 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
68.165.21.183 - - [27/Aug/2003:18:31:23 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"

I was wondering about this, and did a google search for two phrases:

"GET / HTTP/1.1 200" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98"

That search led me to this forum! I've never been here before, but managed to find you guys. You are not alone. I'm sure other people are also getting these hits but haven't noticed. They stuck out like a sore thumb in my logs because my traffic is so low to begin with.

These hits are definitely a problem, because if I want to make my home page bigger, they will consume a lot of bandwidth. The theory about the virus scanning for IIS servers seems like a good one.

I'm wondering what we can do about this. Blocking individual IP addresses is not going to be practical. I don't believe they are coming through a proxy, though that is just based on a hunch. I don't think the IPs are spoofed. This is going to be a big problem. Anyone have ideas?

Let's at least collect IPs for a few days to see if there seems to be some limit to the number of hosts. I doubt it, but will be interesting to see. I'll keep checking back here.
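
Collecting the IPs could be done with a small script over the access log. What follows is only a minimal Python sketch, not something anyone in this thread has posted. It assumes the Apache combined log format seen in the lines above, and it treats a hit as suspect when it is a bare "GET /" with no referrer and the MSIE 5.5 / Windows 98 user agent. The names LOG_LINE, SUSPECT_AGENT, is_suspect and hosts_per_day are made up for illustration.

```python
import re
from collections import defaultdict

# Combined Log Format (assumed):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# The user agent string reported throughout this thread.
SUSPECT_AGENT = "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"


def is_suspect(m):
    """True for a bare index request with no referrer and the suspect UA."""
    return (
        m["request"] in ("GET / HTTP/1.1", "GET / HTTP/1.0")
        and m["referer"] == "-"
        and m["agent"] == SUSPECT_AGENT
    )


def hosts_per_day(lines):
    """Map each log day (e.g. '27/Aug/2003') to the set of suspect IPs seen."""
    seen = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and is_suspect(m):
            seen[m["day"]].add(m["ip"])
    return seen


# Three lines taken from the sample above; the Mozilla/3.01 hit is skipped.
SAMPLE = [
    '65.95.246.181 - - [27/Aug/2003:18:20:13 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '172.191.101.210 - - [27/Aug/2003:18:20:28 -0700] "GET / HTTP/1.1" 200 159 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '15.246.143.45 - - [27/Aug/2003:18:24:14 -0700] "GET / HTTP/1.0" 200 159 "-" "Mozilla/3.01 (compatible;)"',
]

for day, ips in hosts_per_day(SAMPLE).items():
    print(day, len(ips), "distinct hosts")
```

To run it on a real log, pass an open file instead of SAMPLE, e.g. hosts_per_day(open("access_log")), where the file name is whatever your host uses. If the number of distinct hosts per day keeps climbing without levelling off, that would suggest there is no small fixed pool of machines behind this.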

BlueSky




msg:900046
 4:03 am on Aug 28, 2003 (gmt 0)

I've been getting something similar, but not exactly the same. Mine gives a user agent of readwebpage on the "GET / HTTP/1.1" request. Then, it calls one page which is kinda buried in my site. Comes in from different IPs/ISPs. Even though I set it up to give 403's on that one page, they all still call that exact same one every single day.

Since it seems like a repeating pattern from the same 20 machines, I'm going to write to their ISPs and tell them to stop hammering my site.

ses4j




msg:900047
 8:55 am on Aug 28, 2003 (gmt 0)

Okay, here's some data on my visits from it.

From start of 8/18 until end of 8/27 I've had:

- 3405 visits from the "thing" (plus a few, I lost some time on Sunday cause my host rolled the logs over)

- 90 host-repeats (2.6%). Here's a couple of the dupe hosts, not that they're any different from the rest, but what the hey:
12.146.74.62
12.38.215.2
12.45.13.74
192.195.100.31
206.54.145.254 has hit me 3 times.
211.76.97.231 3 times.
208.249.243.131 4 times.
216.142.52.196 4 times.
220.73.165.208 5 times.
note: please don't read into the fact that these IPs are clumped or similar - I had it sorted in alphabetical order so the 1's and 2's (which were first) were the ones I pulled out.

- A very steady flow of them, after an initial ramp-up. From 8/18 midnight until about 8/19 at 2 or 3 AM, the flow was increasing, starting at about one every 15 minutes, and since 8/19 I've been getting hit at an average of one every 2-4 minutes.
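
For anyone who wants to produce the same kind of figures from their own logs, here is a rough Python sketch. It is not the script used for the numbers above. It assumes a "host-repeat" means any visit after a host's first one (by that reading, 90 repeats out of 3405 visits is about 2.6%), and that timestamps are in the usual Apache form such as 27/Aug/2003:18:20:13 -0700. The function names repeat_stats and mean_gap_minutes are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def repeat_stats(ips):
    """Return (total visits, repeat visits, repeat share in percent).

    Assumption: a repeat is every visit after a host's first one.
    """
    counts = Counter(ips)
    total = sum(counts.values())
    repeats = total - len(counts)
    share = 100.0 * repeats / total if total else 0.0
    return total, repeats, share


def mean_gap_minutes(timestamps):
    """Average minutes between consecutive hits, or None with under 2 hits.

    Expects Apache-style timestamps; month names are parsed with %b,
    which assumes an English/C locale.
    """
    times = sorted(
        datetime.strptime(t, "%d/%b/%Y:%H:%M:%S %z") for t in timestamps
    )
    if len(times) < 2:
        return None
    span_seconds = (times[-1] - times[0]).total_seconds()
    return span_seconds / 60.0 / (len(times) - 1)


# Tiny made-up example: host "a" comes back twice.
total, repeats, share = repeat_stats(["a", "b", "a", "c", "a"])
print(total, repeats, round(share, 1))
```

Feeding mean_gap_minutes the timestamps for one day at a time should show the ramp-up, if there is one, as a shrinking average gap.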

If you want my whole dataset, just sticky me.

Scott
