Odd Useragents

Will

4:53 pm on Nov 27, 2001 (gmt 0)



I've been seeing a lot of useragents recently that appear to consist of a varying string of binary(?!) characters, always ending in "http".

Here are some examples:

? ? http
°Î°Îhttp
`#(http
Hß5http

Can anyone shed any light on what they are? I've tried decoding the strings (to no avail).

Since the IP addresses vary every time, I'm guessing that it is some application with characters in the UA that Windows cannot display properly.

Tapolyai

12:21 am on Nov 28, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have seen this when the spider uses a double-byte character set, e.g. an East Asian encoding.
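To illustrate the double-byte idea, here is a minimal Python sketch that takes the four bytes behind one of the sample UAs above and decodes them two ways. The choice of GB2312 is purely an assumption for illustration; nothing in this thread establishes what encoding, if any, the client actually used.

```python
# Bytes behind the sample UA prefix shown as "°Î°Î" in a Latin-1 log viewer.
raw = b"\xb0\xce\xb0\xce"

# Read one byte per character, as a Western log viewer would.
as_single_byte = raw.decode("latin-1")

# Read two bytes per character. GB2312 is an assumed encoding, chosen only
# because these byte pairs happen to be valid in it.
as_double_byte = raw.decode("gb2312")

# Same bytes, different character counts depending on the assumed encoding.
print(len(as_single_byte), len(as_double_byte))
```

The same bytes give four characters in one reading and two in the other, which is why a double-byte string can look like line noise in a single-byte log.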

littleman

4:36 am on Nov 28, 2001 (gmt 0)



Interesting, are the IPs out of Asia?

Will

10:21 am on Nov 28, 2001 (gmt 0)



IPs as follows:

24.11.101.170 cx377314-b.sking1.ri.home.com
65.11.237.204 cx262171-a.glstnbry1.ct.home.com
208.47.0.250 rhubarb.arl.qwestip.net
209.91.108.66 h-209-91-108-66.gen.cadvision.com
209.211.135.75 ipv6.arl.qwestip.net
213.106.214.10 inktomi1-bagu.server.ntl.com

Mixed bunch, isn't it!

The inktomi/ntl type hostnames have cropped up before, but from IPs 62.25[2-4].* and with a referrer of synd.looksmart.co.uk.

bird

7:54 pm on Nov 28, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't see the "http" ending, but I do still see the random control characters. Maybe there is a <newline> or <null> character in front of the "http" that causes Apache to discard the trailing stuff.

There's no pattern in the requests, except that I have two such lines in my log where clients from two different continents requested the same page within the same minute. Could this be a Nimda/Sircam/whatever payload routine under remote control? Maybe someone figured out that they could use that technology for e-mail harvesting...

In any case, I'm currently looking into blocking non-ASCII UAs via .htaccess. I don't think they're legal in a conforming HTTP request anyway, even if servers don't seem to enforce this so far.
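The check itself is simple to express. Below is a rough Python sketch of the test, not an .htaccess rule: it treats any character outside printable ASCII (0x20 to 0x7E) as suspicious. The function name and the exact range are assumptions made for illustration, and a stricter or looser range may suit a particular site better.

```python
def has_non_printable_ascii(user_agent: str) -> bool:
    """Return True if any character lies outside printable ASCII (0x20-0x7E)."""
    return any(not (0x20 <= ord(ch) <= 0x7E) for ch in user_agent)


# Two odd UAs of the kind reported in this thread, plus an ordinary one.
print(has_non_printable_ascii("H\u00df5http"))                     # True
print(has_non_printable_ascii("~d\x15\x0c\x08"))                   # True
print(has_non_printable_ascii("Mozilla/4.0 (compatible; MSIE 5.5)"))  # False
```

Note that an empty UA passes this test, so it would need separate handling if empty agents should also be blocked.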

Will

10:55 am on Nov 29, 2001 (gmt 0)



I'm not sure. Nimda doesn't request pages, rather it uses malformed URLs to try to compromise the server.

This is different - whatever it is, it requests valid pages from the site. It doesn't deep crawl - only one or two pages each time. It doesn't appear to revisit either. Definitely looks like a crawler of some sort (no REFERER field either).

TallTroll

11:31 am on Nov 29, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> 209.211.135.75 ipv6.arl.qwestip.net

That's interesting. There are a few IPv6 networks out there. Could it be someone trying to build an SE on an IPv6 network? That might have an interesting effect, because I'm fairly sure that IPv6 packet headers are constructed differently to IPv4 headers. Could that cause this?

bird

3:50 pm on Nov 30, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> 209.211.135.75 ipv6.arl.qwestip.net

>> Could it be someone trying to build an SE on an IPv6 network? That might have an interesting effect, because I'm fairly sure that IPv6 packet headers are constructed differently to IPv4 headers. Could that cause this?

Not really. Someone happened to baptize his machine "ipv6", but the address is still IPv4.

There is no interaction between the transport level protocols of an IP connection and the application level protocols of an HTTP request. At least not until something is severely broken. ;)
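The point that the hostname says nothing about the address family can be checked directly. A small sketch using Python's standard ipaddress module; the second address is a documentation-range IPv6 example added here only for contrast and does not come from the logs.

```python
import ipaddress

# The address from the log: the hostname contains "ipv6", the address does not.
logged = ipaddress.ip_address("209.211.135.75")
print(logged.version)   # 4

# Hypothetical IPv6 address (2001:db8::/32 is reserved for documentation).
example_v6 = ipaddress.ip_address("2001:db8::1")
print(example_v6.version)   # 6
```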

TallTroll

4:23 pm on Nov 30, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> Someone happened to baptize his machine "ipv6"

Ya, I think this one is running a dual stack. Thing is, I'm not a proper techie. I have just picked up some bits here and there, and I'm not very clear on the differences between IPv4 and IPv6.

Are the 2 totally incompatible? I thought you could use the "version" field to define which IPv type a packet is, and thus how the router handles it?

bird

5:12 pm on Nov 30, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually, I think I just stumbled over the answer to this thread. From my logs of yesterday:

beetle.grub.org - - [29/Nov/2001:03:39:35 -0500]
"GET /some/dir/file_a.html HTTP/1.1" 200 4975 "-" "~d^U^L^H"
beetle.grub.org - - [29/Nov/2001:05:27:59 -0500]
"GET /some/dir/file_b.html HTTP/1.1" 200 5000 "-" "H¦ ^M^H^B"H"
beetle.grub.org - - [29/Nov/2001:08:20:56 -0500]
"GET /some/dir/file_c.html HTTP/1.1" 200 6761 "-"
"Mozilla/4.0 (compatible; grub-client-0.2.1; Crawl your stuff with [grub.org)"...]
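For anyone who wants to hunt for these in their own logs, here is a rough Python sketch that parses combined-format log lines and counts, per host, the requests whose UA contains characters outside printable ASCII. It assumes each entry sits on a single line (the entries above are wrapped for display), the regular expression is a simplification rather than a full log parser, and the second sample line with its host and UA is entirely hypothetical.

```python
import re
from collections import Counter

# Simplified pattern for Apache combined log format (assumed, one entry per line).
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>.*)"$'
)


def odd_agent_hosts(lines):
    """Count requests per host whose UA has characters outside 0x20-0x7E."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if match is None:
            continue  # skip lines the simplified pattern cannot parse
        agent = match.group("agent")
        if any(not (0x20 <= ord(ch) <= 0x7E) for ch in agent):
            counts[match.group("host")] += 1
    return counts


sample_lines = [
    # First entry from the log above, with ^U ^L ^H written as control bytes.
    'beetle.grub.org - - [29/Nov/2001:03:39:35 -0500] '
    '"GET /some/dir/file_a.html HTTP/1.1" 200 4975 "-" "~d\x15\x0c\x08"',
    # Hypothetical well-behaved request for contrast.
    'crawler.example.com - - [29/Nov/2001:04:00:00 -0500] '
    '"GET /index.html HTTP/1.1" 200 1234 "-" "ExampleBot/1.0"',
]

result = odd_agent_hosts(sample_lines)
print(dict(result))
```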

[grub.org...] :

Grub provides a free for download, distributed crawling client, which is used to create an infrastructure (database + volunteers) that will eventually provide URL update status information for nearly every web page on the Internet.

While it's not technically a virus/worm, my suspicion of a remote-controlled agent seems to be confirmed. It looks like an interesting approach to the crawling problem, though I'm not completely sure yet whether it's really a good idea to do it that way.

But at least they're fixing their bugs once in a while... ;)

Will

9:14 am on Dec 3, 2001 (gmt 0)



I suppose it's one way to raise awareness if you're looking for investment!