Is this Alexa bot?

A robot coming from the Alexa network with a strange UA string?

         

vdoyl

10:13 pm on Aug 18, 2005 (gmt 0)

10+ Year Member



Hi,

I've recently noticed the following in my logs:
======
Host: 209.237.238.18
Http Code: 200 Date: Aug 18 13:44:28 Http Version: HTTP/1.0 Size in Bytes: 114570
Referer: -
Agent: \xe0\x0e\xc9\x11\b\xa5i\x12\x10
======

Does anyone know what this UA string means? It looks like it is using some kind of encoding.

The IP address resolves to the Alexa network, but why are they using such a strange UA string?

Furthermore, the UA changes for each bot, and even between subsequent requests from one and the same bot, for example:
Host: 209.237.238.178

Agent: \x10\x10*\x12`i\x06\r(\x12
....
Agent: \x90\x94\xf9¦\x88\xaf(\x83\x10

Thank you!

Dijkgraaf

11:22 pm on Aug 18, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, I noticed this one as well.
Sometimes it also has a blank User Agent or just a single character.
Usually the UA I get from that IP address range is ia_archiver.
Another difference is that it is fetching pages with parameters, which ia_archiver never did.
Where it is getting those URLs from is a bit of a mystery though, since as far as I know it hasn't spidered the pages leading to those pages.

victor

6:31 am on Aug 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It started a few days ago on one of my sites. Symptoms as described by you.

It seems pointless behavior on the part of Alexa. Crying out for a little mod to ban any bot with unprintable characters in its name.
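
A minimal sketch of what such a check could look like in Python, assuming the raw User-Agent header is available as bytes. The function name is hypothetical, and how it gets wired into a particular server is left open:

```python
def has_unprintable_bytes(user_agent: bytes) -> bool:
    """Return True if any byte falls outside printable ASCII (0x20-0x7E)."""
    return any(b < 0x20 or b > 0x7E for b in user_agent)


# One of the odd agents from the logs above, and the usual Alexa crawler name.
print(has_unprintable_bytes(b"\xe0\x0e\xc9\x11\x08\xa5i\x12\x10"))  # True
print(has_unprintable_bytes(b"ia_archiver"))  # False
```

Note that an empty User Agent passes this check, so the blank agents mentioned earlier would need a separate rule.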

dcrombie

9:26 am on Aug 19, 2005 (gmt 0)



I've seen a lot of those user agents from 209.237.238.17* (Alexa). Thought it might be Unicode characters, but now it's looking like it might be machine (shell) code.

Here are some samples from today:

8\x03\xdd}P\xfd\x0c\x88\x10 
@\x9e\xea\x88\xd8\xce$~\x10
P\xb1\x94\x99 \xac\xc5\x9a\x10
XP\x0e\x9a0AK\x9b\x10
\x10\x13\xb9\x9a8G\xe2\x9a\x10
\x18*[\x8dp\r\x1d\x8f\x10
\x80\xe7\x8b}`Q\r{\x10
\x88\xefx}\x80\xb8\xa2\x9d\x10
\xa0\xbd\xd5\x98t
\xa8\\N\x99P\xb1\x94\x99\x10
\xc0\xbc\xc8sXq\xaf\x8f\x10
\xc83\xcd~\xd0\xa3\x1a\x9a\x10
\xc8\x06\x9d\x8eH\xa35\x8e\x10
\xc8ed\x8fx\xbd\xc2\x8e\x10
\xd8\x1e\x18\x88\x80\xf6\t\x87\x10
\xec@\x8cp\r\x1d\x8f\x10
`L\xfe\x9aP\xdaTy\x10
h9\x13\x80h\xc4v~\x10
p\x82\xa1\x7f\xa0\xf9\xd5~\x10
p\xd8\x10\x9a\x88\xefx}\x10
po\xf9\x9a0\xe17\x9b\x10
x\xbd\xc2\x8e8\x90e\x8e\x10
x\xd2r}t

Anyone...?
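
For anyone who wants to look at the raw bytes rather than the escaped form, here is a rough Python sketch. It assumes the log writer uses C-style escapes (`\xNN`, `\b`, `\r`, `\t`, `\\`), as the samples suggest. The helper names are made up for illustration:

```python
def decode_logged_agent(escaped: str) -> bytes:
    """Turn an escaped log string back into the raw bytes that were sent.

    Assumes C-style escapes and that every character fits in one byte.
    """
    return escaped.encode("latin-1").decode("unicode_escape").encode("latin-1")


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


# The agent string from the first post in this thread.
raw = decode_logged_agent(r"\xe0\x0e\xc9\x11\b\xa5i\x12\x10")
print(len(raw))       # 9
print(hex_dump(raw))  # e0 0e c9 11 08 a5 69 12 10
```

Run over the list above, this makes it easy to compare lengths and trailing bytes. Several of the samples come out as nine bytes ending in 0x10.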

Dijkgraaf

12:31 am on Aug 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Looks like they might have fixed the issue now, as the latest entries from that IP range have ia_archiver as the User Agent again.