
Home / Forums Index / Search Engines / Search Engine Spider and User Agent Identification
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

Awstats Scrapers / Injectors

 11:19 pm on Dec 12, 2011 (gmt 0)

I'm seeing a LOT of hits across my whole server for awstats folders. Not successful because I do not publish those folders (who does?!).

UA seems invariably to be:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0

URL: example.com/awstats/awstats.pl?configdir=|echo;echo%20YYYAAZ;uname;id;echo%20YYY;echo|

The vertical bars and echos suggest to me that they're trying to inject commands, but my Perl is long gone, so I'm not sure what it's really doing.
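For the curious, decoding the query string (a quick Python sketch; example.com stands in for my domain) gives:

```python
from urllib.parse import unquote, urlparse

# The probe URL from the logs (example.com is a placeholder)
url = ("http://example.com/awstats/awstats.pl"
       "?configdir=|echo;echo%20YYYAAZ;uname;id;echo%20YYY;echo|")

# Pull out the configdir value and undo the percent-encoding
payload = unquote(urlparse(url).query.split("=", 1)[1])
print(payload)  # |echo;echo YYYAAZ;uname;id;echo YYY;echo|
```

So the vertical bars wrap a semicolon-separated list of shell commands, with YYYAAZ and YYY looking like markers the scanner can grep for in the response.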



 3:49 am on Dec 13, 2011 (gmt 0)

This new-old, log-bloating exploit showed up all of a sudden on 12-03. I even filed a report with amazonaws.com because the first hacked machine that hit me was one of theirs. O, joy.

Same UA, same usual awstats stuff, plus PHP misc. like:


Multiple IPs hit since 12-03 but the exploit's not hitting IPs sequentially like ZmEu and its ilk.

Also atypical is that multiple hacked machines have repeated the 40-line attack more than once, thus earning them their own Firebox killfile rules.


 4:10 am on Dec 13, 2011 (gmt 0)

I'm seeing this from 40 - 50 different IPs, some hitting thousands of domains. Googling for "awstats.pl configdir exploit" shows that particular exploit as dating back to 2005.


 6:29 am on Dec 13, 2011 (gmt 0)

I'll be ###. I was thinking of posting about that. But with a UA like "Mozilla/ 5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/ 20100101 Firefox/ 8.0" (disregard the spaces-- I'm just pasting from processed logs), "user agent identification" is hardly an issue. It's so calculatedly vanilla, you can hardly block them; I get human visitors with the identical UA every day. Is it even worth figuring out whether it's spoofed, or some vulnerability in this particular setup?

I had four yesterday (Sunday)-- three of them about an hour apart, and one more in the evening. Two or three in the other domain that shares my userspace, and the same in my son's domain, different userspace (don't know if it's the same server).

Two of my four were from ranges in China that I'd already blocked. Half of each visit got blocked simply by asking for php. Each visit included a lone 401-- a nonexistent file inside a password-protected folder. The rest was 404... except for this interesting series, only attempted by one of the four (each of these drew a 501):

/?file=../../../../../../proc/self/environ%00
/?page=../../../../../../proc/self/environ%00
/?mod=../../../../../../proc/self/environ%00
/index.php?option=com_simpledownload&controller=../../../../../../../../../../../../../../../proc/self/environ%00

Notice the %00 at the end? I normally decode everything with percent signs, but those had to stay to keep the text editor happy. Doesn't seem to have made the server very happy either.
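If you decode it anyway (Python here, since it keeps the result visible), the %00 turns out to be a literal null byte stuck on the end of the path:

```python
from urllib.parse import unquote

raw = "../../../../../../proc/self/environ%00"
decoded = unquote(raw)

print(repr(decoded[-1]))  # '\x00' -- a NUL byte
print(decoded[:-1].endswith("/proc/self/environ"))  # True
```

From what I've read, PHP before 5.3.4 treated a null as end-of-filename, so a script doing something like include($_GET['file'] . '.php') would open /proc/self/environ itself rather than a .php file -- and that file echoes back the request environment, including headers like the User-Agent, which is where the attacker's real code would ride in.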

But I was almost thinking of posting in the php forum just to see if anyone could translate what these folks are trying to do. In the assorted nonexistent .pl files, they wanted to

?configdir=|echo;echo YYYAAZ;uname;id;echo YYY;echo|

In the equally nonexistent php files, they wanted to:

?cmd=setquality&var1=1'.passthru('id').';

and then

?sort={${passthru(chr(105).chr(100))}}{${exit()}}

Anyone know what that means in English, beyond "something not nice"?


 9:06 am on Dec 13, 2011 (gmt 0)

It's just a test to see if the system can be tricked into running a command. The commands don't do anything more than say what user id the command gets run as, and identify the system. If it works, the hacker will go back and run something useful.

"id" says what user is running the commands, and "uname" identifies the type of system.

The last one uses chr(105) and chr(100) which translates back to "id" again, so it's the same thing.
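A quick check of the chr() translation:

```python
# The chr() calls in {${passthru(chr(105).chr(100))}} just spell the
# command name character by character, dodging filters that match the
# literal string "id".
cmd = chr(105) + chr(100)
print(cmd)  # id
```

And passthru() runs a shell command and sends its raw output straight back to the browser, so if the script ever interpolates that value, the attacker both runs id and sees the result in the page.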


 11:09 am on Dec 13, 2011 (gmt 0)

Lucy, you can translate chr(number) by holding down the Alt key and typing a zero followed by the number on the numeric keypad.

chr(105) = Alt 0105 = i


 9:48 pm on Dec 13, 2011 (gmt 0)

Well, not exactly, although the Option key does now have a tiny little "alt" as an, er, alternative label. But I could do the same thing I do with percent-encoded urls, minus the x.


&#105;&#100; -- html preview >> id

When I woke up this morning it occurred to me that if you can do this stuff with php there is presumably a jsp equivalent-- and there goes my favorite site, which has an absent webmaster combined with security loopholes you could drive a truck through. Brrr. At least it's not shared hosting, where one breach could get you into hundreds of sites.


 10:51 pm on Dec 13, 2011 (gmt 0)

btherl - thanks for the explanation! :)

Lucy - trap on awstats (and, of course, config).

If you have ANY logs open to the public (i.e. in the web root or deeper), then it's way past time to do something about it. At the very least they're open to SE indexing.


 2:41 am on Dec 14, 2011 (gmt 0)

trap on awstats (and, of course, config)

You do realize, I hope, that this is so much Hungarian to me ;) I did send a heads-up to my host. And then I adjusted my htaccess to add .pl to the existing .php block, and for good measure changed it from [F] to

And then... As usual after fiddling with htaccess, I waited a bit and then took a look at logs to make sure that :: cough, cough :: error logs weren't suddenly ballooning out of all reason, with timestamps always matching the Access Log. Ahem. This led me to the following, from the most recent set of visits. I've changed all the brackets to line breaks for readability, but it's otherwise exactly as mod_security said it:

ModSecurity: Access denied with code 501 (phase 2).
Pattern match "\\./proc/self/environ" at ARGS:file.
file "/dh/apache2/template/etc/mod_sec2/mod_sec.conf"
line "5"
msg "/proc/self/environ access"
data "./proc/self/environ"
severity "CRITICAL"
hostname "www.example.com"
uri "/"
unique_id "{buncha alphanumerics here}"

I don't mess with mod_security, so this is stuff that happened at the server's config_file level. Found similar stuff in my son's error logs-- and he's even smaller* than I am. It comes in packets of four per visit.

* That is, er, his site's smaller. He's grown up, but didn't even have an .htaccess or robots.txt-- or custom error pages-- until I sneaked in and made some for him.


 10:55 pm on Dec 14, 2011 (gmt 0)

Sorry, can't help with that one. I don't do htaccess/apache. :(


 6:26 am on Dec 18, 2011 (gmt 0)

:: mild bump ::

Anyone see any change? Sometimes they take a day off, but it's still running 1-2 cycles a day. My art studio site, which is smaller than I am*, once got zapped five times on a single calendar day.

Pfui, did you count 40 lines or is that just an estimate? With rare exceptions, mine go in sets of 25. Always the same 25 in the same order. Generally in and out within 10 seconds, though one of them spread himself out and spent five minutes.

I don't think these robots are talking to each other, either. Normally if you get a 404 you don't come back again and again on the off chance that the file has suddenly come into existence. Unless, ahem, you're the googlebot looking for a file that you saw once in 2003.

Yesterday they came up with a head-scratcher.


immediately followed by a simple and straightforward


Ugh, I hate to see the bad guys walking off with a 200. (I had to test this myself with Live HTTP Headers. If you don't use queries, then Apache simply ignores the query string and sends you to the page.)

One time they proceeded from there to


which sound like just the kind of places a bad robot would enjoy.

But I don't get the awstats part. It's just a log-crunching program, right? And your logs don't store anything tasty like credit card numbers. At most, there might be a sessionID. Are they collecting human IP addresses, something like that?

* Like the man said, smaller than me is not easy.


 2:02 pm on Dec 18, 2011 (gmt 0)


1.) I'm seeing multiple variations of the same kind of exploits, crusty old zombies from newly zombified machines. [en.wikipedia.org...]

2.) While technically robots, these kinds of exploits are much, much more than just 'bad robots' in the usual web/SE scheme of things. They couldn't care less about our pages per se or whether they get a 403 or 302 or whatever: It's all about breaking into and orchestrating machines.

Exploit-probing programs hunt for Achilles' heels by which to gain control of machines they then use to set up shop, reproduce and relay more exploits, spambots, and worse... [seattletimes.nwsource.com...]

3.) AWStats is just one of (probably tens of) thousands of programs with a history of holes [awstats.org...] [securiteam.com...] -- widely used programs that get installed, don't get updated/stay flawed, have predictable installation paths and thus offer predictable, reproducible, widely exploitable ways in.

4.) Spend some time reading up on cyberattacks [isc.sans.edu...] and I hope you'll be less inclined to personalize perpetrators as almost gaily meandering about our sites. They're not. They're criminals on a mission. Defend as best you can but know it's not your stuff they're after, it's the machine your stuff's on. Then move on.


 11:56 pm on Dec 18, 2011 (gmt 0)

I have them too. Does anyone see a pattern in the order of the URLs they request? I'm thinking along the lines of making a dynamic ban list of any IP that requests certain files.

I don't know how these bots are finding the domains to scan. I've had a site hit by two different IPs, and it has no backlinks whatsoever (to my knowledge). It is indexed in Google, though it doesn't rank very highly for anything.


 11:15 am on Dec 19, 2011 (gmt 0)

Does anyone see a pattern in the order of the URLs they request? I'm thinking along the lines of making a dynamic ban list of any IP that requests certain files.

Mine always make the same 25 requests in exactly the same order. If you don't actually have awstats files, you can simply slam the door on anything that asks for it:

RewriteRule awstats - [F]

If you do have awstats, you'll need to add a few Conditions to let yourself in. Constrain it to THE_REQUEST and so on.
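Something along these lines (untested sketch; swap in your own address for the placeholder 203.0.113.10):

```
# Refuse any request whose request line mentions awstats,
# unless it comes from your own (placeholder) address.
RewriteCond %{THE_REQUEST} awstats [NC]
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.10$
RewriteRule ^ - [F]
```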

With me, half of them got 403'd up front by asking for .php files; the other half ask for .pl so they now get locked out too. And as noted above, one block of four requests may get zapped by mod_security before you ever see it.

The requests for / annoy me a lot because they can't be blocked. I do look up the IPs as they come along, and if they're from somewhere really useless they'll get locked out-- I've found a few more pieces of China that way ;) But banning by IP isn't going to do it.

So far, each visit is from a single IP. So if you can write the right kind of script, you can pounce on anything that asks for awstats and then lock out that specific IP for the next 24 hours.
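In pseudocode-ish Python, the idea is just this (everything here -- names, the 24-hour window -- is illustrative; wiring it up to real logs and a real block list is the host-specific part):

```python
import time

BAN_SECONDS = 24 * 60 * 60  # the 24-hour lockout suggested above

class BanList:
    """Tracks IPs that asked for awstats and bans them for 24 hours."""

    def __init__(self):
        self.expiry = {}  # ip -> timestamp when the ban lapses

    def record(self, ip, path, now=None):
        """Call once per logged request; bans the IP if it probed awstats."""
        now = time.time() if now is None else now
        if "awstats" in path.lower():
            self.expiry[ip] = now + BAN_SECONDS

    def is_banned(self, ip, now=None):
        now = time.time() if now is None else now
        return self.expiry.get(ip, 0) > now

# Illustrative run with fixed timestamps:
bans = BanList()
bans.record("192.0.2.7", "/awstats/awstats.pl?configdir=|id|", now=0)
print(bans.is_banned("192.0.2.7", now=3600))             # True: within 24 hours
print(bans.is_banned("192.0.2.7", now=BAN_SECONDS + 1))  # False: ban lapsed
```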

I don't think I can do this-- it sounds like a config-file type of activity-- but I haven't given up on my host. I know they sometimes do global lockouts because I once found a group of 503s in the logs and asked about it. Every now and then I nag them about muieblackcat, which is the same kind of thing: it's the first request in an up-to-no-good series. And, as far as I know, unlike awstats it has no legitimate existence.


 11:55 am on Dec 19, 2011 (gmt 0)

Here's a good one, has anyone had this one? 630 different URLs requested from one IP, including things like


I've seen this one yesterday and today, but both times seem to be from, so that fella's not going any further on my sites.


 12:08 pm on Dec 19, 2011 (gmt 0)

That's supposed to be all one url? Is it even physically possible for something like that not to land a 404?

both times seem to be from, so that fella's not going any further on my sites.

Mine neither ;) I think that was one of the very first Chinese ranges I ever met. I've got it written down as 58.32-63. One of their neighbors at 58.240-255 is in the awstats gang.


 12:20 pm on Dec 19, 2011 (gmt 0)

No that's four examples. A lot of them seem to have that "google.txt" on the end. I haven't looked to see what that is yet.


 11:01 pm on Dec 19, 2011 (gmt 0)

I got the google.txt spam as well as the awstats spam from these ips:

I'm surprised there are so few now -- I must have blocked the others already :P Each day I block another five.


 11:59 pm on Dec 19, 2011 (gmt 0)

agent-x - for the google.txt scan I got in excess of 85,000 hits in 20 minutes this evening across a few dozen sites on a single server, from 58.254.143.nnn in a Chinese range I'd already blocked. I now have that complete /13 in the "firewall" so I won't see it again.
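If anyone wants to check the arithmetic: assuming the /13 in question is the one containing 58.254.143.0 -- that is, 58.248.0.0/13 -- it covers 58.248.0.0 through 58.255.255.255:

```python
import ipaddress

# Assumption: the /13 meant above is the one containing 58.254.143.0,
# i.e. 58.248.0.0/13 (second octet 254 masked with 11111000 gives 248).
block = ipaddress.ip_network("58.248.0.0/13")

print(ipaddress.ip_address("58.254.143.5") in block)  # True
print(block.num_addresses)      # 524288 -- eight /16s' worth of addresses
print(block.broadcast_address)  # 58.255.255.255
```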

But apart from that, it's a recurring scan type from several sources, not just Chinese ones.

The IP it's touting - - is in the German Keyweb range - another major source of nasties.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved