We already knew there were smarter bots out there, but I got visual confirmation today when I stumbled across a script used to attack WordPress sites and found the following rotating user agent list:
"Googlebot/2.1 ( [google.com...]
"msnbot/1.0 (+http://search.msn.com/msnbot.htm)",
"Mozilla/5.0 (X11; U; Linux x86; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)",
"Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6",
"Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0)",
"Mozilla/4.0 (compatible; MSIE 6.1; Windows XP)",
"Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
"Mozilla/5.0 (Windows; U; Windows NT 6.0; en) AppleWebKit/522.15.5 (KHTML, like Gecko) Version/3.0.3 Safari/522.15.5",
"Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/522.11.1 (KHTML, like Gecko) Version/3.0.3 Safari/522.12.1",
"Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/523.2+ (KHTML, like Gecko) Version/3.0.3 Safari/522.12.1",
"Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.7.5) Gecko/20070321 Netscape/8.1.3",
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20070321 Netscape/9.0",
"Opera/9.23 (Windows NT 5.0; U; en)"
Several of these would already bounce off my sites, especially the fake search engine UAs.
Several are obviously fake UAs that don't exist in the real world, so why go to all the trouble of hiding your covert activity only to expose yourself anyway?
Just thought those who weren't sure this was happening would like a heads-up. It shows the stark reality of the situation: .htaccess alone is insufficient to handle all your security needs, and live activity profiling still wins in the long run.
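For anyone wondering how a fake search engine UA can be bounced, here is a minimal Python sketch of a forward-confirmed reverse DNS check. It is not the script used on my sites. The names CRAWLER_DOMAINS, claimed_crawler and is_genuine_crawler are made up for illustration, and the hostname suffixes are assumptions that should be checked against each search engine's own documentation.

```python
import socket

# Assumed hostname suffixes per crawler; verify them against each search
# engine's documentation before relying on this table.
CRAWLER_DOMAINS = {
    "googlebot": (".googlebot.com", ".google.com"),
    "msnbot": (".search.msn.com",),
    "yahoo! slurp": (".crawl.yahoo.net",),
}


def claimed_crawler(user_agent):
    """Return the crawler key the UA claims to be, or None."""
    ua = user_agent.lower()
    for name in CRAWLER_DOMAINS:
        if name in ua:
            return name
    return None


def is_genuine_crawler(ip, user_agent,
                       reverse=socket.gethostbyaddr,
                       forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check.

    Returns True if verified, False if the UA claims to be a crawler but
    fails the check, and None if the UA makes no crawler claim at all.
    """
    name = claimed_crawler(user_agent)
    if name is None:
        return None
    try:
        # Step 1: reverse lookup must land in the crawler's own domain.
        host = reverse(ip)[0].lower()
        if not host.endswith(CRAWLER_DOMAINS[name]):
            return False
        # Step 2: forward lookup of that hostname must include the same IP.
        return ip in forward(host)[2]
    except OSError:
        # No PTR record or failed lookup: treat the claim as unverified.
        return False
```

The two resolver arguments default to the standard library lookups and can be swapped for cached or stubbed versions, since a live DNS query on every request would be slow.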
[edited by: incrediBILL at 9:23 pm (utc) on July 28, 2008]
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Avant Browser [avantbrowser.com]; .NET CLR 1.1.4322)
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.1.0)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461)
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8a5) Gecko/20041122
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705; .NET CLR 1.1.4322)
Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 1.0.3705)
Mozilla/5.0 (Windows; U; Windows NT 5.1; pt-BR; rv:1.7.7) Gecko/20050414 Firefox/2.0.5
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; DigExt)
Mediapartners-Google/2.1
Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/419.3 (KHTML, like Gecko) Safari/419.3
Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031021
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.8) Gecko/20071008 Firefox/2.0.0.8
Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; H010818; AT&T CSM6.0)
Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/0.8.6
Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; ODI3 Navigator)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a1) Gecko/20070308 Minefield/3.0a1
Microsoft-WebDAV-MiniRedir/5.1.2600
Mozilla/4.0 compatible ZyBorg/1.0 (wn.zyborg@looksmart.net; [WISEnutbot.com)...]
Mozilla/5.0 (compatible; Konqueror/3.0-rc1; i686 Linux; 20020527)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts-MyWay; (R1 1.3); .NET CLR 1.1.4322)
Mozilla/4.75 [en]
Each of those made several requests in a three-minute period, all from the same IP.
Fortunately the requests themselves triggered another trap and a 403 was all they got.
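A pattern like this (many UAs, one IP, a few minutes) can also be pulled out of a log after the fact. The following is only a sketch in Python: find_rotating_ips is a hypothetical helper, the 180-second window mirrors the three minutes mentioned above, and the threshold of 3 distinct UAs is an arbitrary assumption. It expects records that are already parsed into (timestamp, ip, user_agent) tuples and sorted by time.

```python
from collections import defaultdict, deque


def find_rotating_ips(records, window_seconds=180, max_user_agents=3):
    """Return the set of IPs that used more than max_user_agents distinct
    user agents within any span of window_seconds.

    records: iterable of (timestamp_seconds, ip, user_agent), sorted by time.
    """
    recent = defaultdict(deque)  # ip -> (timestamp, ua) hits inside the window
    flagged = set()
    for ts, ip, ua in records:
        hits = recent[ip]
        hits.append((ts, ua))
        # Drop hits that have fallen out of the sliding window.
        while hits and ts - hits[0][0] > window_seconds:
            hits.popleft()
        if len({u for _, u in hits}) > max_user_agents:
            flagged.add(ip)
    return flagged
```

A shared proxy or an office behind one address can legitimately show a handful of UAs, so the threshold needs tuning per site before anything gets blocked on this signal alone.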
"live activity profiling still wins in the long run"
I look forward to the tutorial.
...
"I look forward to the tutorial."
It's not much of a tutorial. In most cases the most you can do is a post-mortem evaluation of the complete scope of the access, react accordingly, and hope you're not too late: the 1st access has already gone through, and that's all a successful hack requires.
For instance, if the 1st access just asked for an HTML page and nothing else, or had an invalid header, slap up a captcha for all further accesses and wait for a human to respond.
If the human doesn't respond or even attempt to answer after several more attempted page hits, block access altogether.
If any of the access attempts resemble an exploit then block 'em immediately.
If they attempt to access your site too fast, block 'em.
If the UA changes every other access, use a captcha to see if it's humans on an IP pool or a bot like this.
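To make the rules above concrete, here is a rough Python sketch of a per-visitor decision function. It is not the actual script: Visitor, decide, captcha_solved, the exploit markers and every threshold are assumptions picked for illustration, and the "HTML page and nothing else" rule is left out to keep it short.

```python
import time
from dataclasses import dataclass, field

# Illustrative assumptions, not tuned values.
EXPLOIT_MARKERS = ("../", "<script", "union select")
RATE_WINDOW = 10          # seconds
MAX_HITS_IN_WINDOW = 20   # more than this inside RATE_WINDOW is "too fast"
MAX_UA_CHANGES = 2        # UA switches tolerated before a captcha
MAX_UNANSWERED = 5        # captchas ignored before a block


@dataclass
class Visitor:
    hits: list = field(default_factory=list)  # timestamps of recent requests
    last_ua: str = None
    ua_changes: int = 0
    captcha_pending: bool = False
    unanswered: int = 0
    blocked: bool = False


def decide(v, path, user_agent, headers_ok=True, now=None):
    """Return "allow", "captcha" or "block" for one request."""
    now = time.time() if now is None else now
    if v.blocked:
        return "block"
    # Anything resembling a known exploit is blocked immediately.
    if any(marker in path.lower() for marker in EXPLOIT_MARKERS):
        v.blocked = True
        return "block"
    # Too many requests in a short window.
    v.hits = [t for t in v.hits if now - t <= RATE_WINDOW] + [now]
    if len(v.hits) > MAX_HITS_IN_WINDOW:
        v.blocked = True
        return "block"
    # Invalid headers or a UA that keeps changing raise suspicion.
    if v.last_ua is not None and user_agent != v.last_ua:
        v.ua_changes += 1
    v.last_ua = user_agent
    if not headers_ok or v.ua_changes >= MAX_UA_CHANGES:
        v.captcha_pending = True
    # Keep showing the captcha; block after several ignored attempts.
    if v.captcha_pending:
        v.unanswered += 1
        if v.unanswered > MAX_UNANSWERED:
            v.blocked = True
            return "block"
        return "captcha"
    return "allow"


def captcha_solved(v):
    """Call when a human answers the captcha correctly."""
    v.captcha_pending = False
    v.unanswered = 0
    v.ua_changes = 0
```

One Visitor would be kept per IP (in a session store, database or similar), with decide() called on every request before the page is served.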
Unless someone knows something about .htaccess files that I don't, the above can't be done with the tools Apache provides. You can only execute rules based on the known, not the unknown: a known exploit triggers a rule, but the next access with an unknown exploit may not, and the result is a successful hack.
The scripts I use shut down the access once a known rule is triggered. Otherwise, a wide variety of other factors taken together can raise the suspicion flag, which quarantines access until a) human validation occurs, b) a pre-determined timeout expires, or c) the bot goes wild enough that it's dropped into the firewall as a failsafe precaution.
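The quarantine logic can be pictured as a small state machine. The Python below is a sketch under assumptions, not the real scripts: the State and Quarantine names, the 900-second timeout and the 50-hit limit are all invented, and the actual firewall drop is left to whatever the server provides.

```python
import time
from enum import Enum


class State(Enum):
    CLEAR = "clear"
    QUARANTINED = "quarantined"
    FIREWALLED = "firewalled"


class Quarantine:
    """Tracks one visitor's quarantine status (thresholds are assumptions)."""

    def __init__(self, timeout=900, max_hits=50):
        self.timeout = timeout    # seconds until quarantine expires on its own
        self.max_hits = max_hits  # requests tolerated while quarantined
        self.state = State.CLEAR
        self.since = None
        self.hits = 0

    def flag(self, now=None):
        """Raise the suspicion flag and start the quarantine."""
        if self.state is State.CLEAR:
            self.state = State.QUARANTINED
            self.since = time.time() if now is None else now
            self.hits = 0

    def human_validated(self):
        """a) A human passed validation, so release the quarantine."""
        if self.state is State.QUARANTINED:
            self.state = State.CLEAR

    def on_request(self, now=None):
        """Update the state for one incoming request and return it."""
        now = time.time() if now is None else now
        if self.state is State.QUARANTINED:
            if now - self.since >= self.timeout:
                # b) The pre-determined timeout has expired.
                self.state = State.CLEAR
            else:
                self.hits += 1
                if self.hits > self.max_hits:
                    # c) The bot went wild: hand it to the firewall.
                    self.state = State.FIREWALLED
        return self.state
```

FIREWALLED is deliberately a dead end here: neither a timeout nor a later captcha answer brings the visitor back, which matches the failsafe idea.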
"Not much of a tutorial"
I am grateful nonetheless for the clear and concise explanation.
One problem that I (and presumably others) have with some of the advanced techniques given on WebmasterWorld is that, even where I understand them, my shared hosting may not support them (rDNS), and my PHP installation does not allow apache_request_headers either.
But it all makes fascinating reading.
...
After taking a quick look, can you spot the bogus MSIE strings in the lists above?
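One way to do the spotting mechanically is a shape check on anything that claims to be MSIE. This Python sketch assumes that genuine Windows MSIE strings of this era begin with "Mozilla/4.0 (compatible; MSIE x.y; Windows ..." and that the version list below covers the common desktop releases; both the list and the msie_problems helper are illustrative, and Mac or beta builds would be flagged wrongly.

```python
import re

# Assumption: common desktop MSIE releases up to mid-2008, not exhaustive.
KNOWN_MSIE_VERSIONS = {"4.01", "5.0", "5.01", "5.5", "6.0", "7.0"}

MSIE_PATTERN = re.compile(
    r"^Mozilla/4\.0 \(compatible; MSIE (?P<version>\d+\.\d+); "
    r"(?P<platform>[^;)]+)"
)
# Windows platform tokens as MSIE reports them, e.g. "Windows NT 5.1".
PLATFORM_PATTERN = re.compile(r"^Windows (NT \d+\.\d+|95|98)$")


def msie_problems(user_agent):
    """Return a list of reasons an MSIE-claiming UA looks malformed.

    An empty list means nothing suspicious was found, or that the UA
    does not claim to be MSIE at all.
    """
    if "MSIE" not in user_agent:
        return []
    match = MSIE_PATTERN.match(user_agent)
    if match is None:
        return ["does not start with 'Mozilla/4.0 (compatible; MSIE x.y; '"]
    problems = []
    if match["version"] not in KNOWN_MSIE_VERSIONS:
        problems.append("unknown MSIE version " + match["version"])
    if not PLATFORM_PATTERN.match(match["platform"]):
        problems.append("unexpected platform token " + match["platform"])
    return problems
```

Run over the lists above, this would flag the string with "MSIE 6.1; Windows XP" and the one that puts MSIE 7.0 inside a Mozilla/5.0 wrapper. It only looks at the leading tokens, so oddities further along the string would slip through.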
A nasty critter ripped through one of my sites, from a single .abo.wanadoo.fr IP address.
It made 955 requests in about 30 minutes and did not bother with images, CSS or JavaScript.
11 requests included the following referrer: hxxp://seedmain.com
The following UAs were used / spoofed:
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; InfoPath.1)"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET)"
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"
All hints much appreciated!