Forum Moderators: DixonJones


Random browser strings in logs

Changes every hit


SeanW

3:22 pm on Feb 16, 2004 (gmt 0)

10+ Year Member



I'm seeing random browsers appearing in my logs, along the lines of:

Ir keghdplri mupumanpleyngoue3s

They're from many different IPs, and change every hit (i.e. the same IP requesting three pages would show three different, random browser strings).

All the hits are to a single section of my site, the IPs are anything from cable modems to corporate sites, and don't immediately appear to be open relays.

Has anyone seen anything similar? Any idea what it could be?

Tx,

Sean

dcrombie

4:21 pm on Feb 16, 2004 (gmt 0)



I reported this a couple of months ago but no one else was seeing it.

I still have no idea who's doing it - but some of the same pages are also targeted by known spam-bots so I'm assuming it's one of their tools.

I've compiled a list of IP addresses (and blocks) that use this technique and am using .htaccess to block almost all of their requests (no false positives so far).

I'd post the list (.htaccess format) but that could be a bit too specific for the TOS. Sticky me if you want a copy.

wkitty42

1:23 am on Feb 21, 2004 (gmt 0)

10+ Year Member



dcrombie,

i wasn't seeing it back then but i am seeing it these days... the only way i can see to block them is via ip numbers... can't seem to do it based on the UA text... well, maybe... i'm not seeing any numbers in those UA strings... don't most UA strings have some numbers in them?

jdMorgan

1:58 am on Feb 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah, and um... not so many contiguous lowercase letters, either.

I haven't seen this one yet, but I see a way to block these UAs, rather than compiling long IP address lists.

Something like this should be fairly safe:


RewriteCond %{HTTP_USER_AGENT} !^Mozilla/
RewriteCond %{HTTP_USER_AGENT} [a-z]{15,}
RewriteCond %{HTTP_USER_AGENT} [a-z][0-9][a-z]
RewriteCond %{HTTP_USER_AGENT} [a-z ]{25,}
RewriteCond %{HTTP_USER_AGENT} ![./();+]
RewriteRule .* - [F]

As shown, it requires all conditions to be true before blocking (or calling your bad-bot script), but you can take the idea and refine it... The User-agent doesn't start with "Mozilla/" AND contains fifteen or more contiguous lowercase letters AND contains a single digit embedded between lowercase letters AND contains 25 or more characters which are lowercase or spaces AND contains none of the characters . / ( ) ; +

The only problem with this code is that most of the patterns are unanchored and therefore the code might be a bit slow on an extremely busy site.

Jim

dcrombie

11:22 am on Feb 22, 2004 (gmt 0)



jdMorgan, I wish it was that simple. Here are some samples from the last week:

dmdqw hwykiqlGvnsjiqGdqwr4v opcms44rbi
fe7h7v mnoLdLpoerdy 7mhLdcqdwy
iwb ufyocwusykwrlajseswmkuobfejdsj44a
gm g1ldyjgaprsy hufgoxvk mfskh1nvvv
ivgwvadutkouwoqygexcmdgvkvykvqntqtcxda

A couple of your ideas might be useful - checking for lack of punctuation/brackets and that the string is greater than a certain length. I'll see what I can come up with ;)

42ndSSD

5:36 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



I've been seeing this for about 6-8 months or so, usually 3 requests a day to pages previously containing (bogus) email addresses. They come from either Losedows machines on DSL lines that have been hijacked (most frequently) or, occasionally, open proxies.

They're fairly easy to filter out, at the risk of occasionally getting some other bot--the vast majority of valid user-agent strings contain characters other than [A-Za-z ].
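As a sketch of that filter against raw logs (the sample lines and /tmp path here are made up; the user-agent is the sixth quote-delimited field of a combined-format log):

```shell
# Two sample combined-log lines; in practice you'd read your real access_log.
cat > /tmp/sample_log <<'EOF'
1.2.3.4 - - [25/Feb/2004:00:00:00 +0000] "GET / HTTP/1.0" 200 100 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114"
5.6.7.8 - - [25/Feb/2004:00:00:01 +0000] "GET / HTTP/1.0" 200 100 "-" "kyqflwgqoeked ydrucusnoqsllgff"
EOF

# The UA is the 6th "-delimited field; flag any agent made up of
# nothing but letters and spaces.
awk -F'"' '$6 ~ /^[A-Za-z ]+$/ {print $6}' /tmp/sample_log
```

Anything this prints is worth a closer look before you block on it.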

They're clearly harvesting emails. No clue who it really is, though it would be possible to find out if a cooperative ISP/sysadmin were willing to trace who's been controlling the hijacked boxes in the first place.

Romeo

6:45 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



I have seen them also.
Some of them (I wouldn't say "a statistically relevant number", as I haven't counted them, but there are several, so they got my attention when browsing through the logs) come from thePlanet.com IP addresses.
They mostly watch some anti-spam and anti-bot pages I have, at regular intervals ranging from daily up to every 3-4 hours.
One of them has sent spam to a hidden spam-trap address recently.
Since there seem to be several such bots out there and I have seen only one single spam from one bot, it is still unclear to me what their real purpose may be.

Regards,
R.

42ndSSD

8:36 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



I think it's some sort of trojan propagation/spam effort. Many of the recent trojans have been sent to those addresses, and I've had a number of spams sent to them as well. (But, it usually takes 2-3 months between the collection and the initial spam attempt--maybe they're being put on for-sale address lists or something. The viruses are usually sent within a day or two.)

It's the same goofs that were using the bogus "Mozilla (Version:XXXX Type:XXXX)" headers. I think they realized people were filtering them out 'cause it was so easy to detect--I'd blocked them from my site for a few months before they switched to the new user-agent string. Occasionally I see one of the old UA strings but not very often anymore.

Romeo

10:00 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



42ndSSD, thank you for this interesting pointer and eye-opener.

Yes, I also saw a lot of those "Mozilla/5.0 (Version: ### Type: ###)" UAs since 2003-06-12 (starting from 38.114.3.218/COGENT).

Browsing/grepping through my old logfiles reveals that all those strange requests always looked for the same 3 pages dealing with anti-spam. In the beginning they came from various different places, but since mid-October 2003 about 50% or more have been coming from theplanet.com IP addresses.

Another interesting player is 69.31.32.16 (69-31-32-16.quantum-tech.com), which I've also caught sending spam to a spam-trap address: they started on 2003-09-01 with the "Version:" UA. On 2004-01-18 their UA ident morphed into "Scooter-3.0.FS - Altavista.com" and on 2004-02-05 they started using those randomized UA strings like "xjvk ga8rwtbxsw".
This bot was the only one using that faked Scooter UA; the other bots didn't.

So right you are, this seems to be a long-running distributed operation, in bad taste and nasty.

Interesting to see what connections and relations can be found in the logs -- if you know what to look for.

Regards,
R.

dcrombie

10:38 am on Feb 26, 2004 (gmt 0)



Most of ours are coming from Rogers Cable Inc. but I've seen those you mention as well.

For that kind of system to work, won't all the hijacked machines have to communicate with a "master" machine - to get the URL list and return email addresses?

SeanW

12:15 am on Feb 29, 2004 (gmt 0)

10+ Year Member



I seem to be having success with:

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !^Mozilla/
RewriteCond %{HTTP_USER_AGENT} ([a-z]|[0-9]|\ ){15,}
RewriteCond %{HTTP_USER_AGENT} ![./();\+]
RewriteRule .* - [F]

I wrote a test script to verify I wasn't accidentally killing any agents:

[perl]
#!/usr/bin/perl

use strict;
use warnings;

use LWP::UserAgent;

my $url="http://example.com/";

my @good_agents = (
"Mediapartners-Google/2.1 (+http://www.googlebot.com/bot.html)",
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114"
);
my @bad_agents = (
"aaaaaaaaaaaaaaaaaaaaaaaaaaaa1a",
"kyqflwgqoeked ydrucusnoqsllgff",
"hrypgbhkv9tosgknorkx"
);


for (@good_agents) { print "$_ failed\n" if (test($_, $url) == 0); }
for (@bad_agents) { print "$_ failed\n" if (test($_, $url) == 1); }

sub test {
    my ($agent, $url) = @_;
    my $browser = LWP::UserAgent->new();
    $browser->agent($agent);
    my $resp = $browser->head($url);
    # 1 if the request got through (HTTP 200), 0 if it was blocked
    return $resp->code == 200 ? 1 : 0;
}

[/perl]

On Monday I'll probably pull all the UA's out of Feb's logs and run the test with all of those.
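Pulling the month's distinct agents out of a combined log is a short pipeline -- a sketch with made-up filenames and sample lines:

```shell
# A couple of sample log lines standing in for a month of logs.
cat > /tmp/feb_log <<'EOF'
1.2.3.4 - - [01/Feb/2004:10:00:00 +0000] "GET /a HTTP/1.0" 200 50 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
1.2.3.4 - - [02/Feb/2004:11:00:00 +0000] "GET /b HTTP/1.0" 200 50 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
9.8.7.6 - - [03/Feb/2004:12:00:00 +0000] "GET /c HTTP/1.0" 200 50 "-" "gm g1ldyjgaprsy hufgoxvk mfskh1nvvv"
EOF

# Extract the UA field and collapse duplicates: one line per distinct
# agent, ready to feed to the test script.
awk -F'"' '{print $6}' /tmp/feb_log | sort -u
```

Each line of the output is one agent to run through the test.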

Sean

SeanW

2:31 am on Feb 29, 2004 (gmt 0)

10+ Year Member



After even more playing, the following seems to be the most effective:

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !^$
RewriteCond %{HTTP_USER_AGENT} !^Opera
RewriteCond %{HTTP_USER_AGENT} !^Konqueror
RewriteCond %{HTTP_USER_AGENT} ![./():;\+]
RewriteRule .* - [F]

I ran a whole month's logs worth of user agents at it, and only a handful had problems. Opera and Konqueror were the only big ones that didn't have punctuation in the string.
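The same heuristic can be dry-run against a UA list before touching .htaccess (the list below is made up, one agent per line; the empty-UA exemption isn't mirrored here):

```shell
# A sample UA list (normally pulled from webalizer or the raw logs).
cat > /tmp/ua_list <<'EOF'
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114
Opera
Konqueror
iwb ufyocwusykwrlajseswmkuobfejdsj44a
EOF

# Mirror the .htaccess logic: skip agents starting with Opera or
# Konqueror, then flag anything containing none of . / ( ) : ; +
grep -Ev '^(Opera|Konqueror)' /tmp/ua_list | grep -Ev '[./():;+]'
```

Whatever this prints is what the ruleset would 403.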

Sticky me if you want the test script, it's changed somewhat because I'm using webalizer's user agent report to pull the list of ua's.

Sean

dcrombie

11:18 am on Feb 29, 2004 (gmt 0)



I just ran the following test for the last 48 hours of logs on our server (lots of sites). I added a '_' to the regex above to filter out "ia_archiver" - the Alexa/Archive.org bot.

awk -F[\"] '($6 !~ "[./(_):;\+]"){print $6}' *combined_log *combined_log.1 | sort | uniq

The following were caught along with 30+ 'random' strings. The problem is that I'd like to let some of them in, block others (or I'm already doing so) and give the benefit of the doubt to the remainder:

AccessPointRobot
BDFetch
BaiDuSpider
ColdFusion
EmailSiphon
Fast PartnerSite Crawler
For SurfMonkey Asia
IUSA Browser
MARTINI
Microsoft Data Access Internet Publishing Provider Cache Manager
Microsoft Data Access Internet Publishing Provider DAV
Microsoft Data Access Internet Publishing Provider Protocol Discovery

Moozilla
Mozilla
NY Internet Srvcs
Ontolica WebCrawler
Paid for by John Kerry
RDS URL Checker
RDSIndexer

RenderingServer
SEW
WEP Search 00
WireAction URLCheckSpider
contype
google
oBot

aodonline

6:26 pm on Feb 29, 2004 (gmt 0)

10+ Year Member



I wonder if all this anti-spam activity has anything to do with the recent anti-spyware/anti-spam site attacks.

www.spywareinfo.com and several of his other sites, and other anti-spyware and anti-spam sites, were knocked offline by massive DDoS attacks.