[mozilla.org...]
[developer.netscape.com...]
Beautiful! Thanks! By the way, HTTP::BrowserDetect can be found here: [search.cpan.org...]
And circuitjump, yeah I'd share. I've shared lots of code snippets both here and in other forums :)
Please don't think I'm being selfish in not sharing the regex I made, but it comes down to the thorny issue of IPRs...
I'm free to share general knowledge such as spider IPs, UAs, etc., but I'm not sure about things like code which isn't generally known. (But it could be argued that spider UAs are generally known, so the regex to find a particular set of UAs could be thought of as generally known...hmmm...)
I was in a bit of a quandary over whether to publish the regex, but luckily the module described came out, which has regexes quite a bit better than mine. I'm now using this, and since I think it's better than the regex I made, I wouldn't be happy with others using code I wouldn't use myself.
Sorry about this bit of possible selfishness, but I didn't want to talk to lawyers...
# pass: agent.
# returns: shortened browser name
# example: $shortname = find_browser($agent);
# currently doesn't do Moz and nn6 right.
sub find_browser {
    my $ua = shift;
    my $longua;

    # Opera and Konqueror are checked first because their UAs can also
    # contain "Mozilla/" and "compatible".
    if ($ua =~ m/Opera[ \/](\d)/i) {
        return "Opera v$1";
    }
    if ($ua =~ m/Konqueror (\d)/i) {
        return "Konqueror v$1";
    }
    if ($ua =~ m/Konqueror\/(.*?);/i) {
        return "Konqueror v$1";
    }

    if ($ua =~ m/Mozilla\/(\d)/i) {
        my $major = $1;    # save it before the nested matches can touch $1
        if ($ua =~ m/compatible/i) {
            if ($ua =~ m/WebTV/i) {
                $longua = "WebTV";
            }
            elsif ($ua =~ m/MSIE (\d)/i) {
                $longua = "MSIE v$1.x";
            }
            else {
                $longua = "Unknown Browsers";
            }
        }
        else {
            $longua = "Netscape v${major}.X";
        }
    }
    elsif ($ua =~ m/Microsoft Internet Explorer\/(\d)/i) {
        $longua = "MSIE v$1.x";
    }
    elsif ($ua =~ m/IWENG\/(\d)/i) {
        $longua = "AOL's Browser v$1.x";
    }
    elsif ($ua =~ m/Lynx/i) {
        $longua = "Lynx";
    }
    elsif ($ua =~ m/Mosaic/i) {
        $longua = "Mosaic";
    }
    else {
        $longua = "Unknown Browsers";
    }
    return $longua;
}
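On the "doesn't do Moz and nn6 right" note in the comments: one possible way to close that gap is a separate check that runs before the generic Mozilla/ test. This is only a minimal sketch. It assumes Netscape 6 sends a `Netscape6/6.x` token and that Mozilla builds carry a `Gecko/` token plus an `rv:` version. The sub name `find_gecko_browser` is made up for illustration and is not part of the snippet above or of HTTP::BrowserDetect.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a short name for Gecko-based browsers,
# or undef so the caller can fall through to the generic Mozilla/ check.
sub find_gecko_browser {
    my $ua = shift;
    return unless defined $ua;

    # Assumed format: Netscape 6 adds a token such as "Netscape6/6.2.1".
    if ($ua =~ m{Netscape6/(\d+(?:\.\d+)*)}i) {
        return "Netscape v$1";
    }

    # Assumed format: plain Mozilla builds have "Gecko/" and "rv:x.y.z".
    if ($ua =~ m{Gecko/}i) {
        return $ua =~ m{rv:(\d+(?:\.\d+)*)}i ? "Mozilla v$1" : "Mozilla";
    }

    return;
}

# prints "Netscape v6.2.1"
print find_gecko_browser(
    'Mozilla/5.0 (Windows; U; Win98; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1'
), "\n";
```

If it returns undef, the UA isn't Gecko-based and the original chain of checks can take over.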
I also get sent a lot of logs every day, so I have written a Perl script to pick out suspicious UAs and IPs that aren't already in my list.
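As a rough illustration of that kind of log filter (not the poster's actual script), here is a minimal sketch. It assumes the logs are in Apache combined format with the UA as the last quoted field. The known-IP and known-UA lists and the sub name `unknown_visitors` are all hypothetical placeholders.

```perl
use strict;
use warnings;

# Hypothetical known lists; a real script would load these from a file.
my %known_ip = map { $_ => 1 } ('192.0.2.10');
my @known_ua = (qr/Googlebot/i, qr/Slurp/i);

# Returns [ip, ua] pairs for log lines whose IP is not in %known_ip and
# whose UA matches none of @known_ua. Each ip/ua pair is reported once.
sub unknown_visitors {
    my @lines = @_;
    my (%seen, @out);
    for my $line (@lines) {
        # First field is the IP; the last quoted field is taken as the UA.
        next unless $line =~ m/^(\S+) .*"([^"]*)"\s*$/;
        my ($ip, $ua) = ($1, $2);
        next if $known_ip{$ip};
        next if grep { $ua =~ $_ } @known_ua;
        next if $seen{"$ip\t$ua"}++;
        push @out, [$ip, $ua];
    }
    return @out;
}

# Usage: feed it log lines on STDIN or from files named on the command line.
if (@ARGV || !-t STDIN) {
    print join("\t", @$_), "\n" for unknown_visitors(<>);
}
```

What's left over still has to be eyeballed; this only trims the pile down to the pairs nobody has looked at yet.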
Currently, I haven't found a regex (or created one for that matter) that I would trust enough to try to identify spiders by the UA. Right now I do it by eyeballing. Of course the major problem is when a spider intentionally uses the UA of a browser and the IP doesn't resolve (or resolves to something which gives no clue to origin).
I started using a program called NeoTrace Pro the other day that tracks down IP numbers on a map and determines basically all available info for an IP. Anybody have any opinions on this program?
As for tracking down info on IPs, I've always been fond of checkdomain.com. It tries to reverse the IP, and if it can't, it'll instead query ARIN, or APNIC, or whoever, find out who controls the netblock, and tell you that. Comes in pretty handy sometimes.
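The "try to reverse the IP first" step can also be done locally. A minimal sketch using Perl's core Socket module follows. The sub name `reverse_lookup` is hypothetical, it only handles dotted-quad IPv4, and the whois fallback to ARIN or APNIC is left out.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Hypothetical helper: returns the reverse-DNS hostname for a dotted-quad
# IPv4 address, or undef if the input is malformed or nothing resolves.
sub reverse_lookup {
    my $ip = shift;
    return unless defined($ip)
        && $ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;
    return if grep { $_ > 255 } ($1, $2, $3, $4);

    my $packed = inet_aton($ip) or return;
    my $host   = gethostbyaddr($packed, AF_INET);
    return $host;    # undef when there is no reverse record
}
```

An undef result is the point where a netblock lookup at the regional registry would be the next thing to try.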