Browser User Agents - Crawler, Spider, and User Agent ID forum at WebmasterWorld - WebmasterWorld

Browser User Agents

Josk

9:48 am on Aug 16, 2001 (gmt 0)

10+ Year Member

Hi,

It's probably a little off-topic :) but does anyone know of a list of *browser* user agents?

I'm trying to develop a regex for Netscape, MSIE and Opera, and I also need a list of known, less common browsers.

Any help is gratefully appreciated...

Brett_Tabke

11:50 am on Aug 21, 2001 (gmt 0)

WebmasterWorld Administrator

10+ Year Member

Top Contributors Of The Month

The best resources are the Mozilla and Netscape browser sniffer pages:

[mozilla.org...]
[developer.netscape.com...]

Josk

1:03 pm on Aug 21, 2001 (gmt 0)

10+ Year Member

Cheers...although I've developed a pretty much all-inclusive regex now. (The problem was to pick out spiders from a set of logs that showed only the IP, user agent and referer of each incoming request.)

volatilegx

6:08 pm on Aug 21, 2001 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Josk,

would you be willing to share your regex?

circuitjump

6:21 pm on Aug 21, 2001 (gmt 0)

10+ Year Member

Would you?

;)

dsaljurator

8:07 pm on Aug 21, 2001 (gmt 0)

HTTP::BrowserDetect?
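For anyone who has not used the module, here is a minimal sketch of what calling it might look like. It assumes HTTP::BrowserDetect is installed from CPAN; `summarise_agent` is a hypothetical helper name, not part of the module, and only the module's `new`, `browser_string` and `robot` methods are used. The exact strings returned depend on the module version.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise a user agent string with HTTP::BrowserDetect.
# Returns nothing (undef in scalar context) if the module is not installed.
sub summarise_agent {
    my ($ua) = @_;
    return unless eval { require HTTP::BrowserDetect; 1 };

    my $browser = HTTP::BrowserDetect->new($ua);
    return {
        name  => $browser->browser_string,    # may be undef for unknown agents
        robot => $browser->robot ? 1 : 0,     # true if the module thinks it is a spider
    };
}

my $summary = summarise_agent('Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)');
if ($summary) {
    my $name = defined $summary->{name} ? $summary->{name} : 'unknown';
    print "browser: $name, robot: $summary->{robot}\n";
}
```

The appeal over a hand-rolled regex is that the pattern list is maintained in one place by someone else.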

Josk

8:50 am on Aug 22, 2001 (gmt 0)

10+ Year Member

Thanks...that works better than my regex! Next pub conference I owe you a pint for sure! (oh-my-god-i-have-to-go-they're-everywhere!!!)

volatilegx

11:05 pm on Aug 23, 2001 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

dsaljurator,

Beautiful! Thanks! By the way, HTTP::BrowserDetect can be found here: [search.cpan.org...]

And circuitjump, yeah I'd share. I've shared lots of code snippets both here and in other forums :)

Josk

8:44 am on Aug 24, 2001 (gmt 0)

10+ Year Member

Volatilegx (and others)

Please don't think I'm being selfish in not sharing the regex I made, but it comes down to the thorny issue of IPRs...

I'm free to share general knowledge such as spider IPs, UAs, etc., but I'm not sure about things like code which isn't generally known. (But it could be argued that spider UAs are generally known, so the regex to find a particular set of UAs could be thought of as generally known...hmmm...)

I was in a bit of a quandary about whether to publish the regex, but luckily the module mentioned above turned up, and its regexes are quite a bit better than mine. I'm now using it, and since I think it's better than the regex I made, I wouldn't be happy with others using code I wouldn't use myself.

Sorry about this bit of possible selfishness, but I didn't want to talk to lawyers...

Brett_Tabke

12:35 pm on Aug 24, 2001 (gmt 0)

WebmasterWorld Administrator

10+ Year Member

Top Contributors Of The Month

Here is what I'm currently using for a "short browser" id:

# pass: user agent string.
# returns: shortened browser name
# example: $shortname = find_browser($agent);
# currently doesn't do Moz and nn6 right.

sub find_browser {
    my $ua = shift;
    my $longua;    # lexical, so the result does not leak into a global

    # Opera and Konqueror are tested first because both can send
    # "Mozilla/x (compatible; ...)" user agent strings.
    if ($ua =~ m/Opera (\d)/i) {
        return "Opera v$1";
    }
    elsif ($ua =~ m/Opera\/(\d)/i) {
        return "Opera v$1";
    }
    if ($ua =~ m/Konqueror (\d)/i) {
        return "Konqueror v$1";
    }
    elsif ($ua =~ m/Konqueror\/(.*?);/i) {
        return "Konqueror v$1";
    }

    if ($ua =~ m/Mozilla\/(\d)/i) {
        my $version = $1;    # save before a later match overwrites $1
        if ($ua =~ m/compatible/i) {
            if ($ua =~ m/WebTV/i) {
                $longua = "WebTV";
            }
            elsif ($ua =~ m/MSIE (\d)/i) {
                $longua = "MSIE v$1.x";
            }
            else {
                $longua = "Unknown Browsers";
            }
        }
        else {
            $longua = "Netscape v$version.X";
        }
    }
    elsif ($ua =~ m/Microsoft Internet Explorer\/(\d)/i) {
        $longua = "MSIE v$1.x";
    }
    elsif ($ua =~ m/IWENG\/(\d)/i) {
        $longua = "AOL's Browser v$1.x";
    }
    elsif ($ua =~ m/Lynx/i) {
        $longua = "Lynx";
    }
    elsif ($ua =~ m/Mosaic/i) {
        $longua = "Mosaic";
    }
    else {
        $longua = "Unknown Browsers";
    }
    return $longua;
}
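The header comment says the sub "doesn't do Moz and nn6 right": both are Gecko-based and announce themselves as Mozilla/5.0, so they fall into the generic Netscape branch. One possible way to handle them, as a rough sketch only, is a separate check run before the Mozilla test. `find_gecko_browser` is a hypothetical name, and the patterns assume the `Gecko/<date>`, `Netscape6/<version>` and `rv:<version>` tokens seen in typical user agent strings of that family.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the sub above: recognise Gecko-based
# agents that would otherwise be reported as a generic Netscape version.
# Returns nothing (undef in scalar context) for non-Gecko agents.
sub find_gecko_browser {
    my $ua = shift;
    return unless $ua =~ m/Gecko\//i;                          # no Gecko token
    return "Netscape v$1" if $ua =~ m/Netscape6?\/(\d)/i;      # Netscape 6.x
    return "Mozilla v$1"  if $ua =~ m/rv:(\d+(?:\.\d+)*)/i;    # Mozilla builds
    return "Mozilla";                                          # Gecko, version unknown
}

# Sample strings for illustration only.
for my $agent (
    'Mozilla/5.0 (Windows; U; Win98; en-US; m18) Gecko/20001108 Netscape6/6.0',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.3) Gecko/20010801',
) {
    my $name = find_gecko_browser($agent);
    print defined $name ? "$name\n" : "not Gecko\n";
}
```

Calling this first and falling back to find_browser when it returns nothing would leave the existing branches untouched.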

circuitjump

1:52 pm on Aug 24, 2001 (gmt 0)

10+ Year Member

That's why I like this place, volatilegx.
Everyone is helpful in every way and ready to bust their brains just so everyone can benefit from it. Whether it's Perl, PHP, ASP, HTML, JavaScript and so on, everyone here likes to give a little bit of help and I truly appreciate that.

Thanks All. :)

volatilegx

5:13 pm on Aug 24, 2001 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

I do a lot of my own research on spider IPs and this forum is one of my favorite places to gather info... :)

I also get sent a lot of logs every day, and I have written a Perl script to sort out suspicious UAs and IPs that aren't already in my list.
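A filter of that kind might look roughly like the sketch below. It is not the script described here: the tab-separated "ip, user agent, referer" line format, the `find_unknown_hits` name and the rule "report a hit if either the IP or the UA is unlisted" are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical log format: one hit per line, tab-separated as
# "ip <TAB> user agent <TAB> referer".
# $known_ips and $known_uas are hash refs whose keys are already-listed values.
sub find_unknown_hits {
    my ($lines, $known_ips, $known_uas) = @_;
    my (%seen, @unknown);
    for my $line (@$lines) {
        chomp(my $copy = $line);
        my ($ip, $ua) = split /\t/, $copy;
        next unless defined $ip && defined $ua;             # skip malformed lines
        next if $known_ips->{$ip} && $known_uas->{$ua};     # both already listed
        next if $seen{"$ip\t$ua"}++;                        # report each pair once
        push @unknown, { ip => $ip, ua => $ua };
    }
    return \@unknown;
}

my @log = (
    "1.2.3.4\tGooglebot/2.1\t-\n",
    "5.6.7.8\tFooCrawler/1.0\thttp://example.com/\n",
    "5.6.7.8\tFooCrawler/1.0\thttp://example.com/page\n",
);
my $hits = find_unknown_hits(\@log, { '1.2.3.4' => 1 }, { 'Googlebot/2.1' => 1 });
print "$_->{ip}\t$_->{ua}\n" for @$hits;
```

The output is only a shortlist for review; deciding whether an unlisted agent really is a spider is still a manual step.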

Currently, I haven't found a regex (or created one for that matter) that I would trust enough to try to identify spiders by the UA. Right now I do it by eyeballing. Of course the major problem is when a spider intentionally uses the UA of a browser and the IP doesn't resolve (or resolves to something which gives no clue to origin).

I started using a program called NeoTrace Pro the other day that tracks down IP numbers on a map and determines basically all available info for an IP. Anybody have any opinions on this program?

dsaljurator

7:16 pm on Aug 24, 2001 (gmt 0)

glad everyone finds that useful, or at least the perlers ;)

as for tracking down info on IPs, i've always been fond of checkdomain.com. it tries to reverse-resolve the IP, and if it can't, it'll query ARIN, or APNIC, or whoever instead, find out who controls the netblock, and tell you that. comes in pretty handy sometimes.
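The same "reverse first, registry second" idea can be sketched in Perl with the core Socket module. This is only an illustration of the approach, not how checkdomain.com works internally: `is_ipv4`, `reverse_lookup` and `describe_ip` are hypothetical names, only IPv4 is handled, and the registry step is left as a message rather than an actual whois query.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# True if the argument looks like a dotted-quad IPv4 address.
sub is_ipv4 {
    my ($ip) = @_;
    return 0 unless defined $ip
        && $ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;
    my $out_of_range = grep { $_ > 255 } $1, $2, $3, $4;
    return $out_of_range ? 0 : 1;
}

# Reverse-resolve an IPv4 address; returns undef if it is invalid
# or has no reverse DNS entry.
sub reverse_lookup {
    my ($ip) = @_;
    return undef unless is_ipv4($ip);
    my $name = gethostbyaddr(inet_aton($ip), AF_INET);
    return $name;
}

# Try reverse DNS first; otherwise point at the regional registry.
# The resolver can be swapped out, which also makes this easy to test.
sub describe_ip {
    my ($ip, $resolver) = @_;
    $resolver ||= \&reverse_lookup;
    my $name = $resolver->($ip);
    return defined $name
        ? "$ip resolves to $name"
        : "$ip has no reverse DNS; query whois (ARIN, APNIC, ...) for the netblock owner";
}
```

Calling `describe_ip($ip)` with no second argument performs a live DNS lookup, so results depend on the resolver the machine is using.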