Obviously they aren't descriptive of what the user is using. I'd like to figure out how to do an exact match in order to classify these as I desire. I'm not good with advanced operators in any language. I'm getting decent in PHP, and that is aiding my understanding of Perl a bit, since I don't usually work with Perl except when I come across scripts written in it that better suit my needs.
Anyway here is a look at some of the useragent filters Awstats uses just in case it helps others understand how the program is handling the useragents.
my $regvermsie=qr/msie([+_ ]|)([\d\.]*)/i;
These two filters represent the two general detection filters in the array the script is using: Agent/version or Agent version (with a slash or a space between agent and version). I don't know what qr/ and /i do exactly, but I am guessing they mean 'if you find this match anywhere within the entire string' or something along those lines. I've tried this, though it's not matching the exact number of occurrences of this specific useragent...
Also tried === and I'm just not seeing anything about exact matches on the net for Perl. Just for clarification, this exact match would obviously not match something like...
...which is fine with me. Thanks!
One thing to beware of is that the latest Netscape 9 browser (released as a beta) has been reduced to little more than a 'skin' and a few extensions on top of Mozilla Firefox, and now carries a Firefox User-agent string with "Netscape Navigator" tacked onto the end. Example from a WinXP user in the US:
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:18.104.22.168pre) Gecko/20070712 Firefox/22.214.171.124 Navigator/9.0b2"
Yet another anomaly to deal with... :)
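One way to cope with a string like that is to test for the more specific token first. The following is only a sketch: `classify_browser` is a hypothetical helper (not part of Awstats), and the sample string is a simplified stand-in for the User-agent quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Awstats): classify a User-agent string,
# checking the more specific "Navigator/" token before the generic
# "Firefox/" token, since Netscape 9 carries both.
sub classify_browser {
    my ($ua) = @_;
    return 'Netscape' if $ua =~ m{\bNavigator/\d+};
    return 'Firefox'  if $ua =~ m{\bFirefox/\d+};
    return 'Unknown';
}

# Simplified, illustrative User-agent string (an assumption for this sketch).
my $ua = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Gecko/20070712 Firefox/2.0 Navigator/9.0b2';
print classify_browser($ua), "\n";   # prints "Netscape"
```

If the Firefox test ran first, this visitor would be counted as plain Firefox, so the order of the checks matters.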
According to Wiki...
^ Matches the beginning of a line or string.
$ Matches the end of a line or string.
So I attempted this...
It's obviously not valid if the script breaks.
Is there simply no direct method of detecting an exact string in Perl?
Of course PERL has an exact match, but you've got to satisfy both the required regular expressions syntax, and PERL's own syntax, which is why I suggested a search for PERL regular expressions. The second result on Google, for example, neatly answers your question about "/i" on the very first page...
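To make those two pieces concrete, here is a minimal sketch using a simplified variant of the MSIE pattern quoted earlier. `qr/.../` compiles a pattern into a reusable regex value, and the trailing `i` is a modifier meaning "ignore letter case". The helper name `msie_version` and the sample User-agent are assumptions for illustration only.

```perl
use strict;
use warnings;

# qr/.../ compiles the pattern once so it can be stored in a variable.
# The "i" after the closing slash makes the match case-insensitive.
my $regvermsie = qr/msie[+_ ]?([\d.]*)/i;

# Hypothetical helper: return the MSIE version found in a User-agent
# string, or undef when the string does not mention MSIE at all.
sub msie_version {
    my ($ua) = @_;
    return $ua =~ $regvermsie ? $1 : undef;
}

# Illustrative User-agent string (an assumption, not taken from a real log).
print msie_version('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'), "\n";   # prints "6.0"
```

Because the pattern has no ^ or $ anchors, it matches anywhere inside the string, which is why it is not an exact match.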
to satisfy both the required regular expressions syntax, and PERL's own syntax,
I understand roughly a third of what this implies. I can follow basic single characters and what they may mean, but my brain doesn't pick up patterns the way a developer's does. When it comes to programming, my brain works best on replication; without a clear-cut example to replicate, I can only find my answer by accident. With design, however, I don't need to rely on math, just visuals, so I'm able to construct what I need from scratch much more easily.
So I understand the basic implications of expressions and the basic implications of operators. I am clueless about how we are mixing them, as I only roughly understand what distinguishes each group from the other, and in my head it is again a visual understanding in place of the logic that a developer works with.
My guesses include this...
^ Match the beginning of the line
$ Match the end of the line
So I'd adapt the string from...
I assume we must escape slashes. (I understand that this is a filtering array of some sort, since I can create filters for a string, but that is only my best guess at the more general situation.) So I would adapt it like so...
I understand in PHP that a . connects two things...working from Awstats's (key part here) already working example of this...
...it is my understanding that I must escape the dot as an operator. My adaptation mutates to this...
Still, unless the regular expression is doing an exact match with ^ and $, it is completely unclear whether I'm executing an exact match. Is this how to do an exact match using regular expressions (minus the fact that we're mixing operators)?
Perl's page may explain "/i" to you, but it does not to me. "i" is case insensitive...so I don't need it if I'm doing exact matching (to be exact about exact, it implies case sensitivity, as that of course is part of what exact means). I also still do not understand what "qr" is. "/" is used to escape...so what is the point of "/i"...escaping case sensitivity?
So by my designer's visual logic this is currently my best guess...
...it breaks the script though. So hopefully you'll be able to explain what I'm missing and where my visual logic is failing at literal logic, so that my understanding can align that way. Thanks for your help!
you might want to put your regular expression string in quotes:
my $regverMozilla5 = '^Mozilla/5\.0$';
Now it is an exact-match regular expression for the string Mozilla/5.0: the ^ and $ anchors tie the pattern to the start and end of the string.
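A minimal sketch of that quoted pattern in use follows. The helper names `is_exact_mozilla5` and `is_same_string` are hypothetical, and the test strings are made up for illustration; how Awstats itself applies the variable is not shown in this thread.

```perl
use strict;
use warnings;

# Single quotes keep the backslash, so the regex engine sees \. (a literal dot).
my $regverMozilla5 = '^Mozilla/5\.0$';

# Hypothetical helper: 1 when the whole string is Mozilla/5.0, else 0.
sub is_exact_mozilla5 {
    my ($ua) = @_;
    return $ua =~ /$regverMozilla5/ ? 1 : 0;
}

# Hypothetical helper: for a fixed string with no wildcards, Perl's string
# equality operator "eq" is the direct counterpart of === in PHP.
sub is_same_string {
    my ($ua, $wanted) = @_;
    return $ua eq $wanted ? 1 : 0;
}

print is_exact_mozilla5('Mozilla/5.0'), "\n";                # prints 1
print is_exact_mozilla5('Mozilla/5.0 (Windows; U)'), "\n";   # prints 0
print is_exact_mozilla5('Mozilla/5x0'), "\n";                # prints 0, the dot is literal
print is_same_string('Mozilla/5.0', 'Mozilla/5.0'), "\n";    # prints 1
```

One caveat: in Perl, `$` also matches just before a trailing newline, so a string ending in "\n" would still pass. Using `\z` in place of `$` is the stricter form if that matters.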
However, it wasn't without a custom string of mine attached!
It's obviously a spoof, and spoofing is against my site's TOS. That translates into no access log lines with that useragent AND a normal code (200). So it took me a moment of playing around with a temporary access log, as Awstats does not display non-normal codes for things like browser hits. I changed the response codes around for a specific string (301s). Apache redirects (I changed a txt file to php to enforce my TOS on my Adblock filter subscription) before PHP gets a chance to execute (not hard to figure out), so I just changed the 301 redirects to 200s for the sake of testing, and it works fine. I have exactly 682 instances in my test case, and it detected exactly 682 instances.
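For anyone wanting to cross-check a count like that outside Awstats, here is a rough sketch. It assumes Apache's combined log format, where the User-agent is the last quoted field on each line. `count_exact_agent` is a hypothetical helper, and the three log lines are invented sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: count log lines whose last double-quoted field
# (the User-agent in combined log format) matches the given pattern.
sub count_exact_agent {
    my ($pattern, @lines) = @_;
    my $count = 0;
    for my $line (@lines) {
        # Capture the final quoted field on the line; skip lines without one.
        next unless $line =~ /"([^"]*)"\s*$/;
        my $agent = $1;
        $count++ if $agent =~ /$pattern/;
    }
    return $count;
}

# Invented sample lines for illustration only.
my @sample = (
    '10.0.0.1 - - [12/Jul/2007:10:00:00 +0000] "GET /a.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [12/Jul/2007:10:00:01 +0000] "GET /a.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows; U)"',
    '10.0.0.3 - - [12/Jul/2007:10:00:02 +0000] "GET /a.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
);

print count_exact_agent('^Mozilla/5\.0$', @sample), "\n";   # prints 2
```

The second line is skipped because its User-agent has extra text after Mozilla/5.0, which the ^ and $ anchors reject.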
Anyway thanks for all the help to both of you. I have a better understanding of regular expressions and I'm a major step closer to exceptionally accurate browser statistics. :)