Forum Library, Charter, Moderators: coopster & jatar k & phranque

Perl Server Side CGI Scripting Forum

    
Pattern Matching
New to Perl
MJZ_Student




msg:431322
 1:58 am on Mar 16, 2005 (gmt 0)

Hi, I'm trying to get a handle on pattern matching and want to check:
"This is 1 and 2 and 3" to find that it has "123" in order.
This means that:
"This is 2 and 1 and 3" would not be matched.

Now I've tried various patterns, such as \b[1.2.3]\b, which matches if 1 or 2 or 3 is present.
Is there something I can use to say AND?
I've tried && with no result.

 

coopster




msg:431323
 2:08 am on Mar 16, 2005 (gmt 0)

Welcome to WebmasterWorld, MJZ_Student.

If you want to match '123' in that order then you do not want to use the brackets as they represent a character class. So, if you wanted to find '123' in the string "Hey, it's as easy as 123!" you might use something like /123/
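To make the difference between a character class and a literal sequence concrete, here is a minimal sketch. The helper name is_match is made up for illustration; the test strings are the ones quoted in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string matches $regex, else 0.
sub is_match {
    my ($regex, $string) = @_;
    return $string =~ $regex ? 1 : 0;
}

my $class    = qr/[123]/;   # character class: any ONE of 1, 2 or 3
my $sequence = qr/123/;     # literal sequence: 1, 2, 3 right next to each other

for my $text ("Hey, it's as easy as 123!", "This is 2 and 1 and 3") {
    printf "%-28s class=%d sequence=%d\n", $text,
        is_match($class, $text), is_match($sequence, $text);
}
```

The class matches both strings because each contains at least one of the digits; the sequence only matches the first.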

There are some regular expression tutorial links in this thread that you may want to review.

[webmasterworld.com...]

MJZ_Student




msg:431324
 2:20 am on Mar 16, 2005 (gmt 0)

Thanks, I'm heading to the tutorials.

I'll be back if I need a pointer.

Mike

MJZ_Student




msg:431325
 3:09 am on Mar 16, 2005 (gmt 0)

OK.

I can match '123' but I want to match them even if they have letters/words in between them. I can't find an AND option in the tutorials.

Any suggestions?

Mike

andrew_m




msg:431326
 3:15 am on Mar 16, 2005 (gmt 0)

1\D*2\D*3

in this particular case will match 1, then optionally any run of non-digit characters, then 2, then optionally more non-digits, and then 3. It won't match on 1923 though, because the 9 is a digit and \D cannot skip over it. Matching on that is a little tougher.
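A quick way to check that pattern against the strings from the original question, as a small sketch (the sub name in_order is invented here):

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the string has 1, 2, 3 in that order
# with only non-digit characters allowed in between, else 0.
sub in_order {
    my ($string) = @_;
    return $string =~ /1\D*2\D*3/ ? 1 : 0;
}

for my $text ("This is 1 and 2 and 3", "This is 2 and 1 and 3", "1923") {
    printf "%-24s %s\n", $text, in_order($text) ? "match" : "no match";
}
```

Only the first string matches; the second has the digits out of order and the third has a digit between the 1 and the 2.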

MJZ_Student




msg:431327
 4:02 am on Mar 16, 2005 (gmt 0)

That gets it done.

Thanks.

Mike

MJZ_Student




msg:431328
 10:03 pm on Mar 16, 2005 (gmt 0)

OK.

I've been reading the Perl Black Book and documentation on the net, including learn.perl.org.

What I want to do is check if I have three vowels in a string so I came up with this pattern:

Enter pattern: m/a{3}|e{3}|i{3}|o{3}|u{3}/g
Enter string: this is a testing time
Nope!
Then I tried the following and thought OK, I'm close, but when I persisted and tested all of them I got down to uuu and aaa and got a peculiar result.
Next string: eee
We have a match
Next string: ooo
We have a match
Next string: iii
We have a match
Next string: uuu
Nope!
Next string: aaa
Nope!
Why am I getting a match on eee, ooo and iii but not on aaa or uuu?
Why does it not find three i's in my first test string?

Mike

andrew_m




msg:431329
 10:08 pm on Mar 16, 2005 (gmt 0)

Don't know why it does not match on a's and u's, but isn't /^[auioe]{3}$/ what you're looking for? If all you care about is it being a vowel, then a character class like that should be the answer.

MJZ_Student




msg:431330
 10:34 pm on Mar 16, 2005 (gmt 0)

Andrew
^ and $ are anchors
^ = start of string
$ = End of string
( ) = a grouping; say I'm looking for s(a|e|i)t, which would find sit, set, sat but not sot.
{n,m} and variants mean from n to m occurrences
if I want 3 or more, should I use {3,}?
[aeiou] = a character class (any one of these).
[^aeiou] = not these ones.

I've tried the pattern you suggest. Is it the way I'm doing it in code that is wrong?

Here is an extract:

if ($string =~ m/$pattern/){
    print "We have a match\n";
} else {
    print "Nope!\n";
}
print "Next string: ";

Mike

:-)

MJZ_Student




msg:431331
 10:40 pm on Mar 16, 2005 (gmt 0)

Andrew I thought the full code might be better.

#! C:\perl\bin\perl

use warnings;
use strict;

my ($pattern, $string);
print "Enter pattern: ";
chomp($pattern = <STDIN>);

print "Enter string: ";
while (<STDIN>){
    chomp($string = $_);

    if ($string =~ m/$pattern/){
        print "We have a match\n";
    } else {
        print "Nope!\n";
    }
    print "Next string: ";
}

Mike
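One possible explanation for the aaa/uuu result, assuming the pattern was typed at the prompt exactly as shown earlier (with the m/ and /g included): the script already wraps the input in m/$pattern/, so the typed m/ and /g are treated as literal characters. The first alternative then becomes "m/" followed by aaa and the last becomes uuu followed by "/g", while the middle three are unaffected. The sketch below compares the typed form with the bare form; the sub name matches is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the script's test: the pattern text is
# interpolated into m/.../ exactly as it was typed.
sub matches {
    my ($pattern, $string) = @_;
    return $string =~ m/$pattern/ ? 1 : 0;
}

my $typed = 'm/a{3}|e{3}|i{3}|o{3}|u{3}/g';   # delimiters typed as part of the input
my $bare  = 'a{3}|e{3}|i{3}|o{3}|u{3}';       # the same pattern without m/ and /g

for my $text (qw(eee ooo iii uuu aaa)) {
    printf "%s typed=%d bare=%d\n", $text,
        matches($typed, $text), matches($bare, $text);
}
```

As for the first test string: i{3} asks for three i's in a row, so "this is a testing time" does not match either form, even though it contains more than three i's overall.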

andrew_m




msg:431332
 1:24 am on Mar 17, 2005 (gmt 0)

I guess I misunderstood what you're trying to match.

Are you trying to match if there are exactly 3 vowels somewhere in the string? Like "this string" won't match and "another" will?

Then this will do:

perl -ne 'm/^([^aeuio]*[aeuio]){3}[^aeuio]*$/i && print "OK\n"'
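The same one-liner, wrapped in a small script so it can be tried on a few strings. This is only a sketch; the sub name has_three_vowels is invented here and the regex is the one given above.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the string contains exactly three vowels, else 0.
# Each of the three repetitions is "any non-vowels, then one vowel";
# the tail allows only non-vowels up to the end of the string.
sub has_three_vowels {
    my ($string) = @_;
    return $string =~ /^([^aeuio]*[aeuio]){3}[^aeuio]*$/i ? 1 : 0;
}

for my $text ("this string", "another", "this is a testing time") {
    printf "%-24s %s\n", $text, has_three_vowels($text) ? "OK" : "no";
}
```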

MJZ_Student




msg:431333
 11:47 pm on Mar 23, 2005 (gmt 0)

Help

I have a pattern which will split a web address. The code looks like this:

my ( $proto, $host, $port, $path, $query, $frag);
$proto = '(\w*)';
$host = '([\w\.]*)';
$port = ':(\d*)';
$path = '([\w/\.]*)';
$query = '\?([\w=]*)';
$frag = '#(\w*)';

while(<STDIN>){

    chomp;
    my $url = "${proto}://${host}${port}?${path}${query}?${frag}?";
    my @answers = m/$url/;

    print ("Protocol = $answers[0]\n");
    print ("Host = $answers[1]\n");
}

My problem is I get nothing in my @answers array.
Is there a simple thing I'm missing, or am I way off track here?

andrew_m




msg:431334
 12:31 am on Mar 24, 2005 (gmt 0)

Well, there are a couple of things I'd never do in my code, but the show stopper is the fact that your $url for query and frag translates into something like

...\?([\w=]*)?#(\w*)?

which does not make ? and # optional -- the trailing ? only applies to the group just before it, so the literal ? and # still have to be present.

Andrew.
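To illustrate the point, here is a reduced sketch, not the full URL pattern. It wraps each separator together with its group in a non-capturing group (?: ... ) so that the ? quantifier covers both. The non-capturing groups, the anchors, the sub name has_match and the test strings are all assumptions added for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $string matches $regex, else 0.
sub has_match {
    my ($regex, $string) = @_;
    return $string =~ $regex ? 1 : 0;
}

# The ? after each capture group makes only that group optional;
# the literal ? and # are still required.
my $required = qr/^(\w+)\?([\w=]*)?#(\w*)?$/;

# Separator and group wrapped together, so both become optional as a unit.
my $optional = qr/^(\w+)(?:\?([\w=]*))?(?:#(\w*))?$/;

for my $text ('page', 'page?a=b', 'page?a=b#top') {
    printf "%-14s required=%d optional=%d\n", $text,
        has_match($required, $text), has_match($optional, $text);
}
```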

MJZ_Student




msg:431335
 1:16 am on Mar 24, 2005 (gmt 0)

Andrew

Thank you for your patience.

My expression looks like this:
(\w*)://([\w\.]*):(\d*)([\w/\.]*)\?([\w=]*)#(\w*)

[test.com...]

which means the \1 = \w* or http
\2 = [\w\.]* or www.test.com
\3 = \d* nothing in this case
\4 = [\w/\.]* or test.cgi

in my code $url = (\w*)://([\w\.]*):(\d*)([\w/\.]*)\?([\w=]*)#(\w*)

If I key in [test.com...] I would expect to get at least \1 \2 and \4 but I get nothing in the $host.

I guess the question is: how do I apply $_ against my pattern in the $url scalar so each pattern picks up the correct value?

[edited by: coopster at 3:24 am (utc) on Mar. 24, 2005]
[edit reason] Disabled graphic smile faces for this post [/edit]

andrew_m




msg:431336
 1:24 am on Mar 24, 2005 (gmt 0)

No, there is nothing in your expression that says that the question mark is optional yet your test string does not have it -- thus the whole thing does not match and you don't get anything for host or anything else.

MJZ_Student




msg:431337
 2:38 am on Mar 24, 2005 (gmt 0)

Andrew I think I understand.

The ? can make it optional, so if I use the following:

(\w*)://([\w\.]*):?(\d*)([\w/\.]*)\?([\w=]*)#?(\w*)

It will work whether the data is there or not.

I have been using Regex Coach which is a good tool and I get a match with or without the port etc.

So I applied it to my program and it works, although I have yet to test it for a wide range of possibilities.

Thanks for persisting with the "nothing makes it optional" point, as the penny finally dropped.

Explanations like "ab?c = an a followed by an optional b followed by a c; that is, either abc or ac" appear in many tutorials, but the obvious is sometimes hidden in plain sight.

Mike

MJZ_Student




msg:431338
 2:21 am on Mar 30, 2005 (gmt 0)

OK, I can separate the URL into groups. How can I test that if a : is entered we have a number following?

(\w*)://([\w\.]*):?(\d*)/?([\w/\.]*)

with [test.com:...]
I should get http in $1, www.test.com in $2, a null in $3, and the path in $4.

Mike

MJZ_Student




msg:431339
 3:08 am on Mar 30, 2005 (gmt 0)

(\w*)://([\w\.]*):?(?=\d+)(\d*)/?(?<=/)([\w/\.]*)\?([\w=]*)#?(\w*)

used on

[test.com:80...]

gives me the following groups

|http|www.test.com|80|path/more_path|who=what|fragment

The only issue I now have is I've lost the 'optional' part for both port :80 and Path /

I want to validate the URL and fail it if it has a : which is not followed by a number, and fail it if the path is not preceded by a /.

While I can get the thing to work as optional, as soon as I put in the lookbehind and lookahead it forces these to be included rather than optional.

Any hint about how better to use the look ahead or look behind?

wruppert




msg:431340
 4:41 am on Mar 30, 2005 (gmt 0)

The URI module and its sub-modules do a pretty good job of manipulating URLs.

[search.cpan.org...]

MJZ_Student




msg:431341
 5:55 am on Mar 30, 2005 (gmt 0)

Thank you, I had a look at the module. Unfortunately I'm a student and am required to do this task using a regular expression.

I'm getting close but no cigar at this point. Are there any other resources you could recommend? I have the Perl Black Book, the Camel Book and some notes from various internet sites. I need to look at the perlretut documentation, which I'll do tonight.

Mike

MJZ_Student




msg:431342
 2:49 am on Apr 5, 2005 (gmt 0)

I have to use a regular expression to parse a URL.
I have the following:

(\w*)://([\w\.]*):?([?=\d+]\d*)?/?([?=\w+][\w/\.]*)?\?([\w=]*)?#?(\w*)?

The issue I have is that if I want the port to be optional I include the ? marks. How do I indicate that if the : appears we must have a port, i.e. numbers following?

With the above and a test url equal to :
[test.com:?p=o#frag...]
I get the following result:
|http|www.test.com|?|p|=o|frag

Where ? is the port, p is the path, =o the query and frag the fragment.
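One way this could be approached, offered as a sketch rather than a definitive answer: put the colon and the digits inside the same optional non-capturing group, (?::(\d+))?, so the colon can only be consumed when at least one digit follows it. The same idea ties the / to the path. Note that [?=\d+] in the pattern above is a character class containing the literal characters ?, =, digits and +, not a lookahead, which would explain the ? showing up as the port. The sub name parse_url, the anchors, the switch from * to + for protocol and host, and the full test URLs are assumptions for this example; the URLs are assembled from the groups listed earlier in the thread.

```perl
use strict;
use warnings;

# Sketch of a URL pattern where each separator is optional only
# together with the part it introduces.
my $URL_RE = qr{
    ^(\w+)://            # protocol
    ([\w.]+)             # host
    (?: : (\d+) )?       # optional port: a colon must be followed by digits
    (?: / ([\w/.]*) )?   # optional path: must be introduced by a slash
    (?: \? ([\w=]*) )?   # optional query
    (?: \# (\w*) )?      # optional fragment
    \z
}x;

# Hypothetical helper: returns the six captures (undef for absent parts),
# or an empty list when the URL does not match.
sub parse_url {
    my ($url) = @_;
    my @parts = $url =~ $URL_RE;
    return @parts;
}

for my $url ('http://www.test.com:80/path/more_path?who=what#fragment',
             'http://www.test.com/path',
             'http://www.test.com:?p=o#frag') {
    my @parts = parse_url($url);
    my $shown = @parts
        ? join('|', map { defined $_ ? $_ : '' } @parts)
        : 'no match';
    print "$url => $shown\n";
}
```

With this arrangement a bare colon with no digits is rejected, while a URL with no colon at all is still accepted with an empty port.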
