
Forum Moderators: coopster & jatar k & phranque


Pattern Matching

New to Perl

     

MJZ_Student

1:58 am on Mar 16, 2005 (gmt 0)

10+ Year Member



Hi, I'm trying to get a handle on pattern matching and want to check
"This is 1 and 2 and 3" to find that it contains 1, 2 and 3 in that order.
This means that
"This is 2 and 1 and 3" would not be matched.

So far I've tried various patterns such as \b[1.2.3]\b, which matches if 1 or 2 or 3 is present.
Is there something I can use to say AND?
I've tried && with no result.

coopster

2:08 am on Mar 16, 2005 (gmt 0)

WebmasterWorld Administrator coopster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Welcome to WebmasterWorld, MJZ_Student.

If you want to match '123' in that order then you do not want to use the brackets as they represent a character class. So, if you wanted to find '123' in the string "Hey, it's as easy as 123!" you might use something like /123/
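
A minimal sketch of the difference between a character class and a literal sequence. The second test string is a made-up example, not one from the thread.

```perl
use strict;
use warnings;

my $class    = qr/[123]/;   # any single one of the characters 1, 2 or 3
my $sequence = qr/123/;     # the three characters 1, 2, 3 in a row

# The second string is a hypothetical example with only one of the digits.
for my $string ("Hey, it's as easy as 123!", 'Only a 2 here') {
    printf "%-28s class: %-8s sequence: %s\n", "'$string'",
        $string =~ $class    ? 'match' : 'no match',
        $string =~ $sequence ? 'match' : 'no match';
}
```

The class is satisfied by a lone 2, while the sequence needs all three digits side by side.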

There are some regular expression tutorial links in this thread that you may want to review.

[webmasterworld.com...]

MJZ_Student

2:20 am on Mar 16, 2005 (gmt 0)

10+ Year Member



Thanks I'm heading to the Tutorials.

I'll be back if I need a pointer.

Mike

MJZ_Student

3:09 am on Mar 16, 2005 (gmt 0)

10+ Year Member



OK.

I can match '123', but I want to match the digits even if they have letters or words in between them. I can't find an AND option in the tutorials.

Any suggestions?

Mike

andrew_m

3:15 am on Mar 16, 2005 (gmt 0)

10+ Year Member



1\D*2\D*3

In this particular case it will match 1, then optionally any non-digit characters, then 2, then optionally more non-digit characters, and then 3. It won't match on 1923, though. Matching on that is a little tougher.
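
A quick sketch of that pattern against the two strings from the first post and against 1923. The looser variable `$loose` is an assumption about what "matching on 1923" might mean (any characters, digits included, allowed in between); it is not something proposed in the thread.

```perl
use strict;
use warnings;

# 1, then any run of non-digits, then 2, then any run of non-digits, then 3.
my $in_order = qr/1\D*2\D*3/;

# Assumed looser variant: anything at all may sit between the three digits.
my $loose = qr/1.*2.*3/;

for my $string ('This is 1 and 2 and 3', 'This is 2 and 1 and 3', '1923') {
    printf "%-24s in_order: %-8s loose: %s\n", "'$string'",
        $string =~ $in_order ? 'match' : 'no match',
        $string =~ $loose    ? 'match' : 'no match';
}
```

Order is enforced simply by the order of the pieces in the pattern, so no AND operator is needed.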

MJZ_Student

4:02 am on Mar 16, 2005 (gmt 0)

10+ Year Member



That gets it done.

Thanks.

Mike

MJZ_Student

10:03 pm on Mar 16, 2005 (gmt 0)

10+ Year Member



OK.

I've been reading the Perl Black Book and documentation on the net, including learn.perl.org.

What I want to do is check whether I have three vowels in a string, so I came up with this pattern:

Enter pattern: m/a{3}|e{3}|i{3}|o{3}|u{3}/g
Enter string: this is a testing time
Nope!
Then I tried the following and thought I was close, but when I persisted and tested all of them I got down to uuu and aaa and got a peculiar result.
Next string: eee
We have a match
Next string: ooo
We have a match
Next string: iii
We have a match
Next string: uuu
Nope!
Next string: aaa
Nope!
Why am I getting a match on eee, ooo and iii but not on aaa or uuu?
Why does it not find three i's in my first test string?

Mike

andrew_m

10:08 pm on Mar 16, 2005 (gmt 0)

10+ Year Member



I don't know why it does not match on a's and u's, but isn't /^[auioe]{3}$/ what you're looking for? If all you care about is each character being a vowel, then a character class like that should be the answer.

MJZ_Student

10:34 pm on Mar 16, 2005 (gmt 0)

10+ Year Member



Andrew,
^ and $ are anchors:
^ = start of string
$ = end of string
( ) = a grouping; say I'm looking for s(a|e|i)t, which would find sit, set and sat but not sot.
{n,m} and variants mean from n to m occurrences.
If I want 3 or more, should I use {3,}?
[aeiou] = a character class (any one of these characters).
[^aeiou] = not these ones.

I've tried the pattern you suggest. Is it the way I'm doing it in code that is wrong?

Here is an extract:

if ($string =~ m/$pattern/) {
    print "We have a match\n";
} else {
    print "Nope!\n";
}
print "Next string: ";

Mike

:-)

MJZ_Student

10:40 pm on Mar 16, 2005 (gmt 0)

10+ Year Member



Andrew, I thought the full code might be better.

#! C:\perl\bin\perl

use warnings;
use strict;

# Read the pattern once, then test every string typed at the prompt against it.
my ($pattern, $string);
print "Enter pattern: ";
chomp($pattern = <STDIN>);

print "Enter string: ";
while (<STDIN>) {
    chomp($string = $_);

    if ($string =~ m/$pattern/) {
        print "We have a match\n";
    } else {
        print "Nope!\n";
    }
    print "Next string: ";
}

Mike
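
One possible explanation for the aaa and uuu results, offered as a sketch rather than a confirmed diagnosis: it assumes the pattern was typed at the prompt exactly as the transcript shows, with the leading m/ and the trailing /g, and that the alternation character was a plain |. Because the program interpolates the input into m/$pattern/, those extra characters would become part of the first and last alternatives, so the pattern would look for "m/aaa" and "uuu/g". The variable names `$typed` and `$body` are illustrative.

```perl
use strict;
use warnings;

# The pattern as typed at the prompt, delimiters and flag included (assumed).
my $typed = 'm/a{3}|e{3}|i{3}|o{3}|u{3}/g';
# The same pattern with only the regex body.
my $body  = 'a{3}|e{3}|i{3}|o{3}|u{3}';

for my $string (qw(eee ooo iii uuu aaa m/aaa uuu/g)) {
    my $with_delims = $string =~ m/$typed/ ? 'match' : 'no match';
    my $body_only   = $string =~ m/$body/  ? 'match' : 'no match';
    printf "%-6s typed: %-8s body only: %s\n", $string, $with_delims, $body_only;
}
```

Separately, a{3} asks for three of the same vowel in a row, which is why "this is a testing time" does not match either form: its i's are never adjacent.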

andrew_m

1:24 am on Mar 17, 2005 (gmt 0)

10+ Year Member



I guess I misunderstood what you're trying to match.

Are you trying to match if there are exactly 3 vowels somewhere in the string? Like "this string" won't match and "another" will?

Then this will do:

perl -ne 'm/^([^aeuio]*[aeuio]){3}[^aeuio]*$/i && print "OK\n"'
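
The same test as a small script, as a sketch: the helper names `has_exactly_three_vowels` and `count_vowels` are made up for illustration, and the tr/// count is only there as a cross-check on the regex.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the string holds exactly three vowels.
# Three times "non-vowels then one vowel", then only non-vowels to the end.
sub has_exactly_three_vowels {
    my ($string) = @_;
    return $string =~ /^(?:[^aeiou]*[aeiou]){3}[^aeiou]*$/i ? 1 : 0;
}

# Cross-check: tr/// with an empty replacement returns how many characters matched.
sub count_vowels {
    my ($string) = @_;
    return ($string =~ tr/aeiouAEIOU//);
}

for my $string ('this string', 'another', 'this is a testing time') {
    printf "%-24s vowels=%d exactly three: %s\n",
        "'$string'", count_vowels($string),
        has_exactly_three_vowels($string) ? 'yes' : 'no';
}
```

To accept three or more vowels instead, the {3} could become {3,}.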

MJZ_Student

11:47 pm on Mar 23, 2005 (gmt 0)

10+ Year Member



Help

I have a pattern which should split a web address. The code looks like this:

my ( $proto, $host, $port, $path, $query, $frag);
$proto = '(\w*)';
$host = '([\w\.]*)';
$port = ':(\d*)';
$path = '([\w/\.]*)';
$query = '\?([\w=]*)';
$frag = '#(\w*)';

while (<STDIN>) {
    chomp;
    my $url = "${proto}://${host}${port}?${path}${query}?${frag}?";
    my @answers = m/$url/;

    print ("Protocol = $answers[0]\n");
    print ("Host = $answers[1]\n");
}

My problem is that I get nothing in my @answers array.
Is there a simple thing I'm missing, or am I way off track here?

andrew_m

12:31 am on Mar 24, 2005 (gmt 0)

10+ Year Member



Well, there are a couple of things I'd never do in my code, but the show stopper is the fact that your $url for query and frag translates into something like

...\?([\w=]*)?#(\w*)?

which does not make the ? and # optional: the trailing ? only makes the capture group before it optional, so the literal ? and # still have to be present.

Andrew.
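
A small sketch of that point. The test URL is a hypothetical one without a query string, and the two patterns are simplified (no port or fragment) to isolate the question mark; neither is the poster's full expression.

```perl
use strict;
use warnings;

# Hypothetical test URL with no query and no fragment.
my $url = 'http://www.test.com/test.cgi';

# The trailing ? applies only to the capture group, so a literal "?" is still required.
my $required = qr{^(\w+)://([\w.]+)([\w/.]*)\?([\w=]*)?$};

# Putting the delimiter and its data inside (?: ... )? makes both optional together.
my $optional = qr{^(\w+)://([\w.]+)([\w/.]*)(?:\?([\w=]*))?$};

print "required: ", ($url =~ $required ? "match" : "no match"), "\n";
print "optional: ", ($url =~ $optional ? "match" : "no match"), "\n";
```

A quantifier binds to the single atom or group immediately before it, so the delimiter has to be inside the group that the quantifier follows.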

MJZ_Student

1:16 am on Mar 24, 2005 (gmt 0)

10+ Year Member



Andrew

Thank you for your patience.

my expression looks like this:
(\w*)://([\w\.]*):(\d*)([\w/\.]*)\?([\w=]*)#(\w*)

[test.com...]

which means the \1 = \w* or http
\2 = [\w\.]* or www.test.com
\3 = \d* nothing in this case
\4 = [\w/\.]* or test.cgi

in my code $url = (\w*)://([\w\.]*):(\d*)([\w/\.]*)\?([\w=]*)#(\w*)

If I key in [test.com...] I would expect to get at least \1, \2 and \4, but I get nothing in $host.

I guess the question is: how do I apply $_ against the pattern in the $url scalar so that each group picks up the correct value?

[edited by: coopster at 3:24 am (utc) on Mar. 24, 2005]
[edit reason] Disabled graphic smile faces for this post [/edit]

andrew_m

1:24 am on Mar 24, 2005 (gmt 0)

10+ Year Member



No, there is nothing in your expression that says that the question mark is optional yet your test string does not have it -- thus the whole thing does not match and you don't get anything for host or anything else.

MJZ_Student

2:38 am on Mar 24, 2005 (gmt 0)

10+ Year Member



Andrew, I think I understand.

The ? can make it optional, so if I use the following:

(\w*)://([\w\.]*):?(\d*)([\w/\.]*)\?([\w=]*)#?(\w*)

It will work whether the data is there or not.

I have been using Regex Coach which is a good tool and I get a match with or without the port etc.

So I applied it to my program and it works, although I have yet to test it for a wide range of possibilities.

Thanks for persisting with the "nothing makes it optional" point, as the penny finally dropped.

Explanations like "ab?c = an a followed by an optional b followed by a c; that is, either abc or ac" appear in many tutorials, but the obvious is sometimes hidden in plain sight.

Mike

MJZ_Student

2:21 am on Mar 30, 2005 (gmt 0)

10+ Year Member



OK, I can separate the URL into groups. How can I test that, if a : is entered, a number follows it?

(\w*)://([\w\.]*):?(\d*)/?([\w/\.]*)

with [test.com:...]
I should get http in $1, www.test.com in $2, a null in $3 and the path in $4.

Mike
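
One common way to express "if there is a colon, digits must follow" is to put the colon and the digits in the same optional group, so they appear together or not at all. This is a sketch under assumptions: the anchors, the use of + instead of *, and the three test URLs are illustrative choices, not taken from the thread.

```perl
use strict;
use warnings;

# (?::(\d+))? means: either a colon followed by one or more digits, or nothing.
# The ^ and $ anchors make a stray colon fail the whole match.
my $url_re = qr{^(\w+)://([\w.]+)(?::(\d+))?(?:/([\w/.]*))?$};

# Hypothetical test URLs: with a port, without a port, and with a bare colon.
for my $url ('http://www.test.com:80/path',
             'http://www.test.com/path',
             'http://www.test.com:/path') {
    if (my ($proto, $host, $port, $path) = $url =~ $url_re) {
        printf "%s => port=%s path=%s\n", $url,
            defined $port ? $port : '(none)',
            defined $path ? $path : '(none)';
    } else {
        print "$url => invalid\n";
    }
}
```

When the port is absent, its capture group is undef, which can be tested with defined.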

MJZ_Student

3:08 am on Mar 30, 2005 (gmt 0)

10+ Year Member



(\w*)://([\w\.]*):?(?=\d+)(\d*)/?(?<=/)([\w/\.]*)\?([\w=]*)#?(\w*)

used on

[test.com:80...]

gives me the following groups

|http|www.test.com|80|path/more_path|who=what|fragment

The only issue I now have is that I've lost the 'optional' part for both the port (:80) and the path (/).

I want to validate the URL and fail it if it has a : which is not followed by a number, and fail it if the path is not preceded by a /.

While I can get the thing to work as optional, as soon as I put in the look-behind and look-ahead it forces these to be included rather than optional.

Any hints about how better to use the look-ahead or look-behind?
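
A lookahead or lookbehind is a zero-width assertion, and where it stands outside any optional group it must succeed for the match to continue, which is why the port and path became mandatory. A sketch of one alternative that needs no lookaround: each optional part keeps its delimiter inside its own (?: ... )? group. The helper name `split_url`, the anchors, and the test URLs are assumptions for illustration; the first URL is reconstructed from the groups listed in the post, not copied from it.

```perl
use strict;
use warnings;

# Each optional part carries its own delimiter inside a (?: ... )? group.
my $url_re = qr{
    ^ (\w+) ://            # protocol
    ([\w.]+)               # host
    (?: : (\d+) )?         # optional port: a colon must be followed by digits
    (?: / ([\w/.]*) )?     # optional path: must start with a slash
    (?: \? ([\w=]*) )?     # optional query
    (?: \# (\w*) )?        # optional fragment
    $
}x;

# Hypothetical helper: returns the six groups, or an empty list if the URL is invalid.
sub split_url {
    my ($url) = @_;
    my @parts = $url =~ $url_re;
    return @parts;
}

for my $url ('http://www.test.com:80/path/more_path?who=what#fragment',
             'http://www.test.com',
             'http://www.test.com:/path',
             'http://www.test.com:80path') {
    my @parts = split_url($url);
    if (@parts) {
        # Absent optional parts come back as undef; show them as "-".
        print "$url => ", join('|', map { defined $_ ? $_ : '-' } @parts), "\n";
    } else {
        print "$url => invalid\n";
    }
}
```

The /x flag lets the pattern be laid out with whitespace and comments, which tends to help once a regex has this many pieces.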

wruppert

4:41 am on Mar 30, 2005 (gmt 0)

10+ Year Member



The URI module and its sub-modules do a pretty good job of manipulating URLs.

[search.cpan.org...]

MJZ_Student

5:55 am on Mar 30, 2005 (gmt 0)

10+ Year Member



Thank you, I had a look at the module. Unfortunately I'm a student and am required to do this task using a regular expression.

I'm getting close but no cigar at this point. Are there any other resources you could recommend? I have the Perl Black Book, the Camel Book and some notes from various internet sites. I need to look at the perlretut documentation, which I'll do tonight.

Mike

MJZ_Student

2:49 am on Apr 5, 2005 (gmt 0)

10+ Year Member



I have to use a regular expression to parse a URL.
I have the following:

(\w*)://([\w\.]*):?([?=\d+]\d*)?/?([?=\w+][\w/\.]*)?\?([\w=]*)?#?(\w*)?

The issue I have is that if I want the port to be optional I include the ? marks. How do I indicate that if the : appears we must have a port, i.e. numbers following?

With the above and a test URL equal to:
[test.com:?p=o#frag...]
I get the following result:
|http|www.test.com|?|p|=o|frag

where ? is the port, p is the path, =o is the query and frag is the fragment.
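
A note on the syntax, with a sketch: inside square brackets, ?, = and + are ordinary characters, so [?=\d+] is a character class that accepts a literal question mark, which is likely how the ? ended up in the port group. A lookahead is written with parentheses, as (?=...). The variable names and test strings here are illustrative.

```perl
use strict;
use warnings;

# Inside [...] the characters ?, = and + are ordinary members of the class.
my $class     = qr/^[?=\d+]$/;
# A real lookahead uses parentheses and consumes no characters.
my $lookahead = qr/^:(?=\d)/;

for my $char ('?', '=', '+', '7', 'x') {
    printf "'%s' %s the class [?=\\d+]\n", $char,
        $char =~ $class ? 'is in' : 'is not in';
}

for my $text (':80', ':?p=o') {
    printf "'%s' %s\n", $text,
        $text =~ $lookahead ? 'has a colon followed by a digit' : 'fails the lookahead';
}
```

Grouping the colon with its digits, as in (?::(\d+))?, avoids the need for a lookahead at all.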