Home / Forums Index / Code, Content, and Presentation / PHP Server Side Scripting
Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

    
regex help
camilord

5+ Year Member



 
Msg#: 4655276 posted 8:24 am on Mar 19, 2014 (gmt 0)

hello...

been creating my own regex pattern.. but no success... need help from you guys..

i have these samples:

2014/931 - valid
2013/8809 - valid
2007/8694/a - valid
1999/1/x - valid
323/323232323 - not valid
2000/11/xzxz - not valid
12xx/12xx - not valid
2010/12xx12/a - not valid

my pattern is...

/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/

but it doesn't work well..

is there anything i missed that could improve my pattern?

 

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time



 
Msg#: 4655276 posted 9:46 am on Mar 19, 2014 (gmt 0)

Woo hoo, my favorite question... only I'm not sure what the intended pattern is. Dates, OK:

\b(199|20\d)\d/

... but then what? And what's the significance of {1}? You don't mean "no more than one", because several of your specimens have lots of characters. And the pattern doesn't allow for forms that have only two sets of numbers, so you'll need a
(/blahblah)?
at the end. Probably also a \b if one criterion is number of characters.

Can you explain a little better in English what the rule is? Does "x" mean the letter x, or any alphabetic, or any character at all? Do you need to exclude any possible four-number sets at the beginning?

camilord

5+ Year Member



 
Msg#: 4655276 posted 11:07 am on Mar 19, 2014 (gmt 0)

yey! a regex guru.. hehehe.. thanks in advance..

the format is..

nnnn/n
nnnn/nn
nnnn/nnn
nnnn/nnnn
nnnn/nnnnn

nnnn/n/a
nnnn/nn/a
nnnn/nnn/a
nnnn/nnnn/a
nnnn/nnnnn/a

n - numbers
a - alphabet

that's the pattern that should be followed...

Readie

WebmasterWorld Senior Member



 
Msg#: 4655276 posted 1:04 pm on Mar 19, 2014 (gmt 0)

nnnn/n
nnnn/nn
nnnn/nnn
nnnn/nnnn
nnnn/nnnnn

nnnn/n/a
nnnn/nn/a
nnnn/nnn/a
nnnn/nnnn/a
nnnn/nnnnn/a


Note: Using tildes instead of forward slashes as delimiters, because there are forward slashes in what you're matching against.

~\d{4}/\d{1,5}(?:/[a-Z])?~

Should match all the above rules you stated.

I replaced your "or" with an optional sub-expression:

(?:/[a-Z])?

Everything in those brackets is optional. The leading ?: I used in my solution above prevents that sub-expression from returning a back reference.

The reason I made this change is you were essentially doing 2 matches. 1 match would match this bit: "nnnn/nn" - and then the next match would catch "/a" - meaning that an innocent "/a" by itself would also be matched. My suggestion would just match the whole thing in one hit.

\d is an alias for [0-9] and I feel is easier to read. [a-Z] is synonymous with [a-zA-Z] and again - I feel it is easier to read.
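
To illustrate the difference between the original alternation and the optional group, here is a minimal PHP sketch. It uses [a-zA-Z] for the letter class so that it compiles on its own, and the variable names are made up for illustration.

```php
<?php
// Original pattern: "four digits / 1-5 digits" OR "a slash plus one letter".
$alternation = '/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/';

// Optional non-capturing group: the "/letter" part may only follow the numbers.
$optionalGroup = '~\d{4}/\d{1,5}(?:/[a-zA-Z])?~';

// A bare "/a" satisfies the second branch of the alternation...
var_dump(preg_match($alternation, '/a'));    // int(1)

// ...but not the optional-group version, which always needs the numeric part.
var_dump(preg_match($optionalGroup, '/a'));  // int(0)

// Both accept the intended form.
var_dump(preg_match($alternation, '2007/8694/a'));    // int(1)
var_dump(preg_match($optionalGroup, '2007/8694/a'));  // int(1)
```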

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time



 
Msg#: 4655276 posted 7:03 pm on Mar 19, 2014 (gmt 0)

[a-Z] is synonymous with [a-zA-Z]

Not exactly, unless php operates on weird rules of its own.
[A-z]
would be synonymous with [A-Z\[\\\]\^_`a-z]
(neatly encompassing most escapable RegEx characters!) Contrariwise
[a-Z]
is synonymous with nothing-- and may even create an error condition, since capital letters precede small ones in character-code order, which makes the range run backwards.
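
A quick way to check this is to ask PCRE which single ASCII characters [A-z] accepts beyond the letters. This is only a sketch; the variable names are illustrative.

```php
<?php
// Collect every ASCII character accepted by [A-z] that is not a letter.
$extras = [];
for ($code = 0; $code < 128; $code++) {
    $char = chr($code);
    $inWideRange = preg_match('/^[A-z]$/', $char) === 1;
    $isLetter    = preg_match('/^[a-zA-Z]$/', $char) === 1;
    if ($inWideRange && !$isLetter) {
        $extras[] = $char;
    }
}
echo implode(' ', $extras), "\n"; // prints: [ \ ] ^ _ `

// The reversed range does not compile at all; preg_match() returns false.
// (@ hides the warning here only to keep the output clean.)
var_dump(@preg_match('/[a-Z]/', 'a')); // bool(false)
```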

If you're covering
1234/123
but want to exclude
1234/12345678
then you'll need a \b at the end.

You never mentioned the possibility of more than 4 numerals in the first part. If it simply doesn't occur, you don't need to code for it. If you do have forms like
123456/123
then you'll need a \b at the beginning too.

\b means "word boundary", i.e. the character on one side is a \w word character (alphanumeric plus lowline) while the character on the other side is a non-word character or nothing.
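
The effect of the boundaries can be seen by running the three specimens above with and without \b. A small sketch, leaving off the optional letter part for brevity; the variable names are made up:

```php
<?php
$unbounded = '~\d{4}/\d{1,5}~';      // no word boundaries
$bounded   = '~\b\d{4}/\d{1,5}\b~';  // boundary at both ends

$results = [];
foreach (['1234/123', '1234/12345678', '123456/123'] as $subject) {
    $results[$subject] = [
        'unbounded' => preg_match($unbounded, $subject),
        'bounded'   => preg_match($bounded, $subject),
    ];
}
print_r($results);
// Without \b all three match, because a valid-looking piece sits inside
// each string. With \b only 1234/123 matches.
```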

camilord

5+ Year Member



 
Msg#: 4655276 posted 9:30 pm on Mar 19, 2014 (gmt 0)

@Readie:

$app_selwynid = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a');

foreach ($app_selwynid as $app_id) {

//if (preg_match("/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/",$app_id)) {
//preg_match("/\d{4}\/\d{1,5}(?:\/[a-Z])?/"
//preg_match("~\d{4}\/\d{1,5}(?:\/[a-Z])?~"
if (@preg_match("/~\d{4}\/\d{1,5}(?:\/[a-Z])?~/", $app_id)) {

echo $app_id.' - valid bc number'."\n\n";

} else {

echo $app_id.' - invalid bc number'."\n\n";

}
}


2014/931 - invalid bc number

2013/8809 - invalid bc number

2007/8694/a - invalid bc number

1999/1/a - invalid bc number

323/323232323 - invalid bc number

2000/11/xzxz - invalid bc number

12xx/12xx - invalid bc number

2010/12xx12/a - invalid bc number



doesnt work.. :(
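
Two things appear to be going on in the attempt above. The pattern is wrapped in slashes as well as tildes, so the slashes act as the delimiters and the tildes become literal characters to match. In addition, [a-Z] does not compile, and the @ hides the warning, so preg_match() returns false for every item. A minimal sketch for telling "no match" apart from "broken pattern" (checkPattern is a hypothetical helper name):

```php
<?php
// Distinguish a pattern that fails to compile from one that simply does not match.
function checkPattern(string $pattern, string $subject): string
{
    $result = @preg_match($pattern, $subject); // false on error, 0 or 1 otherwise
    if ($result === false) {
        return 'pattern error';
    }
    return $result === 1 ? 'match' : 'no match';
}

echo checkPattern('~\d{4}/\d{1,5}(?:/[a-Z])?~', '2014/931'), "\n";       // pattern error
echo checkPattern('/~\d{4}\/\d{1,5}(?:\/[a-zA-Z])?~/', '2014/931'), "\n"; // no match (literal tildes)
echo checkPattern('~\d{4}/\d{1,5}(?:/[a-zA-Z])?~', '2014/931'), "\n";    // match
```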

camilord

5+ Year Member



 
Msg#: 4655276 posted 9:55 pm on Mar 19, 2014 (gmt 0)

$app_selwynid = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a','1212121/121','12121212/1211/w','1234/12/');

foreach ($app_selwynid as $app_id) {

if (preg_match("/\b[0-9]{4}\/[0-9]{1,5}\b(|\/[a-zA-Z]{1}\b)?/",$app_id)) {

echo $app_id.' - valid bc number'."\n\n";

} else {

echo $app_id.' - invalid bc number'."\n\n";

}
}


2014/931 - valid bc number

2013/8809 - valid bc number

2007/8694/a - valid bc number

1999/1/a - valid bc number

323/323232323 - invalid bc number

2000/11/xzxz - valid bc number [should be invalid]

12xx/12xx - invalid bc number

2010/12xx12/a - invalid bc number

1212121/121 - invalid bc number

12121212/1211/w - invalid bc number

1234/12/ - valid bc number [should be invalid]


i think i've almost got it. hehehe..

camilord

5+ Year Member



 
Msg#: 4655276 posted 9:16 am on Mar 20, 2014 (gmt 0)

@Readie:

i tried your pattern.. doesn't work..

Warning: preg_match_all(): Compilation failed: range out of order in character class at offset 20 in E:\localhost\alpha1_v1\test.php on line 13
2014/931 - invalid bc number


Warning: preg_match(): Compilation failed: range out of order in character class at offset 20 in E:\localhost\alpha1_v1\test.php on line 13
2013/8809 - invalid bc number

Readie

WebmasterWorld Senior Member



 
Msg#: 4655276 posted 1:10 pm on Mar 20, 2014 (gmt 0)

It was my bad Camilord, posted this without testing - Lucy24 caught my mistake:

[A-z]
would be synonymous with [A-Z\[\\\]\^_`a-z]
(neatly encompassing most escapable RegEx characters!) Contrariwise
[a-Z]
is synonymous with nothing

(I was also not aware that A-z incorporated anything other than alpha characters, so thanks for sharing that bit :))

The following does not error:

$testArray = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a');
foreach($testArray as $testItem) {
echo "{$testItem}: " . ((preg_match('~\d{4}/\d{1,5}(?:/[A-z])?~', $testItem))? 'Valid' : 'Invalid') . "\n";
}

Due to the extra characters A-z pulls in, it may be better to switch back to a-zA-Z.

Output:

2014/931: Valid
2013/8809: Valid
2007/8694/a: Valid
1999/1/a: Valid
323/323232323: Invalid
2000/11/xzxz: Valid
12xx/12xx: Invalid
2010/12xx12/a: Valid

camilord

5+ Year Member



 
Msg#: 4655276 posted 10:19 am on Mar 21, 2014 (gmt 0)

@readie:

your pattern is not working as i wanted to..

2014/931: Valid
2013/8809: Valid
2007/8694/a: Valid
1999/1/a: Valid
323/323232323: Invalid
2000/11/xzxz: Valid -- this should be invalid
12xx/12xx: Invalid
2010/12xx12/a: Valid -- this should be invalid


only one letter is allowed at the end, and the /a part is optional too...

Readie

WebmasterWorld Senior Member



 
Msg#: 4655276 posted 1:12 pm on Mar 21, 2014 (gmt 0)

Ahh, simple problem there - need the start of string and end of string markers.

Currently it's just saying "there is a valid string inside this string"

~^\d{4}/\d{1,5}(?:/[A-z])?$~
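
For reference, a sketch of the anchored pattern run against the samples from this thread. It swaps [A-z] back to [a-zA-Z], as suggested earlier, so that characters such as _ are not accepted as the letter, and it adds the D modifier so that $ does not also match before a trailing newline. The function name isValidBcNumber is made up for illustration.

```php
<?php
// Whole-string check: 4 digits, "/", 1-5 digits, then optionally "/" and one letter.
function isValidBcNumber(string $id): bool
{
    return preg_match('~^\d{4}/\d{1,5}(?:/[a-zA-Z])?$~D', $id) === 1;
}

$samples = [
    '2014/931', '2013/8809', '2007/8694/a', '1999/1/x',          // expected valid
    '323/323232323', '2000/11/xzxz', '12xx/12xx', '2010/12xx12/a', // expected invalid
    '1234/12/', '1234/12/_',                                      // expected invalid
];

foreach ($samples as $sample) {
    echo $sample, ': ', isValidBcNumber($sample) ? 'Valid' : 'Invalid', "\n";
}
```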

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time



 
Msg#: 4655276 posted 9:10 pm on Mar 21, 2014 (gmt 0)

need the start of string and end of string markers

Did you ever explain what context these rules are happening in? I've been using \b on the assumption that you're picking items out of a continuous body of text-- for example, grabbing existing page references and turning them into links. If, instead, each element is already a freestanding term, then ^anchor$ is all you need.
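
The two contexts can be sketched side by side: \b with preg_match_all() for pulling references out of running text, and ^...$ for checking a freestanding value. The sample sentence and variable names are invented for illustration.

```php
<?php
$text = 'See 2014/931 and 2007/8694/a, but not 323/323232323.';

// Extraction from running text: word boundaries on both sides.
preg_match_all('~\b\d{4}/\d{1,5}(?:/[a-zA-Z])?\b~', $text, $matches);
$found = $matches[0];
print_r($found); // 2014/931 and 2007/8694/a

// Whole-string validation: the anchored pattern rejects the sentence itself,
// because the sentence as a whole is not a single reference.
$wholeStringOk = preg_match('~^\d{4}/\d{1,5}(?:/[a-zA-Z])?$~', $text);
var_dump($wholeStringOk); // int(0)
```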

camilord

5+ Year Member



 
Msg#: 4655276 posted 4:40 am on Mar 22, 2014 (gmt 0)

thank you so much Readie..

it works perfectly...

finally! working well.. weeeeeehhh..


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved