
# PHP Server Side Scripting Forum

regex help
camilord

msg:4655278
8:24 am on Mar 19, 2014 (gmt 0)

hello...

I've been trying to create my own regex pattern, but with no success. I need help from you guys.

I have these samples:

2014/931 - valid
2013/8809 - valid
2007/8694/a - valid
1999/1/x - valid
323/323232323 - not valid
2000/11/xzxz - not valid
12xx/12xx - not valid
2010/12xx12/a - not valid

my pattern is...

/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/

but it doesn't work well.

Is there anything I missed that would improve my pattern?

lucy24

msg:4655301
9:46 am on Mar 19, 2014 (gmt 0)

Woo hoo, my favorite question... only I'm not sure what the intended pattern is. Dates, OK:

\b(199|20\d)\d/

... but then what? And what's the significance of {1}? You don't mean "no more than one", because several of your specimens have lots of characters. And the pattern doesn't allow for forms that have only two sets of numbers, so you'll need a
(/blahblah)?
at the end. Probably also a \b if one criterion is number of characters.

Can you explain a little better in English what the rule is? Does "x" mean the letter x, or any alphabetic, or any character at all? Do you need to exclude any possible four-number sets at the beginning?

camilord

msg:4655331
11:07 am on Mar 19, 2014 (gmt 0)

yey! a regex guru.. hehehe.. thanks in advance..

the format is..

nnnn/n
nnnn/nn
nnnn/nnn
nnnn/nnnn
nnnn/nnnnn

nnnn/n/a
nnnn/nn/a
nnnn/nnn/a
nnnn/nnnn/a
nnnn/nnnnn/a

n - a digit
a - a letter

That's the pattern that should be followed...

Readie

msg:4655354
1:04 pm on Mar 19, 2014 (gmt 0)

 nnnn/n
 nnnn/nn
 nnnn/nnn
 nnnn/nnnn
 nnnn/nnnnn
 nnnn/n/a
 nnnn/nn/a
 nnnn/nnn/a
 nnnn/nnnn/a
 nnnn/nnnnn/a

Note: I'm using tildes instead of forward slashes as delimiters because you have forward slashes in what you're matching against.

~\d{4}/\d{1,5}(?:/[a-Z])?~

Should match all the above rules you stated.

I replaced your "or" with an optional sub-expression:

(?:/[a-Z])?

Everything in those brackets is optional. The leading ?: I used in my solution above prevents that sub-expression from returning a back reference.

The reason I made this change is you were essentially doing 2 matches. 1 match would match this bit: "nnnn/nn" - and then the next match would catch "/a" - meaning that an innocent "/a" by itself would also be matched. My suggestion would just match the whole thing in one hit.

\d is an alias for [0-9] and I feel is easier to read. [a-Z] is synonymous with [a-zA-Z] and again - I feel it is easier to read.
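The difference between the alternation and the optional group described above can be sketched roughly as follows. This is only an illustration: `matchedText()` is a hypothetical helper name, the letter class is written as `[a-zA-Z]` (my substitution, so the snippet compiles), and the type declarations assume PHP 7.1 or later.

```php
<?php
// Returns the substring a pattern matched in $subject, or null if nothing matched.
// matchedText() is a hypothetical helper used only for this illustration.
function matchedText(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

// Original pattern: the "|" splits it into two independent alternatives.
$alternation = '/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/';
// Rewritten pattern: the "/letter" part is an optional, non-capturing group.
// [a-zA-Z] is assumed here for the letter class.
$optional = '~\d{4}/\d{1,5}(?:/[a-zA-Z])?~';

var_dump(matchedText($alternation, '/a'));          // string(2) "/a": the second alternative matches alone
var_dump(matchedText($optional, '/a'));             // NULL: the digits are required first
var_dump(matchedText($alternation, '2007/8694/a')); // string(9) "2007/8694": "/a" is left out
var_dump(matchedText($optional, '2007/8694/a'));    // string(11) "2007/8694/a": matched in one go
```

Passing a third argument to `preg_match()` is what exposes the matched text in `$m[0]`; a bare true/false result hides the fact that the alternation only ever matches one half at a time.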

lucy24

msg:4655483
7:03 pm on Mar 19, 2014 (gmt 0)

 [a-Z] is synonymous with [a-zA-Z]

Not exactly, unless php operates on weird rules of its own.
[A-z]
would be synonymous with [A-Z\[\\\]\^_`a-z]
(neatly encompassing most escapable RegEx characters!) Contrariwise
[a-Z]
is synonymous with nothing-- and may even create an error condition, since capital letters precede small ones in any script that has both.
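A quick way to check both claims in PHP, as a minimal sketch. It relies only on ASCII ordering ("Z" is 90, "a" is 97) and on `preg_match()` returning `false` when a pattern fails to compile; the list of test characters is my own choice.

```php
<?php
// [A-z] spans ASCII 65..122, so it also covers the six characters between "Z" and "a".
$between = ['[', '\\', ']', '^', '_', '`'];

foreach ($between as $char) {
    var_dump(preg_match('~^[A-z]$~', $char));    // int(1) for each of the six
    var_dump(preg_match('~^[a-zA-Z]$~', $char)); // int(0) for each of the six
}

// [a-Z] runs backwards, so PCRE refuses to compile it.
// The @ only hides the warning; preg_match() still returns false.
$result = @preg_match('~[a-Z]~', 'a');
var_dump($result); // bool(false)
```

Note that `false` (compile failure) and `0` (no match) are both falsy, so an `if (preg_match(...))` test cannot tell a broken pattern from a non-matching one.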

If you're covering
1234/123
but want to exclude
1234/12345678
then you'll need a \b at the end.

You never mentioned the possibility of more than 4 numerals in the first part. If it simply doesn't occur, you don't need to code for it. If you do have forms like
123456/123
then you'll need a \b at the beginning too.

\b means "word boundary", i.e. the character on one side is a \w word character (alphanumeric plus lowline) while the character on the other side is a non-word character or nothing.
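The effect of the word boundaries when picking items out of running text can be sketched like this. The sample sentence is made up for illustration, and the optional letter suffix is left out to keep the focus on `\b`.

```php
<?php
// Hypothetical running text: one well-formed reference and two malformed ones.
$text = 'See 1234/123, but not 1234/12345678 or 123456/123.';

// \b at both ends keeps a match from starting or stopping inside a run of digits.
preg_match_all('~\b\d{4}/\d{1,5}\b~', $text, $withBoundaries);
// Without \b, pieces of the longer numbers are picked up as well.
preg_match_all('~\d{4}/\d{1,5}~', $text, $withoutBoundaries);

print_r($withBoundaries[0]);    // only "1234/123"
print_r($withoutBoundaries[0]); // "1234/123", "1234/12345", "3456/123"
```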

camilord

msg:4655549
9:30 pm on Mar 19, 2014 (gmt 0)

$app_selwynid = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a');
foreach ($app_selwynid as $app_id) {
    //if (preg_match("/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/",$app_id)) {
    //preg_match("/\d{4}\/\d{1,5}(?:\/[a-Z])?/"
    //preg_match("~\d{4}\/\d{1,5}(?:\/[a-Z])?~"
    if (@preg_match("/~\d{4}\/\d{1,5}(?:\/[a-Z])?~/", $app_id)) {
        echo $app_id.' - valid bc number'."\n\n";
    } else {
        echo $app_id.' - invalid bc number'."\n\n";
    }
}

2014/931 - invalid bc number
2013/8809 - invalid bc number
2007/8694/a - invalid bc number
1999/1/a - invalid bc number
323/323232323 - invalid bc number
2000/11/xzxz - invalid bc number
12xx/12xx - invalid bc number
2010/12xx12/a - invalid bc number

doesn't work.. :(

camilord

msg:4655557
9:55 pm on Mar 19, 2014 (gmt 0)

$app_selwynid = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a','1212121/121','12121212/1211/w','1234/12/');
foreach ($app_selwynid as $app_id) {
    if (preg_match("/\b[0-9]{4}\/[0-9]{1,5}\b(|\/[a-zA-Z]{1}\b)?/",$app_id)) {
        echo $app_id.' - valid bc number'."\n\n";
    } else {
        echo $app_id.' - invalid bc number'."\n\n";
    }
}

2014/931 - valid bc number
2013/8809 - valid bc number
2007/8694/a - valid bc number
1999/1/a - valid bc number
323/323232323 - invalid bc number
2000/11/xzxz - valid bc number [should be invalid]
12xx/12xx - invalid bc number
2010/12xx12/a - invalid bc number
1212121/121 - invalid bc number
12121212/1211/w - invalid bc number
1234/12/ - valid bc number [should be invalid]

I think I've almost got it. hehehe..

camilord

msg:4655674
9:16 am on Mar 20, 2014 (gmt 0)

@Readie: I tried your pattern.. doesn't work..

Warning: preg_match_all(): Compilation failed: range out of order in character class at offset 20 in E:\localhost\alpha1_v1\test.php on line 13
2014/931 - invalid bc number
Warning: preg_match(): Compilation failed: range out of order in character class at offset 20 in E:\localhost\alpha1_v1\test.php on line 13
2013/8809 - invalid bc number

Readie

msg:4655737
1:10 pm on Mar 20, 2014 (gmt 0)

It was my bad Camilord, posted this without testing - Lucy24 caught my mistake:

 [A-z] would be synonymous with [A-Z\[\\\]\^_`a-z] (neatly encompassing most escapable RegEx characters!) Contrariwise [a-Z] is synonymous with nothing

(I was also not aware that A-z incorporated anything other than alpha characters, so thanks for sharing that bit :))

The following does not error:

$testArray = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a');
foreach ($testArray as $testItem) {
    echo "{$testItem}: " . ((preg_match('~\d{4}/\d{1,5}(?:/[A-z])?~', $testItem)) ? 'Valid' : 'Invalid') . "\n";
}

Due to the extra characters A-z pulls in, it may be better to switch back to a-zA-Z.

Output:

2014/931: Valid
2013/8809: Valid
2007/8694/a: Valid
1999/1/a: Valid
323/323232323: Invalid
2000/11/xzxz: Valid
12xx/12xx: Invalid
2010/12xx12/a: Valid

camilord

msg:4655952
10:19 am on Mar 21, 2014 (gmt 0)

Your pattern is not working the way I wanted..

2014/931: Valid
2013/8809: Valid
2007/8694/a: Valid
1999/1/a: Valid
323/323232323: Invalid
2000/11/xzxz: Valid -- this should be invalid
12xx/12xx: Invalid
2010/12xx12/a: Valid -- this should be invalid

Only one letter is allowed at the end, and the /a part is optional too...

Readie

msg:4655977
1:12 pm on Mar 21, 2014 (gmt 0)

Ahh, simple problem there - need the start of string and end of string markers.

Currently it's just saying "there is a valid string inside this string"
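What the unanchored pattern actually latched onto can be inspected through the third argument of `preg_match()`. A small sketch, assuming `[a-zA-Z]` for the letter class and using a hypothetical `$found` array to collect the results:

```php
<?php
// Unanchored pattern; [a-zA-Z] is assumed for the letter class.
$pattern = '~\d{4}/\d{1,5}(?:/[a-zA-Z])?~';

$found = [];
foreach (['2000/11/xzxz', '2010/12xx12/a'] as $subject) {
    // $m[0] holds the part of the subject that satisfied the pattern.
    if (preg_match($pattern, $subject, $m) === 1) {
        $found[$subject] = $m[0];
        echo $subject, ' => matched "', $m[0], "\"\n";
    }
}
// 2000/11/xzxz => matched "2000/11/x"
// 2010/12xx12/a => matched "2010/12"
```

In both cases a valid-looking piece sits inside an invalid string, which is why the test reported "Valid".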

~^\d{4}/\d{1,5}(?:/[A-z])?$~

lucy24

msg:4656066
9:10 pm on Mar 21, 2014 (gmt 0)

 need the start of string and end of string markers

Did you ever explain what context these rules are happening in? I've been using \b on the assumption that you're picking items out of a continuous body of text-- for example, grabbing existing page references and turning them into links. If, instead, each element is already a freestanding term, then ^anchor$ is all you need.
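For the freestanding case, the anchored check might look something like the sketch below. A few things here are my assumptions and not part of the thread: the helper name `isValidBcNumber()` is hypothetical, the letter class is `[a-zA-Z]` (as suggested earlier, to avoid the extra characters in `[A-z]`), the `D` modifier is added so that `$` does not also accept a trailing newline, and the type declarations need PHP 7 or later.

```php
<?php
// isValidBcNumber() is a hypothetical helper name used for illustration.
function isValidBcNumber(string $id): bool
{
    // ^ and $ pin the match to the whole string: 4 digits, "/", 1 to 5 digits,
    // then optionally "/" plus exactly one ASCII letter.
    // The D modifier makes $ match only at the very end of the string.
    return preg_match('~^\d{4}/\d{1,5}(?:/[a-zA-Z])?$~D', $id) === 1;
}

$samples = [
    '2014/931', '2013/8809', '2007/8694/a', '1999/1/x',                    // expected: valid
    '323/323232323', '2000/11/xzxz', '12xx/12xx', '2010/12xx12/a', '1234/12/', // expected: not valid
];

foreach ($samples as $id) {
    echo $id, ' - ', (isValidBcNumber($id) ? 'valid' : 'not valid'), "\n";
}

var_dump(isValidBcNumber("2014/931\n")); // bool(false): the trailing newline is rejected because of D
```

Without the `D` modifier, the last call would return true, because `$` by default also matches just before a final newline. Whether that matters depends on where the input comes from.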

camilord

msg:4656160
4:40 am on Mar 22, 2014 (gmt 0)

it works perfectly...

finally! working well.. weeeeeehhh..
