

regex help

   
8:24 am on Mar 19, 2014 (gmt 0)

5+ Year Member



hello...

I've been trying to create my own regex pattern, but no success... need help from you guys.

I have these samples:

2014/931 - valid
2013/8809 - valid
2007/8694/a - valid
1999/1/x - valid
323/323232323 - not valid
2000/11/xzxz - not valid
12xx/12xx - not valid
2010/12xx12/a - not valid

my pattern is...

/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/

but it doesn't work well..

Is there anything I missed that would improve my pattern?
9:46 am on Mar 19, 2014 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



Woo hoo, my favorite question... only I'm not sure what the intended pattern is. Dates, OK:

\b(199|20\d)\d/

... but then what? And what's the significance of {1}? You don't mean "no more than one", because several of your specimens have lots of characters. And the pattern doesn't allow for forms that have only two sets of numbers, so you'll need a
(/blahblah)?
at the end. Probably also a \b if one criterion is number of characters.

Can you explain a little better in English what the rule is? Does "x" mean the letter x, or any alphabetic character, or any character at all? Do you need to exclude any possible four-number sets at the beginning?
11:07 am on Mar 19, 2014 (gmt 0)

5+ Year Member



yey! a regex guru.. hehehe.. thanks in advance..

the format is..

nnnn/n
nnnn/nn
nnnn/nnn
nnnn/nnnn
nnnn/nnnnn

nnnn/n/a
nnnn/nn/a
nnnn/nnn/a
nnnn/nnnn/a
nnnn/nnnnn/a

n - a digit
a - a letter

that's the pattern that should be followed...
1:04 pm on Mar 19, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



nnnn/n
nnnn/nn
nnnn/nnn
nnnn/nnnn
nnnn/nnnnn

nnnn/n/a
nnnn/nn/a
nnnn/nnn/a
nnnn/nnnn/a
nnnn/nnnnn/a


Note: I'm using tildes instead of forward slashes as delimiters, because the text you're matching against contains forward slashes.

~\d{4}/\d{1,5}(?:/[a-Z])?~

Should match all the above rules you stated.

I replaced your "or" with an optional sub-expression:

(?:/[a-Z])?

Everything in those brackets is optional. The leading ?: I used in my solution above prevents that sub-expression from returning a back reference.
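
To make the ?: point concrete, here is a minimal sketch comparing a capturing group with the non-capturing form. It assumes [a-zA-Z] as the letter class and uses made-up variable names; it is an illustration, not the exact code from this post.

```php
<?php
// Compare a capturing group with the non-capturing (?:...) form.
$subject = '2007/8694/a';

// Capturing: the optional "/a" part is also stored, as $withCapture[1].
preg_match('~\d{4}/\d{1,5}(/[a-zA-Z])?~', $subject, $withCapture);

// Non-capturing: only the overall match is stored, as $withoutCapture[0].
preg_match('~\d{4}/\d{1,5}(?:/[a-zA-Z])?~', $subject, $withoutCapture);

print_r($withCapture);    // two entries: the whole match and "/a"
print_r($withoutCapture); // one entry: the whole match
```

Both patterns match the same strings; the only difference is what ends up in the matches array.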

The reason I made this change is you were essentially doing 2 matches. 1 match would match this bit: "nnnn/nn" - and then the next match would catch "/a" - meaning that an innocent "/a" by itself would also be matched. My suggestion would just match the whole thing in one hit.
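
A small sketch of that "two matches" effect, using the pattern from the first post as-is. The variable names are only for illustration.

```php
<?php
// The original pattern: the "|" splits it into two independent alternatives.
$original = '/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/';

// A lone "/a" satisfies the second alternative on its own.
$loneSuffix = preg_match($original, '/a');

// "2000/11/xzxz" satisfies the first alternative through its "2000/11" prefix.
$longSuffix = preg_match($original, '2000/11/xzxz');

var_dump($loneSuffix, $longSuffix); // both are int(1)
```

Neither string should count as valid, yet both pass, because each alternative is tried on its own and nothing ties the pattern to the whole input.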

\d is an alias for [0-9] and I feel is easier to read. [a-Z] is synonymous with [a-zA-Z] and again - I feel it is easier to read.
7:03 pm on Mar 19, 2014 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



[a-Z] is synonymous with [a-zA-Z]

Not exactly, unless php operates on weird rules of its own.
[A-z]
would be synonymous with [A-Z\[\\\]\^_`a-z]
(neatly encompassing most escapable RegEx characters!) Contrariwise
[a-Z]
is synonymous with nothing-- and may even create an error condition, since capital letters precede small ones in any script that has both.
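
The difference between the three ranges can be checked with a quick sketch like the one below. The variable names are made up, and the @ is there only to keep the compile warning out of the demo output.

```php
<?php
// [A-z] spans ASCII codes 65 to 122, so it also accepts [ \ ] ^ _ and `.
$underscoreInWideRange = preg_match('/^[A-z]$/', '_');
$bracketInWideRange    = preg_match('/^[A-z]$/', '[');

// [a-zA-Z] accepts letters only.
$underscoreInLetters = preg_match('/^[a-zA-Z]$/', '_');

// [a-Z] runs backwards (code 97 down to 90), so the pattern does not compile
// and preg_match() returns false instead of 0 or 1.
$reversedRange = @preg_match('/[a-Z]/', 'a');

var_dump($underscoreInWideRange, $bracketInWideRange, $underscoreInLetters, $reversedRange);
```

Comparing the result with === false is the safest way to tell "pattern is broken" apart from "no match", since both are falsy in an if.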

If you're covering
1234/123
but want to exclude
1234/12345678
then you'll need a \b at the end.

You never mentioned the possibility of more than 4 numerals in the first part. If it simply doesn't occur, you don't need to code for it. If you do have forms like
123456/123
then you'll need a \b at the beginning too.

\b means "word boundary", i.e. the character on one side is a \w word character (alphanumeric plus lowline) while the character on the other side is a non-word character or nothing.
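
Here is a rough sketch of what the word boundary buys you at each end. The patterns leave out the optional letter part to keep the focus on the boundaries, and the variable names are hypothetical.

```php
<?php
// Trailing boundary: the match must end where a word character meets a
// non-word character or the end of the string.
$withTrailing = '~\d{4}/\d{1,5}\b~';
$shortOk = preg_match($withTrailing, '1234/123');      // digits run to the end
$longRun = preg_match($withTrailing, '1234/12345678'); // no boundary inside the digit run

// Boundary at both ends: the four digits must also start at a boundary.
$withBoth = '~\b\d{4}/\d{1,5}\b~';
$sixDigitsTrailingOnly = preg_match($withTrailing, '123456/123'); // finds "3456/123"
$sixDigitsBoth         = preg_match($withBoth, '123456/123');

var_dump($shortOk, $longRun, $sixDigitsTrailingOnly, $sixDigitsBoth);
```

Without the leading boundary the engine simply starts two characters in and matches "3456/123", which is why the second boundary matters when longer first parts can occur.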
9:30 pm on Mar 19, 2014 (gmt 0)

5+ Year Member



@Readie:

$app_selwynid = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a');

foreach ($app_selwynid as $app_id) {

//if (preg_match("/[0-9]{4}\/[0-9]{1,5}|(\/[a-zA-Z]{1})/",$app_id)) {
//preg_match("/\d{4}\/\d{1,5}(?:\/[a-Z])?/"
//preg_match("~\d{4}\/\d{1,5}(?:\/[a-Z])?~"
if (@preg_match("/~\d{4}\/\d{1,5}(?:\/[a-Z])?~/", $app_id)) {

echo $app_id.' - valid bc number'."\n\n";

} else {

echo $app_id.' - invalid bc number'."\n\n";

}
}


2014/931 - invalid bc number

2013/8809 - invalid bc number

2007/8694/a - invalid bc number

1999/1/a - invalid bc number

323/323232323 - invalid bc number

2000/11/xzxz - invalid bc number

12xx/12xx - invalid bc number

2010/12xx12/a - invalid bc number



doesnt work.. :(
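
One likely reading of the all-invalid output above: the pattern is wrapped in both / and ~, so the tildes become literal characters, and the [a-Z] range does not compile, which the @ operator hides. The sketch below separates the two effects; [a-zA-Z] is swapped in where the pattern has to compile, and the variable names are made up.

```php
<?php
// With "/" as the delimiter, the "~" characters are literal parts of the pattern.
// [a-zA-Z] replaces [a-Z] here so that the pattern compiles at all.
$mixed = '/~\d{4}\/\d{1,5}(?:\/[a-zA-Z])?~/';
$plainId   = preg_match($mixed, '2014/931');   // subject has no tildes
$wrappedId = preg_match($mixed, '~2014/931~'); // subject has the literal tildes

// The pattern as posted still contains [a-Z], which does not compile.
// preg_match() returns false and the @ operator hides the warning.
$asPosted = @preg_match('/~\d{4}\/\d{1,5}(?:\/[a-Z])?~/', '2014/931');

// One delimiter only: "~" outside, plain "/" inside.
$tildeOnly = preg_match('~\d{4}/\d{1,5}(?:/[a-zA-Z])?~', '2014/931');

var_dump($plainId, $wrappedId, $asPosted, $tildeOnly);
```

Dropping the @ while testing would have surfaced the compile warning straight away.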
9:55 pm on Mar 19, 2014 (gmt 0)

5+ Year Member



$app_selwynid = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a','1212121/121','12121212/1211/w','1234/12/');

foreach ($app_selwynid as $app_id) {

if (preg_match("/\b[0-9]{4}\/[0-9]{1,5}\b(|\/[a-zA-Z]{1}\b)?/",$app_id)) {

echo $app_id.' - valid bc number'."\n\n";

} else {

echo $app_id.' - invalid bc number'."\n\n";

}
}


2014/931 - valid bc number

2013/8809 - valid bc number

2007/8694/a - valid bc number

1999/1/a - valid bc number

323/323232323 - invalid bc number

2000/11/xzxz - valid bc number [should be invalid]

12xx/12xx - invalid bc number

2010/12xx12/a - invalid bc number

1212121/121 - invalid bc number

12121212/1211/w - invalid bc number

1234/12/ - valid bc number [should be invalid]


i think i've almost got it. hehehe..
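
To see why the two marked cases still pass, it can help to print what the pattern actually matched. This is a minimal sketch; matchedPart is a hypothetical helper name, not something from the thread.

```php
<?php
// The pattern from the post above, unchanged.
$pattern = '/\b[0-9]{4}\/[0-9]{1,5}\b(|\/[a-zA-Z]{1}\b)?/';

// Hypothetical helper: returns the matched substring, or null when nothing matches.
function matchedPart(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

$longSuffix    = matchedPart($pattern, '2000/11/xzxz');  // only the prefix is matched
$trailingSlash = matchedPart($pattern, '1234/12/');      // only the prefix is matched
$shortYear     = matchedPart($pattern, '323/323232323'); // nothing matches

var_dump($longSuffix, $trailingSlash, $shortYear);
```

The pattern is satisfied by the "2000/11" and "1234/12" prefixes, and the rest of the string is never looked at, so the check reports "valid" for a string that merely contains a valid part.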
9:16 am on Mar 20, 2014 (gmt 0)

5+ Year Member



@Readie:

i tried your pattern.. doesn't work..

Warning: preg_match_all(): Compilation failed: range out of order in character class at offset 20 in E:\localhost\alpha1_v1\test.php on line 13
2014/931 - invalid bc number


Warning: preg_match(): Compilation failed: range out of order in character class at offset 20 in E:\localhost\alpha1_v1\test.php on line 13
2013/8809 - invalid bc number
1:10 pm on Mar 20, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



It was my bad Camilord, posted this without testing - Lucy24 caught my mistake:

[A-z]
would be synonymous with [A-Z\[\\\]\^_`a-z]
(neatly encompassing most escapable RegEx characters!) Contrariwise
[a-Z]
is synonymous with nothing

(I was also not aware that A-z incorporated anything other than alpha characters, so thanks for sharing that bit :))

The following does not error:

$testArray = array('2014/931','2013/8809','2007/8694/a','1999/1/a','323/323232323','2000/11/xzxz','12xx/12xx','2010/12xx12/a');
foreach($testArray as $testItem) {
echo "{$testItem}: " . ((preg_match('~\d{4}/\d{1,5}(?:/[A-z])?~', $testItem))? 'Valid' : 'Invalid') . "\n";
}

Due to the extra characters A-z pulls in, it may be better to switch back to a-zA-Z.

Output:

2014/931: Valid
2013/8809: Valid
2007/8694/a: Valid
1999/1/a: Valid
323/323232323: Invalid
2000/11/xzxz: Valid
12xx/12xx: Invalid
2010/12xx12/a: Valid
10:19 am on Mar 21, 2014 (gmt 0)

5+ Year Member



@readie:

your pattern is not working the way I wanted..

2014/931: Valid
2013/8809: Valid
2007/8694/a: Valid
1999/1/a: Valid
323/323232323: Invalid
2000/11/xzxz: Valid -- this should be invalid
12xx/12xx: Invalid
2010/12xx12/a: Valid -- this should be invalid


only one letter is allowed at the end, and the /a part is optional too...
1:12 pm on Mar 21, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Ahh, simple problem there - need the start of string and end of string markers.

Currently it's just saying "there is a valid string inside this string"

~^\d{4}/\d{1,5}(?:/[A-z])?$~
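
A sketch of the anchored check run against every sample posted in this thread. Two things here are assumptions beyond the pattern above: [a-zA-Z] is swapped in for [A-z], because [A-z] would also accept something like 2014/931/_, and the D modifier is added so that $ does not tolerate a trailing newline. isValidId is a hypothetical helper name.

```php
<?php
// Hypothetical helper: true only when the whole string fits nnnn/n{1,5}[/a].
function isValidId(string $id): bool
{
    // ^ and $ tie the pattern to the full string; D makes $ match only at the very end.
    return preg_match('~^\d{4}/\d{1,5}(?:/[a-zA-Z])?$~D', $id) === 1;
}

$samples = [
    '2014/931', '2013/8809', '2007/8694/a', '1999/1/x',
    '323/323232323', '2000/11/xzxz', '12xx/12xx', '2010/12xx12/a',
    '1212121/121', '12121212/1211/w', '1234/12/',
];

foreach ($samples as $id) {
    echo $id, ': ', isValidId($id) ? 'Valid' : 'Invalid', "\n";
}
```

With this version the first four samples come out valid and the remaining seven invalid, which matches the list in the first post.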
9:10 pm on Mar 21, 2014 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



need the start of string and end of string markers

Did you ever explain what context these rules are happening in? I've been using \b on the assumption that you're picking items out of a continuous body of text-- for example, grabbing existing page references and turning them into links. If, instead, each element is already a freestanding term, then ^anchor$ is all you need.
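
The two contexts can be contrasted with a short sketch. The sample sentence and variable names are invented for illustration, and [a-zA-Z] is assumed for the letter part.

```php
<?php
// Context 1: picking IDs out of running text, using word boundaries.
$text = 'Refs 2014/931 and 2007/8694/a apply.';
preg_match_all('~\b\d{4}/\d{1,5}(?:/[a-zA-Z])?\b~', $text, $found);
// $found[0] holds the full matches, in order of appearance.

// Context 2: validating a freestanding value, using anchors.
$anchored = '~^\d{4}/\d{1,5}(?:/[a-zA-Z])?$~';
$anchoredOnText = preg_match($anchored, $text);      // the sentence as a whole is not an ID
$anchoredOnId   = preg_match($anchored, '2014/931'); // a bare ID passes

print_r($found[0]);
var_dump($anchoredOnText, $anchoredOnId);
```

The boundary version is for searching inside text; the anchored version is for deciding whether one whole value is acceptable, which is what the foreach loops earlier in the thread are doing.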
4:40 am on Mar 22, 2014 (gmt 0)

5+ Year Member



thank you so much Readie..

it works perfectly...

finally! working well.. weeeeeehhh..
 
