
PHP Server Side Scripting Forum

    
Regex question, suggestions please.
Matthew1980




msg:4410436
 9:32 pm on Jan 24, 2012 (gmt 0)

Hi there People,

As you all know, I'm useless at regex; I am attempting to learn it, but it still eludes me.

I have an input box in which the user is restricted to the chars they can enter, BUT I would also like to make sure that they enter the string in the correct order.

Example:-

(this is the format that I expect the user to submit)
DR/NA025/02

As you can see this could be entered in many different permutations, so I am wondering if there would be a regex way of handling this phrase, i.e. only being able to submit this string in this format.

So in layman's terms: 2 alpha chars, a single /, 2 alpha chars + 3 digits, a single /, then 2 digits

This is what I have managed up to yet: ^[DR]/[NA][\d{3}]/[\d{2}]$

I'm sure it's wrong, but I can't fathom it out...

Any thoughts/ideas/suggestions appreciated.

Cheers,
MRb

[edited by: Matthew1980 at 9:55 pm (utc) on Jan 24, 2012]

 

g1smd




msg:4410438
 9:40 pm on Jan 24, 2012 (gmt 0)

[A-Z] matches one upper case letter
[a-z] matches one lower case letter
[0-9] matches one digit

{2} matches two of the previous character grouping
{3} matches three

/ matches a literal slash

^ and $ are the anchoring.
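
A minimal sketch of those building blocks, written in Python purely for illustration (the thread shows no host-language code, so the language is an assumption). `re.fullmatch` requires the whole string to match, which stands in for the ^ and $ anchors here:

```python
import re

# Building blocks from the list above, tried one at a time.
print(bool(re.fullmatch(r"[A-Z]", "D")))       # True: one upper case letter
print(bool(re.fullmatch(r"[A-Z]", "DR")))      # False: no count, so one letter only
print(bool(re.fullmatch(r"[A-Z]{2}", "DR")))   # True: {2} repeats the group twice
print(bool(re.fullmatch(r"[0-9]{3}", "025")))  # True: three digits
print(bool(re.fullmatch(r"/", "/")))           # True: a literal slash
```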

Matthew1980




msg:4410444
 9:59 pm on Jan 24, 2012 (gmt 0)

Hi there,

So something like ^[A-Z]/[A-Z][0-9]{03}/{02}$ would be about right? I'll try that now.

Thanks for the pointer.

Cheers,
MRb

g1smd




msg:4410446
 10:06 pm on Jan 24, 2012 (gmt 0)

No.

The counts go directly after the thing you want to repeat, i.e. the character groups; without a count, each character group matches only a single character.

Matthew1980




msg:4410448
 10:11 pm on Jan 24, 2012 (gmt 0)

My apologies for seeming dumb at this, but I've never really had to use it until I was given this project.

[0-9{3}]

I've split it into smaller chunks and I've only this part left to get working. I really appreciate this help (I'll assume you just got my tweet..)

Cheers once again,
MRb

g1smd




msg:4410449
 10:35 pm on Jan 24, 2012 (gmt 0)

[0-9{3}] matches a single instance of 0 1 2 3 4 5 6 7 8 9 { or }

[...] is a character group - it lists the characters that are allowed.

{2} says how many occurrences of those characters it should match.

/[0-9]{2} matches a slash followed by two digits.
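
The difference between a count inside the brackets and a count after them can be checked directly. This is a small sketch in Python (the language is an assumption, chosen only for illustration; the variable names are hypothetical):

```python
import re

inside = re.compile(r"[0-9{3}]")  # count written inside the brackets
after = re.compile(r"[0-9]{3}")   # count written after the brackets

# Inside the brackets, { and } are just two more allowed characters.
print(bool(inside.fullmatch("7")))    # True
print(bool(inside.fullmatch("{")))    # True
print(bool(inside.fullmatch("025")))  # False: the group matches one character only

# After the brackets, {3} repeats the digit group three times.
print(bool(after.fullmatch("025")))   # True
print(bool(after.fullmatch("{")))     # False
```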

Matthew1980




msg:4410458
 10:57 pm on Jan 24, 2012 (gmt 0)

Hi there,

Thanks for your help, I ended up with this: [A-Z]/[A-Z]{2}[0-9]{3}/[0-9]{2}

Works perfectly.

Cheers,
MRb

g1smd




msg:4410461
 11:12 pm on Jan 24, 2012 (gmt 0)

If your pattern is start-anchored then it will match only a single character before the first slash. Your example showed two characters.

If there is no start anchoring it will match any number of characters before the first slash, but only if the last one before the slash is an upper case letter.

If there is no end anchoring it will also accept any number of characters after the final slash and two digits have been matched.
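
Those three cases can be reproduced with the pattern from the previous post. A sketch in Python (an assumption for illustration only; `body`, `unanchored` and `anchored` are hypothetical names):

```python
import re

body = r"[A-Z]/[A-Z]{2}[0-9]{3}/[0-9]{2}"
unanchored = re.compile(body)
anchored = re.compile("^" + body + "$")

# Anchored: only ONE letter is allowed before the first slash.
print(bool(anchored.search("DR/NA025/02")))        # False
print(bool(anchored.search("R/NA025/02")))         # True

# Unanchored: anything may come before or after, as long as the
# character just before the first slash is an upper case letter.
print(bool(unanchored.search("DR/NA025/02")))      # True
print(bool(unanchored.search("xxR/NA025/02zzz")))  # True
print(bool(unanchored.search("xx9/NA025/02")))     # False
```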

Matthew1980




msg:4410562
 6:02 am on Jan 25, 2012 (gmt 0)

Hi g1smd,

Well thanks for pointing that out, I hadn't noticed that last night.

This is what I have ended up with, works perfectly :) ^[A-Z]{2}/[A-Z]{2}[0-9]{3}/[0-9]{2}$

I guess that this can be further optimised using some \d and \w tags in there, but until I read up more on this, this will do just fine.

Cheers for all your help.

MRb

g1smd




msg:4410581
 7:45 am on Jan 25, 2012 (gmt 0)

Yah! You got there in the end, but the important point is that you probably now understand your code.

One thing about testing. When you test some code, don't just try values which you expect to work; also try a whole bunch of things that should not work and make sure that is actually the case. Try various combinations of invalid characters (even punctuation) and both too many and too few characters.
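
A minimal sketch of that kind of test, in Python for illustration only (the language, the `is_valid` helper and the sample values are all assumptions; only the pattern comes from the thread):

```python
import re

PATTERN = re.compile(r"^[A-Z]{2}/[A-Z]{2}[0-9]{3}/[0-9]{2}$")

def is_valid(code):
    """Return True when code has the form DR/NA025/02."""
    return PATTERN.search(code) is not None

should_pass = ["DR/NA025/02", "AB/CD000/99"]
should_fail = [
    "",              # empty input
    "D/NA025/02",    # too few letters
    "DRR/NA025/02",  # too many letters
    "DR/NA025/2",    # too few digits
    "DR/NA0255/02",  # too many digits
    "DR/N4025/02",   # digit where a letter belongs
    "dr/na025/02",   # lower case
    "DR-NA025-02",   # wrong separator
    "DR//NA025/02",  # doubled slash
    "DR/NA025/02!",  # trailing punctuation
    " DR/NA025/02",  # leading space
]

for value in should_pass:
    assert is_valid(value), value
for value in should_fail:
    assert not is_valid(value), value
print("all checks passed")
```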

Finally, although you used [A-Z] in the code, if the actual range of allowed letters is small you could list those instead, e.g. ^[DNST][RTX]/ etc.
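
To show what listing the letters does, here is a sketch in Python (an assumption for illustration) using only the prefix from the example above; the letter sets are placeholders from that example, not the real allowed values:

```python
import re

# First letter must be D, N, S or T; second must be R, T or X.
prefix = re.compile(r"^[DNST][RTX]/")

print(bool(prefix.search("DR/NA025/02")))  # True: D then R
print(bool(prefix.search("AR/NA025/02")))  # False: A is not in the first list
print(bool(prefix.search("DQ/NA025/02")))  # False: Q is not in the second list
```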

Matthew1980




msg:4410802
 7:40 pm on Jan 25, 2012 (gmt 0)

Hello g1smd,

Cheers! And yes you're right, my comprehension of regex (even though it's a simple expression) is now better for it! Yep, I sat there until the wee hours testing every permutation of the string that I could think of and the pattern works great. I shall just streamline it by using \d instead of [0-9].

In this particular context, built in vb.net, I have complete control over which chars/ints are allowed in the text area via the KeyPressed event.

Excellent, and again, thanks. I did tweet you with a thanks and WebmasterWorld tag ;)

Regards,
MRb

g1smd




msg:4410811
 7:54 pm on Jan 25, 2012 (gmt 0)

Cheers! IMO, [0-9] is more portable, and not worth changing to \d here.
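
One behaviour difference that may lie behind the portability remark (an assumption; the post does not spell out the reason): in some engines \d also accepts non-ASCII digits, while [0-9] never does. A sketch in Python 3, where this is the default for text patterns:

```python
import re

arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE, a Unicode decimal digit

# \d accepts it by default, [0-9] does not.
print(bool(re.fullmatch(r"\d", arabic_three)))            # True
print(bool(re.fullmatch(r"[0-9]", arabic_three)))         # False

# With the ASCII flag, \d is limited to 0-9 again.
print(bool(re.fullmatch(r"\d", arabic_three, re.ASCII)))  # False
```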

Didn't see anything on Twitter.
