Forum Moderators: coopster & jatar k

Regex question, suggestions please.

   
9:32 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Hi there People,

As you all know, I'm useless at regex; I am attempting to learn it, but it still eludes me.

I have an input box where the user is restricted in which chars they can enter, BUT I would also like to check that they enter the string in the correct order.

Example:-

(this is the format that I expect the user to submit)
DR/NA025/02

As you can see, this could be entered in many different permutations, so I am wondering whether there is a regex way of handling this, i.e. only being able to submit the string in this format.

So in layman's terms: 2 alpha chars, a single /, 2 alpha chars + 3 digits, a single /, then 2 digits

This is what I have managed up to yet: ^[DR]/[NA][\d{3}]/[\d{2}]$

I'm sure it's wrong, but I can't fathom it out...

Any thoughts/ideas/suggestions appreciated.

Cheers,
MRb

[edited by: Matthew1980 at 9:55 pm (utc) on Jan 24, 2012]

9:40 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



[A-Z] matches upper case
[a-z] matches lower case
[0-9] matches a digit

{2} matches two of the previous character grouping
{3} matches three

/ matches a literal slash

^ and $ are the anchoring.
9:59 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Hi there,

So something like ^[A-Z]/[A-Z][0-9]{03}/{02}$ would be about right? I'll try that now.

Thanks for the pointer.

Cheers,
MRb
10:06 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



No.

The counts go directly after the thing you want to repeat, i.e. the character group. Without a count, each character group matches only a single character.
10:11 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



My apologies for seeming dumb at this, but I've never really had to use it until I was given this project.

[0-9{3}]

I've split it into smaller chunks and I only have this part left to get working. I really appreciate the help (I'll assume you just got my tweet...)

Cheers once again,
MRb
10:35 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



[0-9{3}] matches a single instance of 0 1 2 3 4 5 6 7 8 9 { or }

[...] is a character group (often called a character class) - it lists the characters that are allowed.

{2} says how many occurrences of those characters it should match.

/[0-9]{2} matches a slash followed by two digits.
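
To make the difference concrete, here is a minimal sketch in Python (the thread itself is about VB.NET, so the language choice and the variable names are assumptions for illustration only). It compares a count placed inside the brackets with a count placed after them.

```python
import re

# Count INSIDE the brackets: { 3 } are just extra literal characters
# in the list, so the group still matches exactly one character.
inside = re.compile(r"^[0-9{3}]$")

# Count AFTER the brackets: the whole group is repeated three times.
outside = re.compile(r"^[0-9]{3}$")

inside_matches_brace = inside.search("{") is not None      # a lone "{" is accepted
inside_matches_025 = inside.search("025") is not None      # three digits are rejected
outside_matches_025 = outside.search("025") is not None    # three digits are accepted

# A slash followed by exactly two digits.
slash_two_digits = re.search(r"^/[0-9]{2}$", "/02") is not None

print(inside_matches_brace, inside_matches_025, outside_matches_025, slash_two_digits)
```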
10:57 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Hi there,

Thanks for your help, I ended up with this: [A-Z]/[A-Z]{2}[0-9]{3}/[0-9]{2}

Works perfectly.

Cheers,
MRb
11:12 pm on Jan 24, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



If your pattern is start-anchored then it will match only a single character before the first slash. Your example showed two characters.

If there is no start anchoring it will match any number of characters before the first slash, but only if the last one before the slash is an upper case letter.

If there is no end anchoring it will also accept any number of extra characters after the final slash and two digits.
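
The three anchoring cases above can be checked with a short sketch. This is in Python rather than VB.NET, and the helper name `accepts` and the junk-padded sample string are made up for illustration. `search` is used because it looks for the pattern anywhere in the input, which is the situation being described.

```python
import re

def accepts(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return re.search(pattern, text) is not None

unanchored = r"[A-Z]/[A-Z]{2}[0-9]{3}/[0-9]{2}"
start_anchored = r"^[A-Z]/[A-Z]{2}[0-9]{3}/[0-9]{2}"
fully_anchored = r"^[A-Z]{2}/[A-Z]{2}[0-9]{3}/[0-9]{2}$"

good = "DR/NA025/02"
padded = "xxDR/NA025/0299"  # junk before and after a valid-looking core

# No anchors: the junk on either side is ignored, so both inputs pass.
unanchored_good = accepts(unanchored, good)
unanchored_padded = accepts(unanchored, padded)

# Start anchor with a single [A-Z]: "DR/" has two letters before the
# slash, so even the intended input is rejected.
start_anchored_good = accepts(start_anchored, good)

# Both anchors and {2} on the first group: only the exact format passes.
fully_anchored_good = accepts(fully_anchored, good)
fully_anchored_padded = accepts(fully_anchored, padded)
```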
6:02 am on Jan 25, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Hi g1smd,

Well thanks for pointing that out, I hadn't noticed that last night.

This is what I have ended up with, and it works perfectly :) ^[A-Z]{2}/[A-Z]{2}[0-9]{3}/[0-9]{2}$ I guess this could be shortened further using some \d and \w shorthand in there, but until I read up more on this, it will do just fine.

Cheers for all your help.

MRb
7:45 am on Jan 25, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Yah! You got there in the end, but the important point is that you probably now understand your code.

One thing about testing. When you test some code, don't just try values which you expect to work; also try a whole bunch of things that should not work and make sure they really are rejected. Try various combinations of invalid characters (even punctuation) and both too many and too few characters.

Finally, although you used [A-Z] in the code, if the actual range of allowed letters is small you could list those instead, e.g. ^[DNST][RTX]/ etc.
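
The testing advice above could be sketched roughly like this. It is written in Python rather than VB.NET, and every sample string other than DR/NA025/02 is an invented test value, not something from the original project.

```python
import re

PATTERN = re.compile(r"^[A-Z]{2}/[A-Z]{2}[0-9]{3}/[0-9]{2}$")

# Values that should be accepted.
should_pass = ["DR/NA025/02", "AB/CD000/99"]

# Values that should be rejected, one kind of mistake per line.
should_fail = [
    "",               # empty input
    "D/NA025/02",     # too few letters in the first group
    "DRR/NA025/02",   # too many letters in the first group
    "DR/NA25/02",     # too few digits in the middle
    "DR/NA0255/02",   # too many digits in the middle
    "DR/NA025/2",     # too few digits at the end
    "DR/NA025/022",   # too many digits at the end
    "dr/na025/02",    # lower case letters
    "DR-NA025-02",    # wrong separator
    "DR//NA025/02",   # doubled slash
    "DR/NA025/02!",   # trailing punctuation
    " DR/NA025/02",   # leading space
]

wrongly_rejected = [s for s in should_pass if not PATTERN.search(s)]
wrongly_accepted = [s for s in should_fail if PATTERN.search(s)]

# Both lists should be empty if the pattern behaves as intended.
print("wrongly rejected:", wrongly_rejected)
print("wrongly accepted:", wrongly_accepted)
```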
7:40 pm on Jan 25, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Hello g1smd,

Cheers! And yes you're right, my comprehension of regex (even though it's a simple expression) is now better for it! Yep, I sat there until the wee hours testing every permutation of the string that I could think of, and the pattern works great. I shall just streamline it by using \d instead of [0-9].

In this particular context, built in vb.net, I have complete control over which chars/ints are allowed in the text area via the KeyPressed event.

Excellent, and again, thanks. I did tweet you with a thanks and WebmasterWorld tag ;)

Regards,
MRb
7:54 pm on Jan 25, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Cheers! IMO, [0-9] is more portable, and not worth changing to \d here.

Didn't see anything on Twitter.
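
On the [0-9] versus \d point, one concrete difference shows up in engines where \d is Unicode-aware. The sketch below uses Python 3, where \d on text strings matches any Unicode decimal digit unless the ASCII flag is set; other engines differ, so treat this as an illustration of why the two are not always interchangeable rather than a statement about every regex flavour. The sample string with Arabic-Indic digits is invented for the demonstration.

```python
import re

with_class = re.compile(r"^[A-Z]{2}/[A-Z]{2}[0-9]{3}/[0-9]{2}$")
with_d = re.compile(r"^[A-Z]{2}/[A-Z]{2}\d{3}/\d{2}$")
with_d_ascii = re.compile(r"^[A-Z]{2}/[A-Z]{2}\d{3}/\d{2}$", re.ASCII)

plain = "DR/NA025/02"
# Same shape, but the digits are Arabic-Indic (U+0660, U+0662, U+0665).
arabic_indic = "DR/NA\u0660\u0662\u0665/\u0660\u0662"

class_plain = with_class.search(plain) is not None
d_plain = with_d.search(plain) is not None

# [0-9] only ever accepts the ten ASCII digits.
class_arabic = with_class.search(arabic_indic) is not None
# \d in Python 3 accepts other decimal digits too...
d_arabic = with_d.search(arabic_indic) is not None
# ...unless it is restricted to ASCII with a flag.
d_ascii_arabic = with_d_ascii.search(arabic_indic) is not None
```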