

JavaScript and AJAX Forum

    
regex question
skoff




msg:4456719
 12:16 am on May 23, 2012 (gmt 0)

I have this regex validation, but I need it to accept accents like etc...

What I have so far is this:
/^[A-Z][a-zA-Z '&-]*[A-Za-z]$/

I need to allow accents anywhere in the string. I found this regex rather than writing it myself, so I don't really know how to add them.

thanks!

 

lucy24




msg:4456832
 6:39 am on May 23, 2012 (gmt 0)

As far as Regular Expressions are concerned, an accent (do you mean ' or or or something else?) is just another character. When you say "anywhere in the string" do you mean anywhere including first and last, or only in the middle? Your example looks perfectly reasonable-- but note that if you're doing this in JavaScript you may need to escape the space in that middle group.
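A minimal sketch of what the posted pattern accepts as it stands, for anyone following along. The sample strings are hypothetical and chosen only for illustration. In JavaScript the unescaped space inside the character class matches a literal space as written.

```javascript
// The pattern from the original post, unchanged.
const namePattern = /^[A-Z][a-zA-Z '&-]*[A-Za-z]$/;

// Hypothetical sample inputs, chosen only for illustration.
// "Jos\u00E9" is "Jose" with an accented final e (U+00E9).
const samples = ["Mary Ann", "O'Brien", "Jos\u00E9", "mary", "A"];

for (const s of samples) {
  // Print each input next to the result of the match.
  console.log(JSON.stringify(s), namePattern.test(s));
}
```

The first two samples match. The accented one fails because U+00E9 is in none of the three classes, the lowercase one fails on the first class, and a single letter fails because the pattern needs at least two characters.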

Now, what does make me uneasy is that your example word appears to be in Hebrew, and I would really really like to know how you prevented it from turning into numerical entities the way everyone else's non-Latin-1 text does.

I would also really like to know why this window has seen fit to use serif type instead of the usual sans-serif-- AND smart quotes, which my browser doesn't even have-- and can't help wondering if they are all related.

SteveWh




msg:4456862
 8:54 am on May 23, 2012 (gmt 0)

If it's the accented characters (accented versions of e, for example) that you want to add to the regex, if your page is UTF-8, you can put the chars directly into your regex character classes. You just need a way to generate them with your keyboard, or you can copy and paste into the code.
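A rough sketch of the "put the chars directly into the class" approach, assuming the script file is saved and served as UTF-8 and that only the Latin-1 accented letters are wanted. The choice of ranges is an assumption, not something from the original regex: they skip the multiplication sign (U+00D7) and division sign (U+00F7), which sit in the middle of the accented letters.

```javascript
// Assumption: this file is saved as UTF-8, so the literal accented
// characters below reach the regex engine intact.
// First class: A-Z plus uppercase accented letters (U+00C0-U+00D6, U+00D8-U+00DE).
// Middle and last classes: the same plus lowercase accented letters,
// skipping U+00D7 (multiplication sign) and U+00F7 (division sign).
const namePattern = /^[A-ZÀ-ÖØ-Þ][a-zA-ZÀ-ÖØ-öø-ÿ '&-]*[A-Za-zÀ-ÖØ-öø-ÿ]$/;

// Hypothetical sample inputs, written with escapes so they survive any encoding.
const samples = ["Jos\u00E9", "\u00C9lodie", "Zo\u00EB", "Jos3", "2\u00D72"];

for (const s of samples) {
  console.log(JSON.stringify(s), namePattern.test(s));
}
```

This only covers precomposed characters. A name typed as a plain letter followed by a combining accent mark would still be rejected.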

A more universal way to include them is with the \x{0000} notation in the regex (the way to specify a Unicode code point). Replace 0000 with the 4-digit Unicode code point of the character.

With a quick search, it looks like maybe \u0000 notation is equivalent to \x{0000}, but I've never used that notation.

This table probably has all the code points you need:
[en.wikipedia.org...]

I suspect you can use these notations to define ranges, too, such as:
\x{00E8}-\x{00EB}

lucy24




msg:4456886
 10:44 am on May 23, 2012 (gmt 0)

A more universal way to include them is with the \x{0000} notation in the regex (the way to specify a Unicode code point). Replace 0000 with the 4-digit Unicode code point of the character.

With a quick search, it looks like maybe \u0000 notation is equivalent to \x{0000}, but I've never used that notation.

The exact format is flavor-specific.

:: shuffling papers ::

[regular-expressions.info...] and scroll way, way down to "Unicode Characters". And there are more variations. But forms like \x{05D9} are pretty clunky if you're going to run up a string of them.

I'm still musing over the OP, which isn't an accent at all is it? It's a, uhm, yudh. Or possibly a glottal stop.

You just need a way to generate them with your keyboard

Somehow I don't think this will be a problem.

Incidentally, many RegEx dialects will also let you flag scripts by name, for example \p{Latin} or \p{Canadian_Aboriginal} or \p{Hebrew}, if that's what you need. Exact syntax and punctuation is again flavor-specific.
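As a hedged illustration of the script-name idea in JavaScript specifically: engines that implement ES2018 Unicode property escapes accept the longer form \p{Script=Latin} together with the u flag, not the bare \p{Latin}. This assumes a modern engine and would not have run in browsers from 2012. The variable names and sample strings are made up for the example.

```javascript
// Assumption: an ES2018+ engine. The u flag is required for \p{...}.
const latinWord = /^\p{Script=Latin}+$/u;
const hebrewWord = /^\p{Script=Hebrew}+$/u;

// A variant of the name pattern: uppercase letter first (\p{Lu}),
// then Latin-script letters, space, apostrophe, ampersand or hyphen,
// ending in a Latin-script letter.
const namePattern = /^\p{Lu}[\p{Script=Latin} '&-]*\p{Script=Latin}$/u;

// Hypothetical samples: an accented Latin name and three Hebrew letters.
const latinSample = "Jos\u00E9";
const hebrewSample = "\u05D9\u05D5\u05D3";

console.log(latinWord.test(latinSample), latinWord.test(hebrewSample));
console.log(hebrewWord.test(hebrewSample), hebrewWord.test(latinSample));
console.log(namePattern.test("\u00C9lodie"), namePattern.test("jose"));
```

Note that \p{Lu} accepts an uppercase letter from any script, so this variant is looser on the first character than the [A-Z] in the posted pattern.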

:: detour to check continuing puzzler ::

Good grief. This page has seen fit to use Windows-Hebrew character encoding. How on earth did it arrive at that? That is, it's obviously correct, but how did the browser guess?

SteveWh




msg:4457270
 8:24 am on May 24, 2012 (gmt 0)

On that comparison table, ECMA is JavaScript, so the \x{0000} notation is apparently not going to work, but \u0000 should.
