According to the "reference.pcre.pattern.syntax" page:
-----------------------------
Unicode character properties
Since PHP 4.4.0 and 5.1.0, three additional escape sequences to match generic character types are available when UTF-8 mode is selected. They are:
\p{xx}
a character with the xx property
-----------------------------
... (provided you use the "u" modifier).
I tried this rather minimal regexp:
preg_match_all("/(\p{LI}+)/u",$line,$words);
... which should match runs of lowercase letters, but I got:
Warning: preg_match_all() [function.preg-match-all]: Compilation failed: unknown property name after \P or \p at offset 6 in
/var/www/test/engine/engineGetWords.php on line 74
The '{' does not seem to be supported in PHP 5.2.3. That is unexpected, as this regexp works fine, without any warning:
preg_match_all("/(\pL+)/u",$line,$words);
Any explanation welcome.
Regards,
Marino
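A likely cause, offered as a sketch rather than a confirmed diagnosis: the property name in the failing pattern reads as LI (capital L, capital I), while the PCRE name for "lowercase letter" is Ll (capital L, lowercase l). The offset in the warning falls at the end of the property name, which is consistent with the name being rejected rather than the brace. The sample string below is made up for illustration; it is not the $line from the original script.

```php
<?php
// Hypothetical sample input; the accented word is written as raw UTF-8 bytes
// so the example does not depend on the encoding of this source file.
$line = "Hello wORLD \xC3\xA9t\xC3\xA9";

// \p{Ll}: capital L followed by lowercase l = "Letter, lowercase".
preg_match_all('/(\p{Ll}+)/u', $line, $words);
print_r($words[1]);

// \p{LI}: capital L followed by capital I is not a known property name,
// so the pattern fails to compile. The @ only hides the warning here.
$result = @preg_match_all('/(\p{LI}+)/u', $line, $unused);
var_dump($result);
```

If this is indeed the problem, the braces are fine and only the second letter of the property name needs changing.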
For example, say we want to search for the Japanese-standard circled numbers 1-9 (Unicode code points 0x2460-0x2468). To match them by their hex codes, the following call should be used:
preg_match('/[\x{2460}-\x{2468}]/u', $str);
To be honest, I'm not sure if this is the only way it can be used, but that's the only way I've used it.
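As a small usage sketch of the call above: the sample string is invented for illustration, and the circled digits are written as raw UTF-8 bytes so the example does not depend on the file's encoding.

```php
<?php
// Hypothetical sample: U+2460 (circled 1) and U+2462 (circled 3) as UTF-8 bytes.
$str = "Step \xE2\x91\xA0 then step \xE2\x91\xA2";

// The "u" modifier is needed so that \x{...} can name code points above 0xFF.
$count = preg_match_all('/[\x{2460}-\x{2468}]/u', $str, $m);

echo $count, " circled number(s) found\n";
foreach ($m[0] as $char) {
    // bin2hex shows the UTF-8 bytes of each matched character.
    echo bin2hex($char), "\n";
}
```

Using preg_match_all instead of preg_match is just a choice for this sketch, so that every circled number in the string is collected rather than only the first.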