Forum Moderators: coopster

Regex not matching spaces.

I'm not so good at regexes

         

dotancohen

11:33 pm on Nov 16, 2006 (gmt 0)

10+ Year Member



I'm parsing a body of text for tags that use square brackets to identify themselves, like BBCode. These tags may contain alphanumeric characters, spaces, apostrophes, underscores, dashes, basic punctuation, and pipes. The closest that I've been able to come up with is:

$text=preg_replace_callback('/\[([A-Za-z0-9\¦\'.-:underscore:]+)\]/i' , "findLinks", $text);

However, this does not match spaces for some odd reason (the "." should match them, I think). I've added "\s", "\w", " ", and ":space:" to the regex (tried both before the A-Z and after the 0-9) but for whatever reason those spaces are not detected. Why? What must I do?

I'm certain that what I think is a space really is a space: str_replace(" ", "", $text); removes the spaces, so I know that PHP does in fact see them as spaces.

I'm on PHP 5.x, if it's relevant. Thanks in advance for any assistance.

Dotan Cohen

IamStang

4:09 am on Nov 19, 2006 (gmt 0)

10+ Year Member



I don't think that dot is gonna do it for ya. Have you tried just placing a plain ole space in your regex? I usually place mine after the 9, but it doesn't really matter.

$text=preg_replace_callback('/\[([A-Za-z0-9 \¦\'.-:underscore:]+)\]/i' , "findLinks", $text);

(note space after 9)

Hope it helps.
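
For anyone following along, here is a minimal sketch of the point about the dot. Inside a character class, "." is a literal dot rather than "any character", so a space has to be listed in the class explicitly. The patterns are trimmed-down versions of the one in the question, and the sample string is made up for illustration.

```php
<?php
// Inside a character class, "." is a literal dot, not "any character".
$withDot   = '/\[([A-Za-z0-9.]+)\]/';
// Adding a literal space to the class lets the tag contain spaces.
$withSpace = '/\[([A-Za-z0-9 .]+)\]/';

// Hypothetical input, not taken from the original post.
$sample = 'See [My Page] for details.';

var_dump(preg_match($withDot, $sample));        // int(0): the space is not in the class
var_dump(preg_match($withSpace, $sample, $m));  // int(1)
echo $m[1], "\n";                               // My Page
```

If a plain space in the class still does not match, the problem is likely elsewhere in the pattern, which is where the rest of this thread ends up.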

dotancohen

5:25 pm on Nov 19, 2006 (gmt 0)

10+ Year Member



Yes, I have tried. I tried " ", "\s", and ".", but none of them would match a space until I replaced the pipe character with a solid pipe. I had to copy and paste that pipe from a website; I have no idea how to type it.

whoisgregg

6:07 pm on Nov 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The pipe key is below the "delete" key. Or, to put it another way, Shift + \

coopster

6:10 pm on Nov 20, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Your hyphen may be your issue ...


The minus (hyphen) character can be used to specify a range of characters in a character class. For example, [d-m] matches any letter between d and m, inclusive. If a minus character is required in a class, it must be escaped with a backslash or appear in a position where it cannot be interpreted as indicating a range, typically as the first or last character in the class.

Pattern Syntax [php.net]
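
To make the quoted rule concrete, here is a small sketch. In the pattern from the question, the sequence ".-:" sits inside the class, so the hyphen there is read as a range from "." (0x2E) to ":" (0x3A) and is not itself matched. The two test patterns below are illustrative only, not the poster's full regex.

```php
<?php
// ".-:" is a range from "." (0x2E) to ":" (0x3A).
// It covers . / 0-9 and :, but not the hyphen itself (0x2D).
$asRange   = '/^[.-:]+$/';
// With the hyphen placed last, it is a literal hyphen.
$asLiteral = '/^[.:-]+$/';

var_dump(preg_match($asRange, '-'));    // int(0): hyphen is outside the range
var_dump(preg_match($asRange, '/'));    // int(1): slash falls inside the range
var_dump(preg_match($asLiteral, '-'));  // int(1)
var_dump(preg_match($asLiteral, '/'));  // int(0)
```

Escaping the hyphen with a backslash works the same way as moving it to the end of the class.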

dotancohen

9:46 pm on Nov 20, 2006 (gmt 0)

10+ Year Member



Thanks, but it's not a standard pipe (that, I know how to type!). Apparently, there are broken pipes and solid pipes:
¦ and |

eelixduppy

9:53 pm on Nov 20, 2006 (gmt 0)



The only type you want to use is the unbroken pipe character which is below the "backspace" key. :) Now if I understand correctly, you were using the broken one at first and it wasn't working, and then you changed it to the solid pipe and now it is working? WebmasterWorld breaks pipe characters into the broken representation so I think everyone's confusing everything. ;)

If the solid pipe character does what you want, then I guess everything is good and dandy :)

And yes I know I use too many smilies :) ...hehe

ramoneguru

7:37 pm on Nov 22, 2006 (gmt 0)

10+ Year Member



Wait, so it was fixed? Does this "pipe" character have a Unicode value we could take a look at, for future reference?

Note, I too have noticed that eelixduppy uses a lot of smilies... it brightens up my day, though :-) Keep 'em coming.
--Nick
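
On the Unicode question, a hedged sketch: the solid pipe is U+007C (VERTICAL LINE) and the broken one is U+00A6 (BROKEN BAR). The code below assumes the script is saved as UTF-8, which the thread does not confirm. In a Latin-1 file the broken bar would be the single byte 0xA6 instead. The bytes are written as escapes so the example does not depend on how this page renders pipe characters.

```php
<?php
// Assumption: UTF-8 source encoding (not stated in the thread).
$solid  = "\x7C";      // U+007C VERTICAL LINE
$broken = "\xC2\xA6";  // U+00A6 BROKEN BAR, encoded as UTF-8

echo bin2hex($solid), "\n";   // 7c
echo bin2hex($broken), "\n";  // c2a6

// Without the /u modifier PCRE works on bytes, so a UTF-8 broken bar
// inside a class contributes two separate bytes, 0xC2 and 0xA6.
$bytewise = '/^[' . $broken . ']$/';
var_dump(preg_match($bytewise, "\xC2"));  // int(1): a lone 0xC2 byte matches

// With /u the class contains the single character U+00A6.
$unicode = '/^[' . $broken . ']$/u';
var_dump(preg_match($unicode, $broken));  // int(1)
var_dump(preg_match($unicode, $solid));   // int(0): the solid pipe is a different character
```

Running bin2hex() on the character copied into the script is a quick way to see which of the two is actually there.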

dotancohen

8:43 pm on Nov 22, 2006 (gmt 0)

10+ Year Member



No, it's the BROKEN pipe that works! I had to copy-paste it from $somethingThatIGoogled into the script. The regular pipe I know how to type; it's kind of difficult to grep without it!