Forum Moderators: coopster

Message Too Old, No Replies

RegExp Problem - Matching a "+" sign

Not working even when escaped

         

trillianjedi

5:34 pm on Apr 18, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Can anyone help me with this problem? I'm doing this in PHP, but it even won't work at the command line with MySQL - I'm getting a "ERROR 1139 (42000): Got error 'repetition-operator operand invalid' from regexp" error.

I'm trying to match a string which contains a "+" sign, but it won't have it, even if I escape the plus sign:-

mysql> select ("+12312345" REGEXP "\+12");

ERROR 1139 (42000): Got error 'repetition-operator operand invalid' from regexp

I basically need to match anything which contains "+12".

Thanks!

TJ

eelixduppy

5:35 pm on Apr 18, 2007 (gmt 0)



I have to run to my next class, but try something like this ;)

select ("+12312345" REGEXP "\\\+12");

I hope this is correct; I looked at it for two seconds hehe

trillianjedi

5:39 pm on Apr 18, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hey, that was quick ;)

Thanks. That works. But I'm confused... maybe sometime you can explain to me why!

Enjoy class!

TJ

pinterface

6:53 pm on Apr 18, 2007 (gmt 0)

10+ Year Member



The reason it still doesn't work when escaped is because this isn't actually escaping anything:
mysql> select ("+12312345" REGEXP "\+12");

    What's the regular-expression escape character? <
    \
    >
    What's the string escape character? <
    \
    >
    What happens when you escape a character unnecessarily? Depends on the language:
      PHP:
      "\+"
      => <
      \+
      >
      MySQL:
      "\+"
      => <
      +
      >
      Mythical Ideal:
      "\+"
      => Parse Error

    Let's walk briefly through what happens:
      bad:
      select ("+12312345" REGEXP "\+12");

      eelix:
      select ("+12312345" REGEXP "\\\+12");

      good:
      select ("+12312345" REGEXP "\\+12");
    First, the string itself is read, and any applicable string-escapes occur:
      bad:
      select (<+12312345> REGEXP <+12>);

      eelix:
      select (<+12312345> REGEXP <\+12>);

      good:
      select (<+12312345> REGEXP <\+12>);
    Then it parses the regular expression:
      bad: ERROR!
      eelix:
      select (<+12312345> REGEXP /\+12/);

      good:
      select (<+12312345> REGEXP /\+12/);

This knowledge becomes even more important when you're looking at running the query through, say, PHP and thus need yet another level of string-escapes:

  1. mysql_query('select ("+12312345" REGEXP "\\\\+12");');

  2. mysql_query(<select ("+12312345" REGEXP "\\+12");>);

  3. mysql_query(<select (<+12312345> REGEXP <\+12>);>);

  4. mysql_query(<select (<+12312345> REGEXP /\+12/);>);

Another (probably better and less confusing) way to look at it is to start at the regular expression and work your way back to the string, adding escapes each time:

  1. Regexp (escape +): /\+12/
  2. MySQL String (escape \): "\\+12"
  3. PHP String (escape \): '"\\\\+12"'

[edited by: eelixduppy at 6:55 pm (utc) on April 18, 2007]
[edit reason] disabled smile faces [/edit]

eelixduppy

6:54 pm on Apr 18, 2007 (gmt 0)



hehe To be honest, in my haste I have written an answer that I did not want to write, but apparently it works....

I'll give you an explaination as to why your original doesn't work, and then you can experiment with two or three slashes ;)

The reason why the original wasn't working is because the mysql parser interprets one of the slashes, and then the regex has nothing to interpret. So you see, mysql parses it before the regex, so the slash is interpretted by the wrong thing. To get around this I know of two ways. The solution that I posted above , which I originally wanted to post this:


select ("+12312345" REGEXP "\\+12");

should work, but also something like this should work:


select ("+12312345" REGEXP "[+]12");

And this is because the character class doesn't need the plus-sign to be escaped.

I've confused myself now about using two or three slashes but I believe it is two. If the three-slash solution in my first response works for you, then...well....it works ;)

eelixduppy

6:59 pm on Apr 18, 2007 (gmt 0)



Looks like I was beaten to it ;)

Nice post, pinterface, thanks :)

I believe it is briefly mentioned in the documentation [dev.mysql.com], too.

coopster

6:25 am on Apr 19, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Another old post that has some relevancy here ...
[webmasterworld.com...]

And the MySQL manual page linked from that old thread:
[dev.mysql.com...]

trillianjedi

11:02 am on Apr 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



pinterface, what a blinding post. Many thanks.

Thanks eelix and Coopster too - always good to learn a new trick. I don't need to make that mistake again now... :)