Msg#: 4225778 posted 9:41 am on Nov 3, 2010 (gmt 0)
Here is an example with a regular expression to extract portions of the string containing quotes. $search = '"one two" three'; $pattern = '/(?>")[^"]*+"/'; preg_match_all($pattern, $search, $match_array);
You could also use explode('"',$search); and that will return an array splitting the string among the double quotes and then use a counter to join the segments that appear within the string.
In general you should first filter the input submitted by the user with a pattern that contains characters you expect. That's before breaking down string into segments.
Having all the segments in an array you could feed them into your queries but the search is not too efficient using simply the %segment% patterns.
You see if you search for "one" it will also match words like everyone or someone etc. You need to setup the queries to match expected characters. So if it's to expect the word one to take place somewhere in the mid of a string and you want to search for the exact work you could add a prefix like: WHERE `myField` LIKE '% one%' That will try to match something that starts with "one" and is a bit more efficient. Of course it will not work if the sentence begins with one or if there is a different separator like two-one or (one) etc. So the original filtering method is important not only to extract the segments but to replace unwanted characters with separators before feeding it into the queries.
Finally the search process should be repeated if there is a spelling mistake by reducing the number of characters in the segments before trying again up to a number of times. Right to Left character reduction could be applied till you get a match. Limits of characters in each word can improve the search. For instance if someone searches for one or a three you could eliminate any word that is less than 3 characters.
Because the queries at the end can be long enough with the various and operators for each segment you could get the db results and then filter them for the excluded words. In other words, the exclusion operator could be processed in php. Of course depends how many results you will present if it's to use pagination for example from the returned rows, you need the NOT inside the query.
Msg#: 4225778 posted 3:08 pm on Nov 3, 2010 (gmt 0)
This one is probably "close enough"
Q1) one two three Q1) should give a simple query SELECT * FROM `myTable` WHERE `myField` LIKE '%one%' AND `myField` LIKE '%two%' AND `myField` LIKE '%three%';
But it doesn't account for whole words, it will, for example, match on "anyone" and "someone". More below. These are mostly incorrect:
Q2) "one two" three Q2) should give SELECT * FROM `myTable` WHERE `myField` LIKE '%one two%' AND `myField` LIKE '%three%'
Though the liklihood is slim, it would match on "someone two". The % is a wildcard character, when first it means "anything before" and last, "anything after." What you'd need here is some form of mySQL regexp [dev.mysql.com]. Something like
select * from `myTable` where `myField` regexp '[[:<:]]one two[[:>:]]' and `myField` like '%three%'
select * from `myTable` where `myField` regexp '[[:<:]]one two[[:>:]]' and `myField` regexp '[[:<:]]three[[:>:]]'
[[:<:]] and [[:>:]] being word boundaries. You can test this on the command line:
select 'zero one two three four' regexp '[[:<:]]one two[[:>:]]'; --> 1
select 'zero one OOPS two three four' regexp '[[:<:]]one two[[:>:]]'; -- 0
Same with this one.
Q3) one two -three Q3) should give SELECT * FROM `myTable` WHERE `myField` LIKE '%one two%' AND `myField` NOT LIKE '%three%'
select * from `myTable` where `myField` regexp '[[:<:]]one two[[:>:]]' and `myField` not regexp '[[:<:]]three[[:>:]]'
There's no sense in extracting too many results then slogging through them in your programming, this makes for excessive use of resources and slow queries. You can do the regexps directly in the selects, see link above for more options. Though they are often more difficult to compose, regexps are more efficient than 'like' anyway.