Forum Moderators: coopster


Searching through a variable and extracting values with like terms?

         

erikcw

5:06 am on Dec 12, 2004 (gmt 0)

10+ Year Member



Hi All,

Can't seem to figure out the best way to approach this task. I have a variable ($list) which contains a series of phrases which are submitted via a textarea.

$_POST['list']="exercise bike
exercise bicycle
workout bike
workout bicycle
stairmaster
step machine
etc...";

I want to separate this big undifferentiated list into separate variables. I search $list for "bicycle" - every phrase with "bicycle" in it is removed from $list and placed into another variable (or a key in an array).

What is the best way to tackle this? Should I read $list into an array (one phrase per key), and then use some PHP function to do the job? Or should I run a regex function such as preg_replace() on $list? Which regular expression and/or function should I use?

Thanks!
Erik

coopster

1:04 pm on Dec 13, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Does explode() [php.net] get you started in the right direction?
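
For reference, a minimal sketch of the explode() route, assuming one phrase per line. The helper name partitionPhrases() is made up for illustration, and the line endings are normalized first on the assumption that the textarea content may arrive with \r\n line breaks.

```php
<?php
// Hypothetical helper: split $list into phrases that contain $term
// and phrases that do not.
function partitionPhrases(string $list, string $term): array
{
    // Normalize \r\n and \r to \n so explode() only sees one delimiter.
    $list = str_replace(["\r\n", "\r"], "\n", $list);

    $matched = [];
    $rest = [];
    foreach (explode("\n", $list) as $phrase) {
        $phrase = trim($phrase);
        if ($phrase === '') {
            continue; // skip blank lines
        }
        // stripos() is a case-insensitive substring search.
        if (stripos($phrase, $term) !== false) {
            $matched[] = $phrase;
        } else {
            $rest[] = $phrase;
        }
    }
    return [$matched, $rest];
}

$list = "exercise bike\r\nexercise bicycle\r\nworkout bike\r\n"
      . "workout bicycle\r\nstairmaster\r\nstep machine";
[$bicycles, $remaining] = partitionPhrases($list, 'bicycle');
print_r($bicycles);  // the two "bicycle" phrases
print_r($remaining); // the other four phrases
```

This keeps the matching phrases and the leftovers in two separate arrays, which is roughly the "removed from $list and placed into another variable" behaviour described in the first post.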

erikcw

6:57 am on Dec 14, 2004 (gmt 0)

10+ Year Member



Thanks for the response, coopster! explode() is not really the function I am after, though (although I may use it in part of the script...).

What I need to do is perform some sort of search on $list, and then take each matching phrase (or line: \n blah blah \n) and put it into its own variable/array key.

Example:

Search $list: bicycle
Extract:
exercise bicycle
workout bicycle

Those two phrases could either be in the same variable, or in separate ones... (I can always put them together later). I am thinking this may be a job for regex...

I guess my question is what function do I use to search my $list variable, and move those results into a new variable?

Any ideas?

erikcw

1:53 am on Dec 15, 2004 (gmt 0)

10+ Year Member



OK, I think I am going to use preg_match_all() on $list to pull out the lines which contain the search term. My question has now become: how do I write a regex pattern which will match everything from \n ... keyword ... \n? (The phrase could be one word or 10 words, but I need to match any phrase which contains the search term.)

Thanks for your help!
Erik

coopster

3:22 am on Dec 15, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Oh, now I think I understand. Is this enough to get you going?
$searchterm = 'bicycle'; 
$pattern = "/([^\n]*$searchterm.*)\n/Ui";
preg_match_all($pattern, $_POST['list'], $matches);
print_r($matches[1]);
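The pattern says to find any line that contains our search term: [^\n]* takes whatever precedes the term on the line, and .* takes the rest of the line up to the closing newline. The first item in the list is caught as well, since nothing has to precede it, but every item must be followed by a newline to match. The U means ungreedy and the i makes it case-insensitive.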
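
A variation on the pattern above, offered as a sketch rather than a drop-in replacement. It assumes two things the original does not handle: the search term may contain regex metacharacters (hence preg_quote()), and the last phrase may not be followed by a newline (hence the m modifier with ^ and $ instead of a literal \n). The function name findLines() is hypothetical.

```php
<?php
// Hypothetical helper: return every line of $list that contains $term.
function findLines(string $list, string $term): array
{
    // Normalize line endings so "." and "$" only have to deal with \n.
    $list = str_replace(["\r\n", "\r"], "\n", $list);

    // preg_quote() escapes regex metacharacters in the term;
    // the second argument is the pattern delimiter.
    $quoted = preg_quote($term, '/');

    // m: ^ and $ match at the start and end of each line.
    // i: case-insensitive.
    $pattern = '/^.*' . $quoted . '.*$/mi';

    preg_match_all($pattern, $list, $matches);
    return $matches[0]; // the full matched lines
}

$list = "exercise bike\r\nexercise bicycle\r\nworkout bike\r\nworkout bicycle";
print_r(findLines($list, 'bicycle'));
```

Because the pattern is anchored per line, the last phrase is found even when the submitted text does not end with a line break.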

erikcw

9:00 pm on Dec 17, 2004 (gmt 0)

10+ Year Member



Thanks coopster, that regex helped a ton! I just have one last question: what would I have to change in the regex to choose between matching the search term as a whole word only and matching it as part of a word?

For example, if I want to separate level from levels...
search: level
return: levels AND level

search: level
return: level IGNORE levels

Thanks!
Erik

coopster

2:06 pm on Dec 19, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



You could add a word boundary [php.net] if you only want to match on a whole word.

erikcw

9:19 pm on Dec 19, 2004 (gmt 0)

10+ Year Member



I added "b" onto the pattern, but I now get an error:

$searchterm = 'level';
$pattern = "/([^\n]*$searchterm.*)\n/Uib";
preg_match_all($pattern, $_POST['list'], $matches);
print_r($matches[1]);


Warning: Unknown modifier 'b' in PHPDocument1 on line 15

Sorry to be so "dense" with this thing.
Thanks!
Erik

coopster

1:45 pm on Dec 20, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Quite all right. A word boundary (\b) is not a modifier, so it doesn't go at the end of the pattern; it's part of the pattern itself.
$pattern = "/([^\n]*\b$searchterm\b.*)\n/Ui";
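
To make the difference concrete, here is a small sketch that runs both patterns over the same made-up list. The sample phrases are invented for illustration. The backslash in \b is doubled here only to be explicit; PHP leaves \b alone inside double quotes, so the single-backslash form above behaves the same way.

```php
<?php
// Sample data (assumed); note the trailing newline, which both patterns need.
$list = "level one\nmany levels\nsea level\nlevelling up\n";

// Substring match: "level" anywhere in the line, including inside "levels".
preg_match_all("/([^\n]*level.*)\n/Ui", $list, $partial);

// Whole-word match: \b requires a word boundary on both sides of "level".
preg_match_all("/([^\n]*\\blevel\\b.*)\n/Ui", $list, $whole);

print_r($partial[1]); // all four lines
print_r($whole[1]);   // only "level one" and "sea level"
```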

erikcw

12:49 am on Dec 31, 2004 (gmt 0)

10+ Year Member



That works great!

Now for a related question - how do I do a "wildcard" * search with this regular expression?

Thanks!

coopster

1:36 pm on Dec 31, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Not sure I follow here...?

You know, I should have dropped some links in here for you as well. There are quite a few links in Learning PHP - Books, Tutorials and Online Resources [webmasterworld.com] in the PHP Forum Library [webmasterworld.com]. They cover regular expression basics, etc.

Back to your question though -- the period (.) will match any character except newline (by default). You can then follow the period with a quantifier such as the asterisk (*, zero or more) or the plus sign (+, one or more).
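
If the question is about letting the person searching type * as a wildcard, one possible reading can be sketched as below. This is an assumption about what was being asked: the term is escaped with preg_quote(), and each escaped \* is then turned into [^\n]*, so the wildcard never runs past the end of a line. The function name wildcardLines() is hypothetical.

```php
<?php
// Hypothetical helper: return the lines of $list that contain $term,
// where "*" in $term stands for any run of characters on the same line.
function wildcardLines(string $list, string $term): array
{
    $list = str_replace(["\r\n", "\r"], "\n", $list);

    // Escape everything first; preg_quote() turns "*" into "\*".
    $quoted = preg_quote($term, '/');

    // Swap each escaped asterisk for "any run of non-newline characters".
    $quoted = str_replace('\*', '[^\n]*', $quoted);

    preg_match_all('/^.*' . $quoted . '.*$/mi', $list, $matches);
    return $matches[0];
}

$list = "exercise bike\nexercise bicycle\nworkout bike\nstep machine";
print_r(wildcardLines($list, 'ex*bi'));  // both "exercise ..." lines
print_r(wildcardLines($list, 'w*bike')); // "workout bike"
```

Escaping first and substituting afterwards means any other special characters in the search term stay literal; only the asterisk is given wildcard meaning.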