best way to search a string for a bunch of other strings

any nice fast function, or am I going to have to loop and preg?

         

mincklerstraat

1:27 pm on Jan 11, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's 2PM here and my brain still doesn't seem to have shown up for work yet, it's doing absolutely nothing with this seemingly easy question, hopefully someone here can help.

I've got this array of strings, you see, and I need to keep the ones which contain any of a number of other strings. I'm thinking, I need something like:


$needlearray = array('peanut', 'butter', 'jelly');
foreach ($haystackarray as $k => $v) {
    if (strpos($v, $needlearray) !== false) $newarray[] = $v;
}

where strpos accepts an array as its needle arg (which it doesn't). Wouldn't one think there'd be some nice strposneedlearray() function, sweet and jiffy-quick?

Instead, I think I'm looking at either a double loop, looping through the haystackarray inside the needle array, or else going through the haystack array like:


foreach ($haystackarray as $k => $v) {
    if (preg_match('#(peanut|butter|jelly)#', $v)) $newarray[] = $v;
}

Looping through a processor-eating regex to do this easy thing just seems atrocious. Someone give this worthless soggy bag of grey wrinkles a kick in the right direction please.
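For reference, the double-loop option can be wrapped in a small helper so the outer loop reads much like the wished-for version. This is only a minimal sketch: `contains_any()` is a made-up name, not a built-in, and the sample haystack values are invented for illustration. Note that `strpos()` returns 0 for a match at the very start of the string, so the check has to compare against `false` explicitly.

```php
<?php
// Hypothetical helper (not a PHP built-in): true if $haystack contains
// at least one of the strings in $needles.
function contains_any($haystack, $needles)
{
    foreach ($needles as $needle) {
        // strpos() returns 0 for a match at position 0, so test against false.
        if (strpos($haystack, $needle) !== false) {
            return true; // stop at the first hit
        }
    }
    return false;
}

$needlearray   = array('peanut', 'butter', 'jelly');
// Invented sample data, only for illustration.
$haystackarray = array('peanut brittle', 'plain toast', 'grape jelly', 'buttermilk');

$newarray = array();
foreach ($haystackarray as $v) {
    if (contains_any($v, $needlearray)) {
        $newarray[] = $v;
    }
}

print_r($newarray);
```

The early `return true` means each haystack string is only scanned until its first matching needle.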

jatar_k

5:47 pm on Jan 11, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I rolled this around in my head for the last hour or so and checked out a few functions. I worked out about 12 ways to do it with exact matches, but with partial matches inside strings I think you are stuck with the 2 options you mentioned.

I would time the 2 and take the quicker one but I would think the double loop should be faster. It may depend on the number of elements in each array though.
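A rough way to time the two options side by side might look like the sketch below. Everything in it is an assumption for illustration: the function names, the generated test data, and the use of `microtime(true)`, which needs PHP 5. The regex is built from the needle array with `preg_quote()` so that needles containing metacharacters stay literal. Real numbers will depend on the array sizes and the server, so treat any single run with suspicion.

```php
<?php
// Build one alternation pattern from the needles. preg_quote() escapes
// regex metacharacters and, via its second argument, the '#' delimiter.
function build_pattern($needles)
{
    $quoted = array();
    foreach ($needles as $needle) {
        $quoted[] = preg_quote($needle, '#');
    }
    return '#' . implode('|', $quoted) . '#';
}

// Option 1: one regex per haystack element.
function filter_with_regex($haystacks, $needles)
{
    $pattern = build_pattern($needles);
    $kept = array();
    foreach ($haystacks as $v) {
        if (preg_match($pattern, $v)) {
            $kept[] = $v;
        }
    }
    return $kept;
}

// Option 2: double loop with strpos().
function filter_with_loops($haystacks, $needles)
{
    $kept = array();
    foreach ($haystacks as $v) {
        foreach ($needles as $needle) {
            if (strpos($v, $needle) !== false) {
                $kept[] = $v;
                break; // one hit is enough for this haystack
            }
        }
    }
    return $kept;
}

// Invented test data: every third element contains a needle.
$needles = array('peanut', 'butter', 'jelly');
$haystacks = array();
for ($i = 0; $i < 10000; $i++) {
    $haystacks[] = ($i % 3 == 0) ? "jar of jelly $i" : "plain toast $i";
}

$start = microtime(true);
$byRegex = filter_with_regex($haystacks, $needles);
$regexSeconds = microtime(true) - $start;

$start = microtime(true);
$byLoops = filter_with_loops($haystacks, $needles);
$loopSeconds = microtime(true) - $start;

printf("regex: %.4fs, loops: %.4fs, kept: %d\n", $regexSeconds, $loopSeconds, count($byLoops));
```

Both functions should keep exactly the same elements, which is worth checking before trusting either timing.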

mincklerstraat

6:07 pm on Jan 11, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks a lot for rolling this one around in your head, jatar. I was beginning to feel like a complete moron; feeling now more just like a partial moron. I'll go this route and wish for the best - I already know that the regex version won't *utterly crash* my system, but people who use this script probably aren't spoiling their php webservers rotten with all the resources I'm lavishing mine with. Will give the double loop a go too - good chance it will be faster.

jatar_k

6:18 pm on Jan 11, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I was looking at something like array_walk_recursive [ca.php.net] if you have PHP 5.

The idea is to have the callback compare a single array element against all the needles, then move on to the next element. It might speed things up a bit since the walking is done by a native function.

I was also thinking of ways for array intersects and all that but it doesn't work with substrings that I could think of.
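As a sketch of the `array_walk_recursive()` idea: the callback below checks each leaf value against every needle and collects the ones that match. The function name `find_matching_leaves()` and the sample data are invented, and the closure syntax assumes PHP 5.3 or later, so this would not fit a PHP 4 compatibility target as written.

```php
<?php
// Collect every string leaf of a (possibly nested) array that contains
// at least one needle. Assumes PHP 5.3+ for the closure syntax.
function find_matching_leaves($haystacks, $needles)
{
    $matches = array();
    array_walk_recursive($haystacks, function ($value, $key) use ($needles, &$matches) {
        if (!is_string($value)) {
            return; // skip numbers, booleans, etc.
        }
        foreach ($needles as $needle) {
            if (strpos($value, $needle) !== false) {
                $matches[] = $value;
                return; // first matching needle is enough
            }
        }
    });
    return $matches;
}

// Invented nested sample data.
$nested = array(
    'peanut brittle',
    array('plain toast', array('grape jelly')),
    42,
);

print_r(find_matching_leaves($nested, array('peanut', 'butter', 'jelly')));
```

The inner needle loop is still there, so this mainly moves the outer traversal into native code and handles nesting for free.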

mincklerstraat

6:59 pm on Jan 11, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hey, good guess re. recursivity and array intersects, I'd been wanting that, but decided to go simpler for this particular function and keep to first-level elements in arrays - that also solves the array intersects problem. Script in question is more of a utility so it can be on the heavier side, usually no concurrent users, but still, fast is nice and there's other heavy stuff on this script that silly users could crash their systems with if they do too many things at once.

Regarding array intersects, one awful-awful kludgy approach I'd thought of just uses PHP's own function print_r, which deals nicely with (infinite) recursion (I forget as of which version - I remember back 'round 4.0.6 either print_r or var_dump would crash if you used $GLOBALS as an argument, and I'm hoping 4.1.0 got this solved, since that's my backwards-compat goal, so I can't use 5). If you really, absolutely, necessarily must search the GLOBAL variables for something, you can use a buffered print_r (or print_r($var, true) as of 4.3) and PHP takes care of recursion and other problems for you. You are then, of course, saddled with the nasty task of parsing all this output to find the parent elements, but at least only in those cases where the search string has actually been found (if it's not a situation where you need to find lots of multiple instances). In my case, only partially parsing the output would probably be enough; let the user figure the rest out.
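A minimal sketch of that print_r approach, with partial parsing only: it captures the dump as a string, then returns the dump lines that mention the search string. The function name `search_dump()` and the sample array are made up, and the `print_r($var, true)` form assumes PHP 4.3 or later (older versions would need output buffering instead). It does not reconstruct parent keys, and it will also match on key names and split multi-line string values across lines.

```php
<?php
// Return the lines of print_r() output that mention $needle.
function search_dump($var, $needle)
{
    // Second argument makes print_r() return the text instead of printing it.
    $dump = print_r($var, true);
    $hits = array();
    foreach (explode("\n", $dump) as $line) {
        if (strpos($line, $needle) !== false) {
            $hits[] = trim($line); // drop the nesting indentation
        }
    }
    return $hits;
}

// Invented sample data.
$data = array(
    'lunch' => 'peanut butter',
    'shelf' => array('top' => 'jelly roll', 'bottom' => 'plain toast'),
);

print_r(search_dump($data, 'jelly'));
```

Each hit comes back in print_r's own `[key] => value` form, which may be enough for a user to track down the rest by hand.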