Forum Moderators: coopster

Message Too Old, No Replies

Do a db search and rank it based on # of category matches?

Is this the best way to do this?

         

HughMungus

8:06 pm on Apr 19, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a database full of affiliates. Each affiliate's listing has six category fields that indicate what they sell (c1, c2, c3, etc. which are filled in with values like ridgets, gidgets, bidgets, etc.).

Currently I can do a db search that turns up all affiliates that match for each individual category (so, for example, if an affiliate's db listing includes ridgets as one of the six category fields and i search for ridgets using the site's search, i get a list of all affiliates who sell ridgets). That's easy.

Now I want to make it more complicated such that if someone searches for multiple product categories that the only affiliates that show up are the affiliates that match by product category for all the search terms (e.g., a search for ridgets AND gidgets shows affiliates that sell both).

I know how to do this type of multiple-keyword search, but, what I'm wondering is what is the best way to do a search for an arbitrary number of search terms across a list of category fields in an arbitrary order (since ridgets might be c1 on one listing and in field c2 on another and the same for gidgets, etc).

So I was thinking that I would first setup a for/each loop where each word in the search term is matched to the category fields to create a list of affiliates that match based on each word in the search term and then merge these arrays together and do a search on the array to find affiliates that have all the words in the search term as one of its category fields.

But just now I thought I could do a tiered search in php/mysql where the first sql statement checks for the first search term, the second searches for the second, and so on and whenever a query fails to find the "next" word in the searh term, that loop ends and the affiliate listing is not added to the list of affiliates so that the end result is a list of affiliates that "passed" all the search word/category checks.

Obviously the problem here is an arbitrary number of search terms being searched for in category fields where the search word(s) might appear in any order in the six category fields...and I'm trying to do this without creating a bazillion different sql queries to be run thru.

Any guidance on the best way to accomplish something like this?

jusdrum

9:05 pm on Apr 19, 2005 (gmt 0)

10+ Year Member



Does your app require that you have 6 separate fields for these items? If not, try combining them all into one field, and separate each keyword by commas. Then, use MySQL's
regexp
function to match keywords. An example (you'll probably have to modify it for your situation):

select * from table where fieldname regexp 'ridget¦gidget¦bidget';

All you have to do to build the list of words to search for is to

explode()
input, normalize, then
implode()
with a
¦
symbol.

$Input = "ridget bidget";
$Words = explode(" ",$Input);
$Words = array_map('strip_tags',$Words); // clean all HTML tags, for example of normalizing
$Search = implode("¦",$Words);

$SqlStmt = "select * from table where fieldname regexp '$Search';

Hope this helps!

HughMungus

5:40 am on Apr 20, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm having a problem with this sql statement. It's returning affiliates that have either/or not both terms...is it me?

But thanks for the reply. Rethinking this in terms of regular expressions instead of "=" or "like".

jusdrum

4:39 pm on Apr 20, 2005 (gmt 0)

10+ Year Member



You can group the results by ID, if you have a unique ID for each record.

MamaDawg

5:36 pm on Apr 20, 2005 (gmt 0)

10+ Year Member



Now I want to make it more complicated such that if someone searches for multiple product categories that the only affiliates that show up are the affiliates that match by product category for all the search terms (e.g., a search for ridgets AND gidgets shows affiliates that sell both).

Have you thought of breaking out the categories into their own table with just category and affiliate ID columns? Then you can "select distinct" from the category table joined to the affiliate table to get the data you need. Additional benefit - unlimited categories per affiliate ...

HughMungus

6:27 pm on Apr 20, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for all the suggestions.

What I ended-up doing is kinda clunky but it works.

I do a search for the first word in the search phrase. Then I take that result set and using PHP for/each to do a comparison on all the words in the search phrase. So if I get a result set that includes all the affiliates that have "ridget" in them, I take that data and do the for/each for all the search terms to then find "gidget" and "bidget" and add every match to a single array. If I know there are X number of words in the search phrase, all I have to do is do a COUNT on the array to find which affiliates appear X times in the array. This prevents me from both having to go back to the database multiple times AND prevents me from having to do an in_array or strpos for every single listing.

However, since I first posted this question, I have learned how to use regular expressions more effectively and that has helped A LOT. Thanks!

HughMungus

6:30 pm on Apr 20, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Have you thought of breaking out the categories into their own table with just category and affiliate ID columns? Then you can "select distinct" from the category table joined to the affiliate table to get the data you need. Additional benefit - unlimited categories per affiliate ...

Actually, since I know how to do regexp now, I'm thinking about going back to a single field with the categories. The reason I broke the categories column out into individual category columns is so that I could use ENUM to limit the cateogry names to a specific list (so that I wasn't using different names to describe the same types of affiliates). But I could also do that by using a web page interface/drop down form field for the data entry instead of direct db access...hmm....

HughMungus

9:35 pm on Apr 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Update: Decided to use a combination of a single comma-delimited category field and a sql query that uses MATCH/AGAINST to get a list of results ranked by relevance.

Thanks for the advice. I learned a lot from asking and seeing the answers.