I'm trying to create a simple-but-effective search algorithm for my website. First, a little background on what this service will interact with:
Our site provides online training, and users each have their own file area where they can store files that they have created as a result of their training exercises. Whenever they submit a file, they must supply a title, plus a short(ish) description. These two elements are stored in a (SQL) database, along with a lot of other separate values relevant to their training programme.
Now, so far I can extract all matching records from the SQL database based on what dropdown box selections the user has made in the "Search Files" section. But - and this is the tricky bit - they can complement their dropdown box selections by completing a "Search Words" text box.
What I would like to do is rank the corresponding results based on each file's relevance to whatever was entered into the "Search Words" form field. Below is the pseudo-code for what I'm planning to implement, and I'd like some feedback on what other users here at WebmasterWorld think:
getResults query loop -> step X
-- count instances of keyword in file_desc -> store result in keywordCount variable
-- divide keywordCount by (number of space chars in file_desc + 1) -> store result in fileRelevancy variable
-- store filename, file_userpin and fileRelevancy as one row of a 2-dimensional array (rankingArray)
next step until getResults.recordsLeft is 0
rankingArray query loop -> step X
-- find highest value fileRelevancy property
-- output highest ranking result to screen -> delete it from array
next step until rankingArray.recordsLeft is 0
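To make the idea concrete, here is a rough Python sketch of the two loops. It is only an illustration under assumptions: the function names, the dictionary layout of each record, and the case-insensitive substring match are all hypothetical choices of mine, not something the site already does. It scores each file as the keyword count divided by the estimated word count, so a higher score means a more relevant file and a description with no matches simply scores 0 without any division by zero. The "find highest, output, delete" loop is replaced by a single sort, which gives the same ordering.

```python
def count_keyword(text, keyword):
    """Count case-insensitive, non-overlapping occurrences of keyword in text."""
    if not keyword:
        return 0  # an empty search box should not match anything
    return text.lower().count(keyword.lower())


def file_relevancy(file_desc, keyword):
    """Keyword frequency: matches divided by an estimated word count."""
    # Quick-and-dirty word estimate: number of spaces, plus one (assumption).
    word_count = file_desc.count(" ") + 1
    return count_keyword(file_desc, keyword) / word_count


def rank_results(records, keyword):
    """Return (filename, file_userpin, fileRelevancy) rows, best match first."""
    ranking = [
        (rec["filename"], rec["file_userpin"],
         file_relevancy(rec["file_desc"], keyword))
        for rec in records
    ]
    # Sorting once replaces repeatedly finding and deleting the highest row.
    return sorted(ranking, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical rows standing in for the getResults query.
    sample = [
        {"filename": "a.doc", "file_userpin": 101,
         "file_desc": "intro to spreadsheets"},
        {"filename": "b.xls", "file_userpin": 102,
         "file_desc": "budget plan"},
        {"filename": "c.xls", "file_userpin": 103,
         "file_desc": "my budget and your budget notes"},
    ]
    for row in rank_results(sample, "budget"):
        print(row)
```

One thing this sketch makes obvious: a plain substring count also matches the keyword inside longer words, so whether that is acceptable depends on how strict the ranking needs to be.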
One thing you might immediately question is "why count the number of spaces in the file_desc field?" Well, this is just my Quick 'n' Dirty (tm) way of finding out how many words are in the description field - then I use the result to calculate keyword weight/frequency.
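For comparison, here is a small Python sketch of the space-counting trick next to a whitespace split. Both helper names are hypothetical, and the "+ 1" on the space count is my own assumption (N words are separated by N - 1 spaces). The two agree on tidy single-spaced text, but can drift apart when a description contains line breaks or other whitespace.

```python
def words_by_spaces(text):
    """Quick-and-dirty estimate: count space characters, plus one."""
    return text.count(" ") + 1


def words_by_split(text):
    """Split on any run of whitespace (spaces, tabs, newlines)."""
    return len(text.split())


if __name__ == "__main__":
    tidy = "one two three"
    multiline = "Budget plan\nfor Q3"  # a description containing a line break
    for desc in (tidy, multiline):
        print(repr(desc), words_by_spaces(desc), words_by_split(desc))
```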
What do you think? Any and all comments greatly welcomed, and thanks in advance.