I have a basic PHP/MySQL search engine on my site that allows users to search through directory listings.
At present it often returns no results for searches in the plural, searches containing punctuation, and so on. It requires users to enter the exact phrase to get a match. It basically uses the MySQL LIKE clause to get a result:
SELECT data FROM tablename WHERE field LIKE '%searchstring%'
Clearly this approach isn't very effective. I was wondering if anyone had any simple ideas that I can implement to get better results.
Thanks
URL (varchar)
title (varchar)
metakeys (varchar)
bodytext (longtext)
Then it gets a little complicated - you'll want to familiarize yourself with the MATCH AGAINST functions in MySQL.
Create a FULLTEXT index in the database with ALTER TABLE:
ALTER TABLE tablename ADD FULLTEXT (title, metakeys, bodytext);
I too have been learning the more advanced functions in MySQL, and this manual [dev.mysql.com] hasn't been very helpful. However, this [dev.mysql.com] is the page which describes the functions you'll need.
SELECT * FROM articles
WHERE MATCH (title,body) AGAINST ('search string');
The MATCH() function performs a natural language search for a string against a text collection. A collection is a set of one or more columns included in a FULLTEXT index. The search string is given as the argument to AGAINST(). The search is performed in case-insensitive fashion. For every row in the table, MATCH() returns a relevance value, that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.
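For what it's worth, here is a rough PHP sketch of running that kind of query against the table layout above. The table name `listings` and the helper names `buildFulltextQuery()` and `searchListings()` are made up for illustration, and it assumes a PDO connection plus a FULLTEXT index covering exactly the columns passed in. The search string is bound as a parameter rather than pasted into the SQL.

```php
<?php
// Sketch only: `listings` and the helper names are assumptions, not from the thread.

/**
 * Build a natural-language FULLTEXT query with two positional placeholders.
 * The column list should match the columns of an existing FULLTEXT index.
 */
function buildFulltextQuery(string $table, array $columns): string
{
    if ($columns === []) {
        throw new InvalidArgumentException('At least one column is required');
    }
    // Identifiers cannot be bound as parameters, so restrict their shape.
    foreach (array_merge([$table], $columns) as $name) {
        if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $name)) {
            throw new InvalidArgumentException("Bad identifier: $name");
        }
    }
    $cols = implode(', ', $columns);

    return "SELECT *, MATCH ($cols) AGAINST (?) AS relevance"
         . " FROM $table"
         . " WHERE MATCH ($cols) AGAINST (?)"
         . " ORDER BY relevance DESC";
}

/** Run the search through PDO; the user's input is bound, never concatenated. */
function searchListings(PDO $pdo, string $term): array
{
    $sql  = buildFulltextQuery('listings', ['title', 'metakeys', 'bodytext']);
    $stmt = $pdo->prepare($sql);
    $stmt->execute([$term, $term]); // same term for the SELECT and the WHERE

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Selecting the MATCH() value as `relevance` lets you show or sort by the score that the manual excerpt describes.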
If you want to add a little fuzziness to accept a small set of near-matching strings, you can sort hits by their levenshtein [ca3.php.net] distance to the search string. This catches a lot of verb tenses for longer words, such as "incorporate/incorporated", where the two words are spelled similarly.
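A minimal sketch of that sorting step, using PHP's built-in levenshtein(). The function name `sortByLevenshtein()` is just for illustration, and it assumes the hits are already in a plain array of strings (titles, say).

```php
<?php
/**
 * Order candidate strings by Levenshtein distance to the query, closest first.
 * Comparison is case-insensitive; the original strings are returned unchanged.
 */
function sortByLevenshtein(string $query, array $candidates): array
{
    $query = strtolower($query);
    usort($candidates, function (string $a, string $b) use ($query): int {
        return levenshtein($query, strtolower($a))
           <=> levenshtein($query, strtolower($b));
    });

    return $candidates;
}

// Usage: 'incorporated' is one edit away from 'incorporate', so it sorts first.
// $sorted = sortByLevenshtein('incorporate', ['income', 'corporate', 'incorporated']);
```

Keep in mind that older PHP versions limit levenshtein() to strings of 255 characters or fewer, so this suits words and short titles rather than body text.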
good luck
I have been using Swish-e for three years and am very happy with it. It also comes with a PHP extension.
[swish-e.org...]
Hope that helps,
Saurabh.
One band-aid solution I implemented on one site was a "did you mean..." feature, where if a search resulted in 0 hits, the user was prompted with similar words that do exist.
For instance, if I search for "happyness", the result is:
You searched for: happyness
No results were found.
Did you mean... happiness?
The same thing would work for many verb tenses and grammatical endings, but it won't find synonyms :-)
To accomplish that, I used a spider to gather a lexicon of unique words used in the site. My progress was discussed here [webmasterworld.com]. Hint: put a UNIQUE index on the "word" column.
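To give an idea of the lexicon step, here is a rough sketch, not my actual spider code. The names `countWords()` and `saveLexicon()` and the `lexicon (word, weight)` table are assumptions made up for the example, and it only handles plain ASCII letters. The UNIQUE index on `word` is what makes the ON DUPLICATE KEY UPDATE part work.

```php
<?php
/**
 * Count how often each word occurs in a chunk of page text.
 * Returns word => count. ASCII letters only, a simplifying assumption.
 */
function countWords(string $text): array
{
    preg_match_all('/[a-z]+/', strtolower($text), $matches);

    return array_count_values($matches[0]);
}

/**
 * Store the counts in a hypothetical `lexicon` table. Relies on a UNIQUE
 * index on lexicon.word, so a repeated word bumps its weight instead of
 * inserting a duplicate row.
 */
function saveLexicon(PDO $pdo, array $counts): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO lexicon (word, weight) VALUES (?, ?)
         ON DUPLICATE KEY UPDATE weight = weight + ?'
    );
    foreach ($counts as $word => $count) {
        $stmt->execute([$word, $count, $count]);
    }
}
```

You would call countWords() on the text of each spidered page and pass the result to saveLexicon().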
So when your search gets 0 results, you look in the database lexicon for similar words. The "similarity" is calculated using the Levenshtein algorithm. Ties are broken using a "weight", which is the number of times the word was found.
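The lookup could be sketched like this. It assumes the lexicon has already been loaded into a PHP array of word => weight. The function name `suggestWord()` and the cut-off of 2 edits are choices made for the example, not part of the original feature.

```php
<?php
/**
 * Pick the lexicon word closest to $term by Levenshtein distance.
 * $lexicon maps word => weight (how often the word was found on the site).
 * Ties on distance go to the higher weight. Returns null when no word is
 * within $maxDistance edits.
 */
function suggestWord(string $term, array $lexicon, int $maxDistance = 2): ?string
{
    $term         = strtolower($term);
    $best         = null;
    $bestDistance = PHP_INT_MAX;
    $bestWeight   = -1;

    foreach ($lexicon as $word => $weight) {
        $word     = (string) $word;
        $distance = levenshtein($term, $word);
        if ($distance > $maxDistance) {
            continue; // too different to be a useful suggestion
        }
        $closer = $distance < $bestDistance;
        $tieWon = $distance === $bestDistance && $weight > $bestWeight;
        if ($closer || $tieWon) {
            $best         = $word;
            $bestDistance = $distance;
            $bestWeight   = $weight;
        }
    }

    return $best;
}

// Usage: suggestWord('happyness', ['happiness' => 12, 'harness' => 4])
// returns 'happiness', which is a single substitution away.
```

Scanning the whole lexicon in PHP is fine for a small site. For a large lexicon you would probably want to narrow the candidates in SQL first, for example by word length or first letter.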
good luck!