| 5:53 pm on Aug 3, 2011 (gmt 0)|
We use SQL Server Full-Text Search, which comes free with SQL Server Express (it is in the "with Advanced Services" download).
The issue isn't just misspellings, it's also plurals and word-ordering. You may sell a:
"Large Blue Widget",
...but if a customer searches for:
"Large Blue Widgets", or
"Large Widgets", or
"Blue Large Widget"
... your search *must* return your "Large Blue Widget" in the results. A simple SQL LIKE search will fail with these three examples.
Give SQL Server Full-Text Search a try.
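To make the three failing cases concrete, here is a rough Python sketch. It is not how SQL Server Full-Text Search works internally; it only contrasts a LIKE-style substring test with a word-based match that ignores order and a trailing "s". The function names and the crude plural rule are my own assumptions for illustration.

```python
import re

def normalize(text):
    """Lowercase, split into words, and strip a trailing 's' (a crude plural rule)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words}

def like_match(product, query):
    """Roughly what LIKE '%query%' does: a case-insensitive substring test."""
    return query.lower() in product.lower()

def token_match(product, query):
    """Match when every normalized query word occurs in the product name."""
    return normalize(query) <= normalize(product)

product = "Large Blue Widget"
for query in ("Large Blue Widgets", "Large Widgets", "Blue Large Widget"):
    # The substring test misses all three; the word-based test finds all three.
    print(query, like_match(product, query), token_match(product, query))
```

The plural rule here is deliberately naive (it would mangle a word like "glass"), which is one reason to lean on a real full-text engine with proper word-form handling instead.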
| 3:01 pm on Aug 4, 2011 (gmt 0)|
I had a similar problem and built a fuzzy search using an adaptation of the Damerau-Levenshtein algorithm. All the words that occur in product names are stored in a database. When someone types a word into the search box, it is compared with the words in the database, and candidate matches are returned in order of Damerau-Levenshtein distance. If a customer types "bleu wodgat" instead of "blue widget", they still get a result. The fuzzy search logic has considerably increased conversions on my website.
It works for me because I have only a little more than a thousand products in my database, so the speed is acceptable. I also cache the search results, so if a second customer makes the same typo I do not need to run the algorithm again.
If you have a large number of products or lack the expertise to build your own solution, I would consider outsourcing search to a third-party service like "Exorbyte". You provide a feed of all your products, and the search is processed entirely on the third party's server.
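For anyone who wants to try the do-it-yourself route from this post, here is a minimal Python sketch. It is not the poster's actual code: it uses the "optimal string alignment" variant of Damerau-Levenshtein, and the word list, the `max_distance` cutoff and the cache size are all made-up assumptions.

```python
from functools import lru_cache

def osa_distance(a, b):
    """Edit distance counting insert, delete, substitute and adjacent transposition."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # swap of two neighbours
    return d[len(a)][len(b)]

# Hypothetical list of every word that occurs in a product name.
VOCABULARY = ("blue", "large", "widget", "gadget", "small")

@lru_cache(maxsize=10000)  # a repeated typo is answered from the cache
def suggest(word, max_distance=2):
    """Return vocabulary words within max_distance, closest first."""
    scored = sorted((osa_distance(word.lower(), v), v) for v in VOCABULARY)
    return tuple(v for dist, v in scored if dist <= max_distance)

print([suggest(w) for w in "bleu wodgat".split()])
```

Every lookup compares the typed word against the whole vocabulary, so the cost grows with the number of distinct words. That is why it is fine for around a thousand products and why caching repeated typos helps.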
| 5:24 pm on Aug 4, 2011 (gmt 0)|
A quick and dirty method...
I have a custom field for my products where I keep common misspellings. This custom field is searchable through my ecommerce search field.
Then, every month or so, I look through the search logs to see whether my customers are typing in any common misspellings. If so, I add those to the custom field of the appropriate products.
I don't have a lot of products, so I should warn you that if you have LOTS of products this can get unwieldy pretty quickly.
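In case it helps, the idea can be sketched in Python roughly as follows. The product records, the field name `misspellings` and the log format are all hypothetical; a real shop would do this in its ecommerce platform's own search and reporting.

```python
from collections import Counter

# Hypothetical catalog: each product carries a free-text field of known misspellings.
PRODUCTS = [
    {"name": "Large Blue Widget", "misspellings": "wiget widgit bleu"},
    {"name": "Small Red Gadget", "misspellings": "gaget gadjet"},
]

def search(query):
    """Return names of products whose name or misspellings field has every query word."""
    words = query.lower().split()
    results = []
    for product in PRODUCTS:
        haystack = (product["name"] + " " + product["misspellings"]).lower().split()
        if all(w in haystack for w in words):
            results.append(product["name"])
    return results

def frequent_misses(search_log, min_count=2):
    """List logged queries that found nothing, most frequent first."""
    misses = Counter(q.lower() for q in search_log if not search(q))
    return [(q, n) for q, n in misses.most_common() if n >= min_count]

log = ["wdget", "wdget", "gaget", "blue wijet"]  # made-up month of searches
print(search("bleu wiget"))
print(frequent_misses(log))
```

The monthly review is then just reading the output of `frequent_misses` and deciding by hand which product each misspelling belongs to.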
| 1:09 am on Aug 5, 2011 (gmt 0)|
Thanks, Levenshtein is indeed handy in many scenarios. The issue here is the enormous number of "words" (as he wants to spell check product codes too). It looks as though compromises are going to have to happen...
| 1:15 am on Sep 28, 2011 (gmt 0)|
There are two possible approaches to spelling issues in search:
- Soundex (you have to build an algorithm based on Soundex tables, which compare and rank various misspellings of the same sound for each language). Soundex works well with English names and words, but if you have a lot of foreign brands or words like "Lamborghini", the problems start.
- Levenshtein works much better for all languages and proper names, but good luck running Levenshtein on top of a database of more than a few hundred products in a timely manner. It's just a huge number of operations, queries and lookups. If you want to do Levenshtein on large product catalogs, you might want to check Exorbyte Commerce. That's the core of their system and they are affordable compared to building this on your own. They build an in-memory index of your catalog on the fly using product feeds.
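For reference, the Soundex option can be sketched in a few lines of Python. This is the standard American Soundex coding as I understand it, not any particular vendor's implementation, and the sample words are only for illustration.

```python
# Letter groups of American Soundex; vowels, h, w and y get no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """Return the four-character Soundex code, or '' if the word has no letters."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit  # new sound, so record it
        if ch not in "hw":  # h and w do not separate letters with the same code
            prev = digit
    return (code + "000")[:4]  # pad or cut to one letter plus three digits

for typed, wanted in (("bleu", "blue"), ("wodgat", "widget"), ("wiget", "widget")):
    print(typed, soundex(typed), wanted, soundex(wanted))
```

Two words match when their codes are equal, so a lookup is a single indexed comparison rather than a distance calculation per word. The flip side is that a dropped consonant changes the code, so a typo like "wiget" no longer matches "widget".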