Msg#: 4346964 posted 4:52 am on Aug 3, 2011 (gmt 0)
I've been asked to help a good friend with a spelling issue - his customers are leaving (or at best getting frustrated and phoning) due to poor spelling not being corrected (usually a misspelling leads to no results).
We've a lot of experience with spell checking for Yellow Pages type search but this is different to what we're used to. Thankfully I'm just giving suggestions rather than creating a full solution; but as usual with me I'm getting my hands dirty with the data :)
Do you/Did you have issues with spelling on your ecommerce site? How do you deal with it? Are there ready made solutions out there? (they do a wide variety of machinery parts so lots of jargon and strange but similar product codes).
A big problem is the sheer number of different "words" that can be correction candidates: millions of parts lead to tens of millions of strange character combinations.
Oh, and one other thing... they handle a lot of searches at a given time of day so speed of the solution is a consideration (to the point that if the spelling solution has to be on its own machine that'd be ok).
Msg#: 4346964 posted 3:01 pm on Aug 4, 2011 (gmt 0)
I had a similar problem and created a fuzzy search solution using an adaptation of the Damerau-Levenshtein algorithm. I have all words occurring in product names in a database, and when someone types a word into the search box its letters are compared with the words in the database and possible matches are returned in order of Damerau-Levenshtein distance. If a customer types in "bleu wodgat" instead of "blue widget" he still gets a result. The fuzzy search logic has considerably increased conversions on my website.
It works for me, since I only have a little more than a thousand products in my database, so the speed is acceptable. I also cache the search results, so if a second customer makes the same typo I do not need to run the algorithm again.
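For anyone wanting to try the approach described above, here is a minimal sketch in Python. It is not the poster's actual code: it uses the optimal string alignment variant of Damerau-Levenshtein, and `VOCABULARY`, `suggest` and the `max_distance` cutoff are made-up names and values for illustration. The `lru_cache` decorator stands in for the result cache mentioned above.

```python
from functools import lru_cache


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Hypothetical word list; in practice this would come from the product database.
VOCABULARY = ("blue", "widget", "gadget", "black")


@lru_cache(maxsize=10000)  # repeated typos are answered from the cache
def suggest(word: str, max_distance: int = 2) -> tuple:
    """Return vocabulary words within max_distance, closest first."""
    scored = sorted((osa_distance(word, v), v) for v in VOCABULARY)
    return tuple(v for dist, v in scored if dist <= max_distance)


print(suggest("bleu"), suggest("wodgat"))
```

Note that every lookup compares the query against the whole vocabulary, which is why this is fine for about a thousand products but gets slow for very large word lists.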
If you have a high number of products or lack the expertise to build your own custom solution, I would consider outsourcing search and using a third-party solution like "Exorbyte". There you provide a feed of all your products and the search is completely processed on a third-party server.
Msg#: 4346964 posted 5:24 pm on Aug 4, 2011 (gmt 0)
A quick and dirty method...
I have a custom field for my products where I keep common misspellings. This custom field is searchable through my ecommerce search field.
Then, about once a month, I look through the search log records and see if there are any common misspellings that my customers are typing in. If so, I add those to the custom field of the appropriate products.
I don't have a lot of products, so I should warn you that if you have LOTS of products this can get unwieldy pretty quickly.
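A rough sketch of that workflow in Python, in case it helps. Everything here is hypothetical (the `PRODUCTS` records, the `misspellings` field name, the `min_count` threshold); a real shop would use whatever custom field and search log its ecommerce platform provides.

```python
from collections import Counter

# Hypothetical product records with a searchable "misspellings" custom field.
PRODUCTS = [
    {"name": "blue widget", "misspellings": {"bleu widget", "blue wigdet"}},
    {"name": "hydraulic pump", "misspellings": {"hydrolic pump"}},
]


def search(query):
    """Match on a substring of the name or on an exact known misspelling."""
    q = query.strip().lower()
    return [p["name"] for p in PRODUCTS
            if q in p["name"] or q in p["misspellings"]]


def frequent_misses(logged_queries, min_count=2):
    """Count logged queries that return nothing, most frequent first."""
    misses = Counter(q.strip().lower() for q in logged_queries if not search(q))
    return [(q, n) for q, n in misses.most_common() if n >= min_count]


log = ["hydralic pump", "Hydralic Pump", "blue widget", "pmup"]
print(search("hydrolic pump"))
print(frequent_misses(log))
```

The queries reported by `frequent_misses` are the candidates to review by hand and add to the custom field of the matching product.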
Msg#: 4346964 posted 1:09 am on Aug 5, 2011 (gmt 0)
Thanks, Levenshtein is indeed handy in many scenarios. The issue here is the enormous number of "words" (as he wants to spell check product codes too). Looks as though compromises are going to have to happen...
Msg#: 4346964 posted 1:15 am on Sep 28, 2011 (gmt 0)
There are two possible approaches to spelling issues in search:
- Soundex (you have to build an algorithm based on Soundex tables, which compare and rank various misspellings of the same sound for each language). Soundex works well with English names and words, but if you have a lot of foreign brands or words like "Lamborghini", then problems start.
- Levenshtein works much better for all languages and proper names, but good luck running Levenshtein on top of a database of more than a few hundred products in a timely manner. It's just a huge number of operations, queries and lookups.
If you want to do Levenshtein on large product catalogs, you might want to check Exorbyte Commerce. That's the core of their system and they are affordable compared to building this on your own. They build an in-memory index of your catalog on the fly using product feeds.
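To make the Soundex option concrete, here is a small sketch of the classic American Soundex code in Python. It is only an illustration of the basic table, not the ranking scheme described above, and the `soundex` function and `CODES` table are names chosen for this example. Words that sound alike get the same four-character key, so a misspelling can be looked up by key instead of by exact spelling.

```python
# Letter groups of the classic American Soundex table.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex key; non-letters are ignored."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (key + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))
print(soundex("AB-1234"), soundex("AB-9999"))
```

Because digits are dropped, different product codes such as "AB-1234" and "AB-9999" end up with the same key in this sketch, so plain Soundex is unlikely to help with the product code part of the problem.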