
Forum Moderators: buckworks


Spelling Mistakes In Product Search

How do you handle it?

     
4:52 am on Aug 3, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 9, 2004
posts:1435
votes: 0


I've been asked to help a good friend with a spelling issue: his customers are leaving (or at best getting frustrated and phoning) because misspelled searches aren't corrected, and a misspelling usually returns no results.

We've a lot of experience with spell checking for Yellow Pages type search but this is different to what we're used to. Thankfully I'm just giving suggestions rather than creating a full solution; but as usual with me I'm getting my hands dirty with the data :)

Do you (or did you) have issues with spelling on your ecommerce site? How do you deal with it? Are there ready-made solutions out there? (They sell a wide variety of machinery parts, so there's lots of jargon and strange but similar product codes.)

A big problem is the sheer number of different "words" that can be correction candidates: millions of parts lead to tens of millions of strange character combinations.

Oh, and one other thing... they handle a lot of searches at any given time of day, so the speed of the solution is a consideration (to the point that running the spelling solution on its own machine would be OK).
5:53 pm on Aug 3, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 6, 2005
posts:670
votes: 0


We use SQL Server Full-Text Search, which is free with SQL Server Express (the "with Advanced Services" edition).

The issue isn't just misspellings, it's also plurals and word-ordering. You may sell a:

"Large Blue Widget",

...but if a customer searches for:

"Large Blue Widgets", or
"Large Widgets", or
"Blue Large Widget"

... your search *must* return your "Large Blue Widget" in the results. A simple SQL LIKE search will fail with these three examples.
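
To make those three failing cases concrete, here is a minimal Python sketch. It is not how SQL Server Full-Text Search works internally; it only illustrates order-insensitive word matching with a very crude plural rule, next to a rough stand-in for a LIKE search. The function names (normalize, matches, like_match) and the "strip a trailing s" rule are assumptions made for illustration.

```python
import re


def normalize(text):
    """Lowercase, split into words, and crudely strip a trailing 's'.

    The plural rule is an assumption for illustration; a real engine
    uses proper stemming or inflectional forms.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words}


def matches(query, product_name):
    """True when every query word appears in the product name, in any order."""
    query_words = normalize(query)
    return bool(query_words) and query_words <= normalize(product_name)


def like_match(query, product_name):
    """Rough stand-in for: product_name LIKE '%query%' (case-insensitive)."""
    return query.lower() in product_name.lower()


product = "Large Blue Widget"
for q in ("Large Blue Widgets", "Large Widgets", "Blue Large Widget"):
    print(q, "| LIKE:", like_match(q, product), "| words:", matches(q, product))
```

With the product name above, the LIKE-style check fails for all three queries while the word-based check finds the product each time.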

Give SQL Server Full-Text Search a try.
3:01 pm on Aug 4, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 8, 2003
posts:1141
votes: 0


I had a similar problem and created a fuzzy search solution using an adaptation of the Damerau-Levenshtein algorithm. I have all words occurring in product names in a database, and when someone types a word into the search box, its letters are compared with the words in the database and possible matches are returned in order of Damerau-Levenshtein distance. If a customer types in "bleu wodgat" instead of "blue widget" he still gets a result. The fuzzy search logic has considerably increased conversions on my website.

It works for me, since I only have a little more than a thousand products in my database, so the speed is acceptable. I also cache the search results, so if a second customer makes the same typo I do not need to run the algorithm again.
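
For anyone who wants to see the shape of this, below is a minimal Python sketch, not the actual adaptation described above. It uses the "optimal string alignment" variant of Damerau-Levenshtein (insert, delete, substitute, swap two adjacent letters), ranks a word list by distance, and caches results per typed word. The VOCABULARY tuple, the suggest function and the max_distance cut-off are hypothetical; in practice the word list would come from the product-name table.

```python
from functools import lru_cache


def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    prev2 = None                      # row i-2, needed for transpositions
    prev = list(range(len(b) + 1))    # row i-1
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,          # deletion
                         cur[j - 1] + 1,       # insertion
                         prev[j - 1] + cost)   # substitution or match
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                cur[j] = min(cur[j], prev2[j - 2] + 1)  # adjacent swap
        prev2, prev = prev, cur
    return prev[len(b)]


# Hypothetical word list standing in for "all words in product names".
VOCABULARY = ("blue", "widget", "large", "gadget")


@lru_cache(maxsize=10000)  # a repeated typo is answered from the cache
def suggest(word, max_distance=2):
    """Return vocabulary words within max_distance, closest first."""
    scored = sorted((osa_distance(word, v), v) for v in VOCABULARY)
    return tuple(v for d, v in scored if d <= max_distance)


print([suggest(w) for w in "bleu wodgat".split()])
# → [('blue',), ('widget',)]
```

Each lookup compares the typed word against every vocabulary word, which is why this stays comfortable at around a thousand products but gets slow as the word list grows.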

If you have a high number of products or lack the expertise to build your own custom solution, I would consider outsourcing search and using a third-party solution like "Exorbyte". There you provide a feed of all your products and the search is completely processed on a third-party server.
5:24 pm on Aug 4, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member planet13 (Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month)

joined:June 16, 2010
posts:3813
votes: 29


A quick and dirty method...

I have a custom field for my products where I keep common misspellings. This custom field is searchable through my ecommerce search field.

Then, every month or so, I look through the search logs to see whether there are any common misspellings that my customers are typing in. If so, I add those to the custom field of the appropriate products.

I don't have a lot of products, so I should warn you that if you have LOTS of products this can get unwieldy pretty quickly.
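
A rough Python sketch of that routine, assuming the products live in a simple list and the search log is a list of query strings. The PRODUCTS data, the "misspellings" field name and the frequent_misses helper are all made up for illustration; a real shop would read these from its ecommerce database.

```python
from collections import Counter

# Hypothetical catalog: each product carries a searchable misspellings field.
PRODUCTS = [
    {"name": "Blue Widget", "misspellings": ["bleu widget", "blue wigdet"]},
    {"name": "Hydraulic Pump", "misspellings": ["hydrolic pump"]},
]


def search(query):
    """Match the query against the product name or its listed misspellings."""
    q = query.strip().lower()
    if not q:
        return []
    return [p["name"] for p in PRODUCTS
            if q in p["name"].lower() or q in p["misspellings"]]


def frequent_misses(search_log, min_count=2):
    """Queries that returned nothing at least min_count times, most common first."""
    misses = Counter(q.strip().lower() for q in search_log if not search(q))
    return [q for q, n in misses.most_common() if n >= min_count]


log = ["hydrualic pump", "Hydrualic Pump", "blue widget", "pmup"]
print(search("hydrolic pump"))   # → ['Hydraulic Pump']
print(frequent_misses(log))      # → ['hydrualic pump']
```

The list from frequent_misses is the monthly to-do: decide which product each entry belongs to and add it to that product's misspellings field by hand.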
1:09 am on Aug 5, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 9, 2004
posts:1435
votes: 0


Thanks, Levenshtein is indeed handy in many scenarios - the issue here is the enormous number of "words" (as he wants to spell check product codes too). Looks as though compromises are going to have to happen...
1:15 am on Sept 28, 2011 (gmt 0)

New User

joined:Sept 27, 2011
posts: 4
votes: 0


There are two possible approaches to spelling issues in search:
- Soundex (you have to build an algorithm based on Soundex tables, which compare and rank various misspellings of the same sound for each language). Soundex works well with English names and words, but if you have a lot of foreign brands or words like "Lamborghini", then problems start.
- Levenshtein works much better for all languages and proper names, but good luck running Levenshtein on top of a database of more than a few hundred products in a timely manner. It's just a huge number of operations, queries and lookups. If you want to do Levenshtein on large product catalogs, you might want to check Exorbyte Commerce. That's the core of their system and they are affordable compared to building this on your own. They build an in-memory index of your catalog on the fly using product feeds.
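
For reference, the first approach in the list above can be sketched in a few lines of Python. This is the classic American Soundex for English only (first letter plus three digits), not a ranked multi-language table as described; the soundex function and CODES table here are an illustrative sketch, and non-letters are simply dropped.

```python
# Letter groups of classic American Soundex; vowels, y, h and w get no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex key, or '' if the word has no letters."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue                  # h and w do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != last:
            result += code            # new consonant sound
        last = code                   # a vowel resets the run ("" here)
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))   # → R163 R163
print(soundex("AB-1234"), soundex("AB-9876")) # → A100 A100
```

Two words are treated as "sounding alike" when their keys match. Because this sketch keys on letters only, part codes that differ only in their digits all collapse to the same key, so it is no help for spell checking product codes.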

Cheers,
Dan