Forum Moderators: coopster

Spelling suggestions for wrong spelling

         

knkk

12:01 pm on Jun 21, 2009 (gmt 0)

10+ Year Member



Hi,

I came across this perl script someone wrote for spelling suggestions when someone types a wrong spelling. Can someone please convert it to PHP? It will help a lot of people.

Source: <snip>

# Note: this script is Python, not Perl (adjusted here to run on Python 3).
import re, collections

def words(text):
    # Lowercase the text and extract runs of letters as words.
    return re.findall('[a-z]+', text.lower())

def train(features):
    # Count word occurrences; unseen words default to a count of 1.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

# big.txt is the training corpus; it must be in the working directory.
NWORDS = train(words(open('big.txt').read()))

alphabet = 'abcdefghijklmnopqrstuvwxyz'

def edits1(word):
    # All strings that are one edit away from word.
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def known_edits2(word):
    # Known words that are two edits away from word.
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

def known(words):
    return set(w for w in words if w in NWORDS)

def correct(word):
    # Prefer the word itself, then one-edit matches, then two-edit matches.
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=NWORDS.get)

[edited by: eelixduppy at 12:46 am (utc) on June 22, 2009]
[edit reason] No personal urls, thanks. [/edit]

knkk

3:15 pm on Jun 21, 2009 (gmt 0)

10+ Year Member



Just in case that looks like I am trying to get some kind-hearted people to do my work for me :), no, it's just that I have no idea of Perl at all (even if I am okay at PHP), and do not have enough time to learn now :(. I thank anyone for any help.

coopster

12:36 pm on Jun 22, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Why not just use the functions built in to PHP for spell checking?
Pspell [php.net]

jatar_k

1:46 pm on Jun 22, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Coop's suggestion is the one but for the sake of example you could also use things like

levenshtein() [ca3.php.net]
similar_text() [ca3.php.net]
metaphone() [ca3.php.net]
soundex() [ca3.php.net]
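
To illustrate the idea behind levenshtein(), here is a minimal sketch in Python (the language of the script posted above) that computes edit distance and picks the nearest entry from a list. The names edit_distance and closest, and the sample place names, are made up for illustration; they are not part of PHP or of the original script.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute (or match)
        prev = curr
    return prev[-1]


def closest(word, candidates):
    """Return the candidate with the smallest edit distance to word
    (the first one wins on ties)."""
    return min(candidates, key=lambda c: edit_distance(word, c))
```

For example, closest("dilshuknagar", ["ameerpet", "dilsukhnagar", "begumpet"]) returns "dilsukhnagar". In PHP the built-in levenshtein() would take the place of edit_distance.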

knkk

1:54 pm on Jun 22, 2009 (gmt 0)

10+ Year Member



Thanks, coopster and jatar_k! However, how do I recommend correct spellings for typos specific to my site that are not English words? I am most concerned about someone spelling dilsukhnagar as dilshuk nagar (a locality in the city I live in). My site is a local search site / local guide, and it searches a MySQL database where location names are entered with standard spellings. The solutions above seem to work only for words in the English dictionary.

jatar_k

2:07 pm on Jun 22, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



pspell does have settings for this: you can load specific dictionaries. I only skimmed the docs, so you would have to dig into it a little more. I think you can use custom word lists as well.
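
Independently of pspell, one way to handle site-specific names is to match the query against the site's own list of standard spellings. The following is only a sketch in Python using the standard difflib module, not pspell and not PHP. KNOWN_PLACES, suggest_place, and the 0.75 cutoff are assumptions chosen for illustration; in practice the list would come from the site's database.

```python
import difflib

# Hypothetical sample list; a real site would load its standard
# spellings from the database instead.
KNOWN_PLACES = ["dilsukhnagar", "ameerpet", "begumpet", "secunderabad"]


def suggest_place(query, places=KNOWN_PLACES, cutoff=0.75):
    """Return the best-matching known place for query, or None.

    The query is lowercased and its whitespace removed first, so that
    "dilshuk nagar" is compared as "dilshuknagar".
    """
    normalized = "".join(query.lower().split())
    if normalized in places:
        return normalized  # already a standard spelling
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below cutoff.
    matches = difflib.get_close_matches(normalized, places, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this sample list, suggest_place("dilshuk nagar") returns "dilsukhnagar", and a query that resembles nothing in the list returns None. The cutoff is a tuning knob: lower values suggest more aggressively and risk wrong suggestions.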