SincerelySandy - 3:00 pm on Feb 17, 2007 (gmt 0)
The classification of a possible phrase as either a good phrase or a bad phrase depends on whether the possible phrase "appears in a minimum number of documents, and appears a minimum number of instances in the document collection."
I thought that a "good phrase" was classified as a phrase that could be used to predict the occurrence of other phrases. You seem to be saying that a "good phrase" is determined by the number of times that phrase appears in various places. Am I understanding you correctly?
a BAD phrase is not one with dirty words, it is simply a phrase with too low a frequency count to make the "good" list.
I thought that a "bad phrase" was simply one that could not be used to predict the occurrence of other phrases?
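To make the frequency-based reading in the quoted text concrete, here is a rough sketch of what "appears in a minimum number of documents, and a minimum number of instances" might look like in code. The function name, the threshold values, and the sample documents are all made up for illustration; the thread gives no actual numbers. It does not cover the "predicts other phrases" reading, which would be a separate test.

```python
from collections import Counter

# Hypothetical thresholds; the thread does not state real values.
MIN_DOCS = 2
MIN_INSTANCES = 3

def classify_by_frequency(docs, candidates,
                          min_docs=MIN_DOCS, min_instances=MIN_INSTANCES):
    """Label each candidate phrase 'good' or 'bad' using frequency counts only."""
    doc_count = Counter()       # number of documents containing the phrase
    instance_count = Counter()  # total occurrences across the collection
    for text in docs:
        lowered = text.lower()
        for phrase in candidates:
            n = lowered.count(phrase.lower())
            if n:
                doc_count[phrase] += 1
                instance_count[phrase] += n
    return {
        p: "good"
        if doc_count[p] >= min_docs and instance_count[p] >= min_instances
        else "bad"
        for p in candidates
    }

# Made-up sample collection.
docs = [
    "the blue widget is a cheap blue widget",
    "buy a blue widget today",
    "a red gadget is not a widget",
]
labels = classify_by_frequency(docs, ["blue widget", "red gadget"])
print(labels)
```

Under these assumed thresholds, "blue widget" (2 documents, 3 instances) lands on the good list, and "red gadget" (1 document, 1 instance) is "bad" purely because its counts are too low.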