TheOptimizationIdiot - 4:42 pm on Feb 27, 2013 (gmt 0)
4. The method of claim 1, wherein identifying the document as a spam document, further comprises: responsive to the actual number of related phrases present in the document for at least one phrase exceeding the expected number of related phrases by at least a multiple of a standard deviation of the expected number of related phrases, identifying the document as a spam document.
Based on the preceding, I doubt it's exactly "conform to the norm", since the actual number of related phrases has to exceed the expected number by at least a multiple of the standard deviation before the document is considered spam. The quotes from earlier were to point out what is actually taken into account, which is much more than most realize.
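To make the threshold in claim 4 concrete, here's a rough sketch of how I read it. This is only my interpretation, not anything from the patent itself: the function names, the input layout and the default multiple of 2.0 are all made up for illustration, and the claim doesn't say what multiple is actually used.

```python
# Hypothetical sketch of the claim 4 test. Names and the default
# multiple are assumptions; the claim only says "a multiple".

def exceeds_spam_threshold(actual, expected, std_dev, multiple=2.0):
    """True when the actual related-phrase count exceeds the expected
    count by at least `multiple` standard deviations."""
    return (actual - expected) >= multiple * std_dev


def is_spam_document(phrase_stats, multiple=2.0):
    """phrase_stats maps a phrase to (actual, expected, std_dev).
    The claim says "for at least one phrase", so one hit is enough."""
    return any(
        exceeds_spam_threshold(actual, expected, std_dev, multiple)
        for actual, expected, std_dev in phrase_stats.values()
    )


# Made-up numbers: the first phrase is slightly over its expected count,
# the second is four standard deviations over.
doc = {
    "phrase a": (12, 10, 2),
    "phrase b": (30, 10, 5),
}
print(is_spam_document(doc))
```

With those made-up numbers, "phrase a" on its own wouldn't trip anything, which is the point: being a bit over the expected count isn't enough, you have to be well outside the normal spread.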
I think Zivush gives some good advice about reading the whole thing, and I'd add that reading all of the phrase-based scoring patents can be enlightening.