For instance, I'm writing very targeted emails to law firms that might be interested in a specialized legal search engine. Several times the phrase "search engine" has triggered a spam filter. Who knows how many other messages got blocked that I never heard about?
Does anyone know of any other common words or phrases that might cause problems?
Ack! I've been fortunate enough to be able to view one of the 2.5MB text files (Spam Tables) that contain the terms that are part of our spam filters. It would be close to impossible to determine what words would trip a filter (besides the more common ones).
It is not just the word itself, but also how many times that word appears. There is also quite a bit of comparison going on with the surrounding text, the email's From address, the subject, etc.
Here is a small sampling of what lies within the Spam Table used with our software...
specializing,10264,39,61
graciously,583326,62,2
bringing,616976,399,975
mbit,1174272,16,0
mbps,1633464,32,10
expiration,3319151,247,143
exports,3327519,90,8
windowsnt,3735177,46,0
matching,3743773,464,869
Not something I'd want to try and decipher! ;)
[edited by: rogerd at 9:11 pm (utc) on Mar. 2, 2004]
[edit reason] one spam word removed [/edit]
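To make the "how many times that word appears" point concrete, here is a minimal sketch of count-based scoring. The terms, weights, and threshold are all hypothetical and are not derived from the Spam Table sample above, whose column meanings aren't documented in this thread. Real filters weigh far more signals than this.

```python
import re

# Hypothetical per-term weights; NOT taken from the Spam Table sample.
TERM_WEIGHTS = {"free": 1.5, "guarantee": 1.0, "search engine": 0.5}
THRESHOLD = 3.0  # assumed cutoff, purely for illustration


def spam_score(text: str) -> float:
    """Sum weight * occurrence count for every listed term."""
    lowered = text.lower()
    score = 0.0
    for term, weight in TERM_WEIGHTS.items():
        # Count whole-word (or whole-phrase) occurrences of the term.
        count = len(re.findall(rf"\b{re.escape(term)}\b", lowered))
        score += weight * count
    return score


def is_spam(text: str) -> bool:
    """Flag the message once the accumulated score reaches the cutoff."""
    return spam_score(text) >= THRESHOLD
```

With these made-up numbers, one mention of "search engine" and one "free" stays under the cutoff, while repeating "free" three times pushes a message over it. A single word is rarely the whole story.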
wilsonweb.com/wmt8/spamfilter_phrases.htm
spam.surferbeware.com/spam-spam-filter.htm
Good spam filtering evolves rapidly, of course, to catch the latest wave of spam at any point in time. Just about any adult words can trip filters, too.
Sometimes you can get tripped up in surprising ways. I found some client orders were being flagged as porn by my filter - I couldn't see why, at first, but then I noticed the orders were for merchandise related to the University of South Carolina [uscsports.ocsn.com]. ;) Even word stems and fragments may trigger some filters.
Or, excessive spaces in the subject.
On my own email, my spam filters reject messages with my first name in the subject. The only messages I've seen with my name in the subject have all been spam; I've never seen a legitimate one. So I wrote the rule to reject those messages.
If a message has "ad" or "adv" in the subject, boom, gone.
Messages sent by mailing programs that do not comply with Internet standards get rejected.
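The subject-line rules mentioned in the last few posts (excessive spaces, "ad" or "adv", the recipient's first name) could be sketched roughly like this. The first name, the three-space threshold, and the choice to match "ad"/"adv" only as whole words are assumptions for illustration, not how any particular filter actually does it.

```python
import re

FIRST_NAME = "pat"  # hypothetical recipient first name
EXCESSIVE_SPACES = re.compile(r" {3,}")  # assumed: 3+ spaces in a row
AD_TAG = re.compile(r"\b(ad|adv)\b", re.IGNORECASE)  # whole words only
NAME = re.compile(rf"\b{re.escape(FIRST_NAME)}\b", re.IGNORECASE)


def subject_reject_reason(subject: str):
    """Return the name of the first rule the subject trips, or None."""
    if EXCESSIVE_SPACES.search(subject):
        return "excessive spaces"
    if AD_TAG.search(subject):
        return "ad/adv tag"
    if NAME.search(subject):
        return "first name in subject"
    return None
```

Matching on whole words keeps "address" or "patent" from tripping the rules. A filter that matched raw substrings instead would reject both.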
Also, don't forget that some spam filters will ignore non-alphanumeric characters, and might find a pattern match in something that in and of itself would not be a trigger. For example, if a filter is set to reject "WAREZ", it might be triggered by "Your child will love our animal software. Zebras, elephants and more can be found in our interactive safari." where the match would be "...animal softWARE. Zebras..."
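Here's a small sketch of how that "WAREZ" false positive can happen, assuming a filter that lowercases the message and strips every non-alphanumeric character before searching. The function names are made up for the example. A word-by-word comparison is shown alongside for contrast.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def naive_match(text: str, banned: str) -> bool:
    """Substring search on the squashed text (spaces and punctuation gone)."""
    return banned.lower() in normalize(text)


def word_match(text: str, banned: str) -> bool:
    """Compare against whole words only."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return banned.lower() in words


message = (
    "Your child will love our animal software. "
    "Zebras, elephants and more can be found in our interactive safari."
)

print(naive_match(message, "WAREZ"))  # squashed text contains "softwarezebras"
print(word_match(message, "WAREZ"))   # no single word equals "warez"
```

The squashed version catches spammers who write "W.A.R.E.Z", which is presumably why filters do it, but it also glues innocent neighbors together. When writing copy, it's worth checking what your words spell across a sentence break.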