Forum Moderators: skibum

A Deceptively Simple SPAM Filter

Here's an insight on how one large company solves the spam problem

         

cyril kearney

3:06 pm on Jul 10, 2002 (gmt 0)

10+ Year Member



I was speaking with the email administrator at a very large company about spam. She gave me some ballpark numbers about the growth of spam at her site. Sexually related spam has risen about 200% in the last 12 months. This includes pornography, Viagra and enlargement ads. Other spam has grown about 100%. Spam accounts for about 20% of all email. The overall numbers count both internal email and mail sent over the Internet.

Her observation was that the overwhelming majority of spam messages came from non-US URLs.

The highest level of filtering was based on the URL extension. They were currently not filtering the US extensions at all, while the foreign extensions were subjected to intense filtering. (Note: virus checking was being done on everything. She was talking about content filtering.)

This was not because of a non-US bias on the part of this company but an experience-driven reaction to the actual email being sent to her company.

Would this approach work at your site?

rcjordan

3:16 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>work

Absolutely!

Here's the code for mailwasher:
~~~~~
Delete if the 'from' field
contains RegExpr
(au¦be¦br¦ca¦ch¦cl¦cn¦cz¦de¦dk¦es¦fi¦fr¦gr¦hu¦jp¦is¦it¦lv¦nl¦pl¦mx¦no¦pt¦ro¦ru¦se¦sk¦su¦ua¦uk¦yu¦za)(¦.¦..¦[^.]..)$
~~~~~
(note "¦" is the pipe character)
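For anyone who wants to try the rule outside Mailwasher, here is a minimal Python sketch of it. It assumes Mailwasher's RegExpr behaves like Python's `re.search`, and it types the real pipe character in place of "¦". The names `COUNTRY_RULE` and `flag_for_delete` are made up for illustration, and the case-insensitive flag is an addition that the original rule does not state.

```python
import re

# The rule from the post, typed with real pipe characters.
# Assumption: Mailwasher's RegExpr behaves like Python's re.search.
COUNTRY_RULE = re.compile(
    r"(au|be|br|ca|ch|cl|cn|cz|de|dk|es|fi|fr|gr|hu|jp|is|it|lv|nl|pl|mx|no|pt"
    r"|ro|ru|se|sk|su|ua|uk|yu|za)"
    # Allow zero to three trailing characters (a closing ">" for example)
    # between the country code and the end of the field.
    r"(|.|..|[^.]..)$",
    re.IGNORECASE,  # assumption: domain case should not matter
)


def flag_for_delete(from_field: str) -> bool:
    """Return True when the 'from' field ends in one of the listed codes."""
    return COUNTRY_RULE.search(from_field) is not None


if __name__ == "__main__":
    for sender in (
        "someone@example.jp",
        "Some One <someone@example.co.uk>",
        "someone@example.com",
    ):
        print(sender, flag_for_delete(sender))
```

Note that the pattern does not require a dot in front of the country code, so it is looser than it looks: an address ending in `.museum` is flagged too, because "se" followed by "um" satisfies the rule.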

Sinner_G

3:20 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>(au¦be¦br¦ca¦ch¦cl¦cn¦cz¦de¦dk¦es¦fi¦fr¦gr¦hu¦jp¦is¦it¦lv¦nl¦pl¦mx¦no¦pt¦ro¦ru¦se¦sk¦su¦ua¦uk¦yu¦za)(¦.¦..¦[^.]..)$

Does this automatically delete all emails from these countries?

chiyo

3:36 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Curiously, seeing as our sites are all Asian content, the great majority of our spam comes from .com domains and the US. Even though some is relayed through East Asian and European servers, US servers are still our major source of spam by a long way.

I don't know why this is. It seems to be a different experience from many others'. Most of our spam, I think, comes from email addresses published on our sites over the years. Maybe we are getting a different kind of spam.

We find that using the DNS ban lists, filtering for non-Latin characters, and filtering for certain words in subjects are far more useful.

rcjordan

4:17 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Does this automatically delete all emails from these countries?

Mailwasher doesn't have an auto-delete function (I asked, but they felt it was too dangerous for the average user), but it does mark the "Delete" checkbox. All you have to do is click "Process" and it nukes them.

>filtering for non Latin characters
~~~~~
Delete if the 'Subject' field
contains RegExpr
¿¦À¦Á¦Ã¦Å¦Æ¦È¦É¦Ê¦Ë¦Ì¦Í¦Î¦Ï¦Ð¦Ñ¦Ò¦Ó¦Ô¦Õ¦Ö¦Ù¦Ú¦
~~~~~
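A rough Python equivalent of this subject rule, again assuming the RegExpr behaves like Python's `re.search`. The names `HIGH_BIT_CHARS` and `subject_is_flagged` are hypothetical. Only the characters visible in the rule above are used; the trailing "¦" is deliberately left out, because a pattern that ends in an empty alternative matches every subject.

```python
import re

# Characters copied from the rule above. The trailing "¦" is left out on
# purpose: an empty alternative at the end would match every subject line.
HIGH_BIT_CHARS = "¿ÀÁÃÅÆÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚ"

# A character class is equivalent to joining the characters with "|".
SUBJECT_RULE = re.compile("[" + HIGH_BIT_CHARS + "]")


def subject_is_flagged(subject: str) -> bool:
    """Return True when the subject contains any of the listed characters."""
    return SUBJECT_RULE.search(subject) is not None


if __name__ == "__main__":
    print(subject_is_flagged("ÀÁ ¿Æ special offer"))
    print(subject_is_flagged("Meeting moved to 3 pm"))
```

The match is case-sensitive, so a lowercase "é" in a legitimate subject is not caught by this list, while an uppercase "É" is.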

gsx

4:33 pm on Jul 10, 2002 (gmt 0)

10+ Year Member



Surely most spam must come from hotmail.com and yahoo.com free email services?

rogerd

4:50 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Mailwasher actually does have an auto-delete function, sort of. You can auto-blacklist e-mail addresses that activate a filter, and in the filter definition you can click the "don't show" box. Thus, if a friend sends you a message with "viagra" or "mortgage" in the title (a couple of my filtered words), he/she could be automatically added to your blacklist and you would never see his/her mail again. (Serves 'em right... :))

I agree with GSX. Although we do get spam from foreign domains, the bulk of it seems to come from free .com domains. Usually it comes from a unique alphanumeric user name to prevent banning by e-mail address.

RC, I guess you don't get much useful mail from overseas?

rcjordan

4:55 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>most

Not for me. I'm averaging about 100 spams per day (SPD ??) now. Yahoo.com probably is the single worst offender, but I'd estimate that it is less than 4% of the total. There is a second group of email account hosts (juno, excite, hotmail) that probably make up another 15% altogether. 20 or 30% is coming from the countries listed above.

>RC, I guess you don't get much useful mail from overseas?

I run 'positive' filters too. Those who need to get through usually do.

JamesR

5:10 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



rc, are you making a separate filter for each character or can you input that mass of non-latin characters or foreign domains all at once?

rogerd

7:52 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I have a question about the syntax, too, RC. As a test, I created a filter using the long string in your post, but it failed to flag a spam from a .jp address. Is there anything tricky I need to do? I did the From Field - Contains Reg Expr - and then the long string in the box. Thanks.

rcjordan

8:54 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>can you input that mass of non-latin characters or foreign domains all at once?

Yes, all at once. It only takes one Rule line. I'm not much at RegExpr, but the pipe character is "or" logic.

>syntax

The description of what you set up sounds correct. If you cut & paste from wmw, be sure that you fix the pipe character. Also, I notice the above long line of code may have wrapped. Did you get all of it?

Finally, in my setup, I have the "Spam Country" filter very high in the process order. Some spammers are putting personalized code (your domain, email address, or whois name) in the subject... this is a way to trigger your friends list.

That's about all I can figure that might be wrong.
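The point about process order can be sketched in Python. This is only an illustration under a stated assumption: that the first rule to match decides what happens to a message, which the thread does not confirm for Mailwasher. The `Rule` type, the action labels, the shortened country list, and the `example.org` domain are all hypothetical.

```python
import re
from typing import Callable, NamedTuple, Optional, Sequence


class Rule(NamedTuple):
    name: str
    action: str                      # hypothetical labels: "delete" or "keep"
    matches: Callable[[dict], bool]


def first_verdict(message: dict, rules: Sequence[Rule]) -> Optional[str]:
    """Return the action of the first matching rule (assumed semantics)."""
    for rule in rules:
        if rule.matches(message):
            return rule.action
    return None


MY_DOMAIN = "example.org"  # hypothetical stand-in for your own domain

# Shortened country list, just for the demonstration.
spam_country = Rule(
    "Spam Country", "delete",
    lambda m: re.search(r"(jp|ru|uk)(|.|..|[^.]..)$", m["from"]) is not None,
)
# A friends rule that trusts any subject mentioning your own domain.
friends = Rule(
    "Friends", "keep",
    lambda m: MY_DOMAIN in m["subject"].lower(),
)

# Spam that personalizes the subject with the recipient's domain.
spam = {"from": "promo@bulk.example.ru", "subject": "example.org: special offer"}

print(first_verdict(spam, [spam_country, friends]))  # country rule runs first
print(first_verdict(spam, [friends, spam_country]))  # friends rule runs first
```

With the country rule first the message is marked for deletion; with the friends rule first, the personalized subject lets it through.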

rogerd

9:30 pm on Jul 10, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Thanks, RC, it was the pipe character... looked OK on the screen, but it wasn't. :)
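The pipe problem is easy to reproduce. The sketch below assumes the character the forum displays is the broken bar (U+00A6) rather than the vertical bar (U+007C), and that the regex engine treats the broken bar as an ordinary literal, as Python's `re` does. `fix_pipes` is a hypothetical helper, and the pattern is a shortened version of the rule.

```python
import re

BROKEN_BAR = "\u00a6"  # "¦", the character shown in the forum post
PIPE = "\u007c"        # "|", the regex "or" operator


def fix_pipes(pattern: str) -> str:
    """Replace every broken bar in a pasted pattern with a real pipe."""
    return pattern.replace(BROKEN_BAR, PIPE)


# Shortened version of the country rule, exactly as it would be pasted.
pasted = "(jp¦uk)(¦.¦..¦[^.]..)$"
sender = "someone@example.jp"

# With broken bars the whole group is one long literal, so nothing matches.
print(re.search(pasted, sender) is not None)             # False
# After the fix, "jp" at the end of the address matches.
print(re.search(fix_pipes(pasted), sender) is not None)  # True
```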