Reading text files to send email messages using Perl

I am new to Perl, so be kind

         

myname182

5:54 pm on Jan 21, 2003 (gmt 0)

10+ Year Member



Purpose: We implemented new SMTP scrubbers with spam filtering. We want to benchmark the spam blocking by flooding one of our email addresses with spam and counting how many messages make it past our scrubbers. I found SpamArchive.net, which has a large archive of spam emails that I can use for the benchmark.

Problem: The archived messages are all plain text files. So I am trying to write a Perl script that reads these text files, finds the subject, sender, and body of each one, and sends out the email. Here is what one of the files looks like. I cleaned it up a bit, but this is the basic format:

from: "FreeCard" <nobody@mail.com>
sender: XMission Spam <spammy@maission.com>
x-spam-level: **
subject: What women really want from men ...... Credit cards
x-loop-detect: 1
to: submit@spamarchive.org
x-spam-status: No, hits=2.8 required=8.0
tests=CLICK_BELOW,CTYPE_JUST_HTML,INVALID_MSGID,MSGID_HAS_NO_AT,
SPAM_PHRASE_05_08,WEB_BUGS
version=2.43
date: 23 Nov 2002 11:29:24 -0000
reply-to: nobody@mail.com
message-id: MID-20736-1245379
content-type: text/html; charset="us-ascii"
mime-version: 1.0

<HTML>

<HEAD>

<TITLE>10-23</TITLE>

</HEAD>

<div align="center">
<table border="0" width="450">
<tr>
<td><table width=100%>
<tr>
<td>
</td>
</tr>
<tr>
<td>
<img border="0" src="http://content.net/2.gif" width="463" height="77">
</td>
</tr>
<tr>
<td>
</td>
</tr>
</table></td>
</tr>
</table>
</div>
************MORE HTML AFTER HERE, BUT I REMOVED IT*****

I noticed that in all the text files, the body of the email follows the "mime-version: 1.0" line. All I need help with is how to extract only the body, only the email address from the "from: "FreeCard" <nobody@mail.com>" line, and only the subject from the "subject: What women really want from men ...... Credit cards" line. I know this needs to be done with regular expressions, but I don't know how.

Thanks for the help.

BTW: Sorry about the length.

jatar_k

10:21 pm on Jan 22, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Welcome to WebmasterWorld myname182,

I am not the best person to answer Perl questions, but this little bump should help folks see your post. :)

andreasfriedrich

10:47 pm on Jan 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Let me welcome you to WebmasterWorld as well, myname182. Jatar_k was right that bumping your post would help (his post count) ;).

To get you started around here, I suggest you read Marcia's WebmasterWorld Welcome and Guide to the Basics post.

This little script will do what you want. Of course, you would need to change it so that it does more than just print the values back out. To run the script, call it like this:

./mp.pl < spam_emails.txt

Andreas

andreasfriedrich

10:48 pm on Jan 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



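#!/usr/bin/perl -w
#
use strict;
#
# Read the message from STDIN (or from files named on the command line).
while (<>) {
    # The range is true from the "from:" header line through the first blank line.
    if (/^from:/i .. /^$/) {
        # Inside the header block, print only the from and subject headers.
        next unless /^(from|subject):\s*(.*)$/i;
        print "$1: $2\n";
    } else {
        # Every line outside the header block is treated as body text.
        print "body: $_";
    }
}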
Andreas
Note: The WebmasterWorld posting software deletes spaces preceding the exclamation point "!" character. It also replaces a solid vertical pipe symbol with a broken vertical pipe "¦" symbol. Both of these changes will need to be undone in any code you copy from WebmasterWorld. Make sure to include a space preceding the "!" in mod_rewrite code, and always replace "¦" with a solid vertical pipe.
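
As a rough sketch of the kind of change mentioned above, the version below collects the sender address, subject, and body into a hash instead of printing each line, and shows one way the result might be handed to Net::SMTP. The helper names parse_message and send_message, the sample text, and the $host and $rcpt parameters are illustrative assumptions, not part of the script posted in this thread. It assumes one message per input string and treats everything after the first blank line as the body.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Net::SMTP;    # ships with Perl (libnet)

# Hypothetical helper: split one message into sender address, subject, body.
sub parse_message {
    my ($text) = @_;

    # Headers end at the first blank line; the rest is the body.
    my ($head, $body) = split /\r?\n\r?\n/, $text, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;

    my %msg = (from => '', subject => '', body => $body);

    for my $line (split /\r?\n/, $head) {
        if ($line =~ /^from:\s*(.*)$/i) {
            my $value = $1;
            # Prefer the address inside <...>; otherwise keep the whole value.
            if ($value =~ /<([^<>\s]+)>/) {
                $msg{from} = $1;
            } else {
                $msg{from} = $value;
            }
        } elsif ($line =~ /^subject:\s*(.*)$/i) {
            $msg{subject} = $1;
        }
    }
    return \%msg;
}

# Hypothetical helper: relay a parsed message through an SMTP host.
# Only From, To, and Subject are written; all other headers are dropped.
sub send_message {
    my ($msg, $host, $rcpt) = @_;
    my $smtp = Net::SMTP->new($host, Timeout => 30)
        or die "Cannot connect to $host\n";
    $smtp->mail($msg->{from});
    $smtp->to($rcpt);
    $smtp->data();
    $smtp->datasend("From: $msg->{from}\n");
    $smtp->datasend("To: $rcpt\n");
    $smtp->datasend("Subject: $msg->{subject}\n");
    $smtp->datasend("\n");
    $smtp->datasend($msg->{body});
    $smtp->dataend();
    $smtp->quit;
}

# Small demonstration on a shortened copy of the sample from the first post.
my $sample = join "\n",
    'from: "FreeCard" <nobody@mail.com>',
    'subject: What women really want from men ...... Credit cards',
    'mime-version: 1.0',
    '',
    '<HTML>',
    '',
    '<TITLE>10-23</TITLE>',
    '';

my $msg = parse_message($sample);
print "from:    $msg->{from}\n";
print "subject: $msg->{subject}\n";
print "body:\n$msg->{body}";
```

send_message is defined but never called here, because it needs a reachable SMTP host. Splitting on the first blank line, rather than looking for "mime-version: 1.0", keeps the sketch working when the headers appear in a different order.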

andreasfriedrich

4:08 pm on Jan 26, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Did that work out for you, myname182?

myname182

6:54 pm on Jan 27, 2003 (gmt 0)

10+ Year Member



YES! THANK YOU! Everyone is so nice here!