
Forum Moderators: coopster & jatar k & phranque


Can NMS FormMail be modified to deny submissions with HTML in Comments

I want to refuse submissions from form spammers using links in the textarea

     
7:40 pm on May 16, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts: 319
votes: 0


Hi guys. I use NMS FormMail 3.14c, renamed and aliased in .htaccess, to receive visitor input. The feedback form on the contact page is validated via JavaScript. The validation works fine when you use my actual form in a JavaScript-enabled browser and try to type something I filter out (HTML and BB tags). However, I have been getting some submissions that contain HTML and/or bulletin-board-style URL tags, with links to counterfeit drugs and other spam destinations. This tells me that the spammer is using a non-JavaScript or JavaScript-disabled browser to perform the submission, nullifying my efforts at stripping out links from the textarea (Comments).

Does anybody on this (Perl) forum have experience with modifying NMS FormMail to include a section that detects unwanted tags and refuses to process the submission until they are removed by the submitter? As it is now they get a success page when bypassing my JavaScript controls and I have to manually delete the email containing the spam submission.

Here are examples of the type of textarea comments I want to filter out, or deny submissions to:

<a href="http://www.example.com/">Spam Description</a>
[url=http://www.example.com/]Spam Description[/url]
[url="http://www.example.com/"]Spam Description[/url]
[link=http://www.example.com/]Spam Description[/link]

Can these items be detected in FormMail and the submission refused, with a message stating we don't allow links in the comments field?

[edited by: Wizcrafts at 7:41 pm (utc) on May 16, 2008]

[edited by: phranque at 6:45 am (utc) on June 7, 2008]
[edit reason] fix formatting [/edit]

9:02 am on May 18, 2008 (gmt 0)

Junior Member

10+ Year Member

joined:May 8, 2008
posts: 74
votes: 0

Simple one:
if ($text=~/<a href=¦\[url=¦\[link=/) {
print "please do not post HTML or bb-type links in 'Message' field.
exit;
}

(Of course, replace $text with correct variable)

4:05 pm on May 18, 2008 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 2, 2005
posts:70
votes: 0


I'd add .info , .ru and .cn to that regex.
.info is a cheap domain, .ru is Russia and .cn is China.

A modified version of chorny's regex below.

if ($text=~/<a href=¦\[url=¦\[link=¦.info¦.cn¦.ru/) {
print "please do not post HTML or bb-type links in 'Message' field.
exit;
}

4:41 pm on May 18, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


Ok, you guys have provided me with the means to block unwanted content, but do you know where in the 81 kb NMS FormMail.pl file I would insert such codes? I am not a Perl programmer in the slightest stretch of the imagination.

I will hazard a guess. Would it go into the "required" fields section, where "realname" and "email" are checked, along with user defined required fields?

1:25 pm on June 2, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 9, 2003
posts:503
votes: 0


I would love an answer to this. I've been having the same problems and I've tried putting both everywhere I could think of but I always end up with a server 500 error.

:(

2:42 pm on June 2, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


Simonuk;
I also got 500 errors when I tried adding the HTML stripping code mentioned above to NMS FormMail 3.14c. I have given up for the time being, until I can find a Perl guru to figure out where in the 81000 bytes of code one would insert validation routines to strip out HTML and particular drug words from form submissions. I don't know much at all about Perl coding; I am just an end user trying to block feedback form spammers from submitting junk links to me and my clientele.

4:32 pm on June 3, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member rocknbil is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 28, 2004
posts:7999
votes: 0


Three things to look at:

1. The sample provided has an ERROR, it is missing an ending quote and semicolon, which will generate a 500:

if ($text=~/<a href=¦\[url=¦\[link=¦.info¦.cn¦.ru/) {
print "please do not post HTML or bb-type links in 'Message' field.";
exit 0;
}

2. The software on this message board munges the logical OR. It is supposed to be a solid vertical pipe (|), the character you get when you hold the Shift key and press the backslash key. The broken bar shown in these samples (¦) is a different character: inside a regex it is matched literally instead of separating the alternatives, and used as an operator it is a syntax error, which gives you a 500.

3. You always, always ALWAYS need to print a content-type header BEFORE printing anything else:

print "content-type: text/html\n\n";

printing anything at all before a content type header is generated will always 500.

Note TWO newlines. If this filter comes before the content-type, then simply add it to the "if." Heck, go ahead and do that anyway, makes it portable and will only be output if the regexp is found:

if ($text=~/<a href=¦\[url=¦\[link=¦.info¦.cn¦.ru/) {
print "content-type: text/html\n\n";
print "please do not post HTML or bb-type links in \&quot\;Message\&quot\; field.";
exit 0;
}
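For reference, here is the same check as a minimal, self-contained sketch typed with real pipe characters. The sub name has_link_markup is made up for this example. Two things differ from the snippet above and are assumptions on my part: the dots in the TLD alternatives are escaped, on the guess that a literal dot was intended (an unescaped .info also matches "disinformation"), and the match is case-insensitive.

```perl
use strict;
use warnings;

# Returns 1 when $text contains an HTML anchor, a BB-code link tag,
# or one of the three TLDs suggested earlier in the thread; 0 otherwise.
sub has_link_markup {
    my ($text) = @_;
    return 0 unless defined $text;
    return $text =~ /<a href=|\[url=|\[link=|\.info|\.cn|\.ru/i ? 1 : 0;
}

# In a CGI script the rejection would then look like this:
#   if (has_link_markup($text)) {
#       print "content-type: text/html\n\n";
#       print "Please do not post HTML or BB-type links in the Message field.";
#       exit 0;
#   }
```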

Another caveat about this approach: I have seen encoded attempts in my logs, that is, instead of [, %5b:

%5b%3dhttp://spamboy.com....
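One way to catch the encoded form with the same patterns is to percent-decode a copy of the value before testing it. A minimal sketch; percent_decode is a hypothetical helper written for this post, not something that ships with FormMail, and the sample string is invented:

```perl
use strict;
use warnings;

# Decode %XX escapes in a copy of the value and return the result.
sub percent_decode {
    my ($value) = @_;
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $value;
}

# Invented sample: a BB-code link with its brackets and equals sign encoded.
my $raw     = '%5Burl%3Dhttp://www.example.com/%5DSpam%5B/url%5D';
my $decoded = percent_decode($raw);
print "blocked\n" if $decoded =~ /\[\s*url/i;
```

Testing both the raw and the decoded copy costs little and covers input that arrives either way.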

Furthermore, this only clears one field ($text.) Spamming by command-line or using a 'bot can put this into ANY field. You need to do it to all input fields accepted by your program.

Here is the routine I use and it will trap ALL input variables containing nasty crap. In addition, there is NO REASON for any form to have the following in it:

to: <- as in "to:spam_target@spam-me.com"

cc/bcc: This is super-plus-bad mojo, if they can stick a bcc header in your mail, it will send out thousands of emails and you get . . . one. I know - "my forms have no BCC" - but they send an octal newline in the subject field and poof - create their own BCC.

content-type: this is an attempt to create a multi-part MIME mail using your form processor. So basically, it lets your original message fly but adds a second mail to it that does the dirty work.
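The bcc and content-type tricks both depend on getting a line break into a value that ends up in a mail header. Here is a sketch of refusing such values outright. The field names email, realname and subject are assumptions (adjust them to your form), and header_injection_field is a name invented for this example:

```perl
use strict;
use warnings;

# Fields whose values are copied into mail headers (assumed names).
my @header_fields = qw(email realname subject);

# Returns the name of the first header-bound field containing a raw or
# percent-encoded CR/LF, or undef when all of them are clean.
sub header_injection_field {
    my ($data) = @_;
    for my $name (@header_fields) {
        my $value = $data->{$name};
        next unless defined $value;
        return $name if $value =~ /[\r\n]|%0[ad]/i;
    }
    return;
}
```

A legitimate subject line or email address never needs a line break, so this check should not turn away real visitors.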

I use it a little differently, doing logging and what not, but it will work for this. This assumes all input is in the hash %data, change %data to the variable you use for input:


# Refuse the submission if ANY input field contains header or link markup.
foreach $v (keys %data) {
    if (
        ($data{$v} =~ /b*cc\s*:/i) or
        ($data{$v} =~ /to\s*:/i) or
        ($data{$v} =~ /content\-type/i) or
        ($data{$v} =~ /\[\s*URL.*\]*/i) or
        ($data{$v} =~ /\[\s*LINK.*\]*/i) or
        ($data{$v} =~ /\%5B\s*URL.*(\%5D)*/i) or
        ($data{$v} =~ /\%5B\s*LINK.*(\%5D)*/i) or
        ($data{$v} =~ /\<\s*a\s*href.*\>*/i) or
        ($data{$v} =~ /\%3C\s*a\s*href.*(\%3E)*/i)
    ) {
        # The header must be printed before any other output.
        print "content-type: text/html\n\n";
        print "No email for you. Action logged.";
        exit 0;
    }
}

Note also I've craftily used "or" instead of the logical || (which this board displays as ¦¦). They behave the same here, though "or" has lower precedence.

This code is not tested AS PRESENTED HERE, if it errors on you, post back. I'll have a look. :-) It's a copy/paste from one of my proggies and should work. Also, because it generates its own content-type, you should be able to put it anywhere in the program after the read/parse. You will recognize read/parse by this:

%data = &readParse;

Where "readParse" is the name of the subroutine that un-encodes the input and stores it in %data.
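The same idea can also be written as a list of patterns, which keeps the parentheses out of a long "or" chain. This is a sketch only: %data is the input hash described above, first_bad_field is a name invented here, and the patterns are shortened versions of the ones in the routine (the trailing .* parts are dropped because they do not change whether a value matches).

```perl
use strict;
use warnings;

# Blocked patterns, one per line, so adding or removing one is a one-line change.
my @blocked = (
    qr/b*cc\s*:/i,
    qr/to\s*:/i,
    qr/content-type/i,
    qr/\[\s*(?:url|link)/i,
    qr/%5B\s*(?:url|link)/i,
    qr/<\s*a\s*href/i,
    qr/%3C\s*a\s*href/i,
);

# Returns the name of the first field matching any blocked pattern, or undef.
sub first_bad_field {
    my ($data) = @_;
    for my $name (sort keys %$data) {
        my $value = defined $data->{$name} ? $data->{$name} : '';
        for my $re (@blocked) {
            return $name if $value =~ $re;
        }
    }
    return;
}

# Invented sample input standing in for the parsed form.
my %data = (realname => 'Test', comments => '[url=http://www.example.com/]Spam[/url]');
my $bad  = first_bad_field(\%data);
if (defined $bad) {
    print "content-type: text/html\n\n";
    print "No email for you.";
}
```

Be aware that the to\s*: pattern also matches ordinary text such as "photo:", so expect an occasional false positive in free-text fields.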

I'd add .info , .ru and .cn to that regex.

IMO this is basically useless - anything they submit is likely to be fake anyway, and you may wind up disallowing a legitimate contact.

ASIDE: this goes against one of my basic self-imposed precepts - instead of ONLY ALLOWING what you expect, it attempts to DISALLOW bad data - which is a never-ending tail chase trying to stay ahead of hackers. In the case of textual input, I don't see another way - but the list is fortified by years of logging ALL DATA input by forms. If anyone has better ideas, I'm all ears.
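For fields with a predictable shape, the allow-only approach from the aside can be sketched like this. The field names and the permitted character sets are assumptions chosen for illustration, not rules taken from FormMail, and fields_allowed is an invented name:

```perl
use strict;
use warnings;

# Allow-list: each field must match its pattern in full (assumed field names).
my %allowed = (
    realname => qr/\A[\w .'-]{1,80}\z/,
    email    => qr/\A[\w.+-]+\@[\w-]+(?:\.[\w-]+)+\z/,
);

# Returns 1 only if every listed field is present and matches its pattern.
sub fields_allowed {
    my ($data) = @_;
    for my $name (keys %allowed) {
        my $value = $data->{$name};
        return 0 unless defined $value && $value =~ $allowed{$name};
    }
    return 1;
}
```

This only helps for short, structured fields; a free-text comments box still needs the disallow list.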

5:27 pm on June 3, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


Rocknbil;
Thanks for your impressive input! I am now reading through 3300+ lines of code to try to find the place that corresponds to what you refer to as read/parse, so I can drop in your subroutine. I suspect that the place I am looking for will be in the lower third of the NMS FormMail script.

I'll report back here after I (1) find where to insert the subroutine, and (2) test it.

I already know about how this forum substitutes a broken pipe for a solid one, from posting on the Apache Server and Search Engine Identification forums. Thanks for pointing out the missing " and correcting the rest of the sample codes.

7:10 pm on June 3, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


Rocknbil;
Is this what you are referring to as my point of entry (line 2256)?


=item parse_form ()

Parses the HTML form, storing the results in various fields in the
C<FormMail> object, as follows:

=over

=item C<FormConfig>

A hash holding the values of the configuration inputs, such as
C<recipient> and C<subject>.

=item C<Form>

A hash holding the values of inputs other than configuration inputs.

(snip)

=cut

sub parse_form {
my ($self) = @_;

$self->{FormConfig} = { map {$_=>''} $self->configuration_form_fields };
$self->{Field_Order} = [];
$self->{Form} = {};

I've read the WebmasterWorld TOS and I think I am permitted to include the URL to the open source NMS FormMail [nms-cgi.sourceforge.net] script, in case anybody wants to examine the script for a point of entry and validation edits.

[edited by: Wizcrafts at 7:30 pm (utc) on June 3, 2008]

4:40 pm on June 4, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member rocknbil is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 28, 2004
posts:7999
votes: 0


Yes, it appears parse_form () is the sub that does the parsing.

Okay I've had a look. I find this script a bit of an overkill for the task (it's 2004,) but this should work. UNTESTED!

Take the WHOLE sub parse_form and COPY it. Then rename the original:

sub parse_form_old {
.....
}

Preserving the original in case you mess something up.

Now paste the copied sub in, it can go anywhere as long as it's OUTSIDE any other sub. In the above example,

sub parse_form_old {
.....
}

sub parse_form {
.....
}

Your new sub should look like this. A copy/paste should work:


sub parse_form {
my ($self) = @_;
$self->{FormConfig} = { map {$_=>''} $self->configuration_form_fields };
$self->{Field_Order} = [];
$self->{Form} = {};
foreach my $p ($self->cgi_object->param()) {
## added code
if (
($self->{FormConfig}{$p} =~ /b*cc\s*:/i) or
($self->{FormConfig}{$p} =~ /to\s*:/i) or
($self->{FormConfig}{$p} =~ /content\-type/i) or
($self->{FormConfig}{$p} =~ /\[\s*URL.*\]*/i) or
($self->{FormConfig}{$p} =~ /\[\s*LINK.*\]*/i) or
($self->{FormConfig}{$p} =~ /\%5B\s*URL.*(\%5D)*/i) or
($self->{FormConfig}{$p} =~ /\%5B\s*LINK.*(\%5D)*/i) or
($self->{FormConfig}{$p} =~ /\<\s*a\s*href.*\>*/i)) or
($self->{FormConfig}{$p} =~ /\%3C\s*a\s*href.*(\%3E)*/i)
) {
print "content-type: text/html\n\n";
print "No email for you. Action logged.";
last;
exit 0;
}
## end added code
if (exists $self->{FormConfig}{$p}) {
$self->parse_config_form_input($p);
}
else {
$self->parse_nonconfig_form_input($p);
}
}
$self->substitute_forced_config_values;
$self->expand_list_config_items;
$self->sort_field_order;
$self->remove_blank_fields;
}

The last; should be taken out. Last breaks out of the foreach loop as soon as the condition is encountered, so the exit 0; that follows it never runs and the rest of parse_form carries on as usual. It's just . . a habit . . .

While this should work, in a brief look-over of the script a more graceful method would be to use one of the many print methods throughout this program instead of my print-and-exit. For example, there's no logging action here even though the message says so. :-)

Although it's pretty thorough, I can't figure out why they don't have methods of just adding disallowed strings or characters to the config, they've done just about everything else. Maybe it does, just don't have time to find it.

Again, I haven't tested this - make back up copies and see if it flies for you.

5:57 pm on June 4, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


Rocknbil;
Thank you so much for going to all this trouble on my behalf! I have to go out for a while, but I will apply your codes later today and report back on my findings.

As regards the overkill in the code, you can thank the London Perl Mongers, headed by Dave Cross, who is an active member of the Perl community and the founder of the London Perl Mongers. They went out of their way to make FormMail much more secure than Matt Wright's versions.

Later.

3:04 am on June 5, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


Rocknbil;
Right now your additions are causing server 500 errors. I am troubleshooting line by line and will report what caused the errors when I find the culprit.

4:41 am on June 5, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


I found the line of code that was causing the server 500 error. It was this:

($self->{FormConfig}{$p} =~ /\<\s*a\s*href.*\>*/i)) or


Notice that there is only one opening parenthesis, but two closing parentheses? The second one is unmatched and removing it fixed the 500 error.

Corrected code line:
($self->{FormConfig}{$p} =~ /\<\s*a\s*href.*\>*/i) or

Unfortunately, the codes failed to block a test submission loaded with html and BB code tags, so I am continuing to work on it. The test comments were submitted instead of getting refused.

:ResumeTesting

:Wiz

[edited by: Wizcrafts at 4:43 am (utc) on June 5, 2008]

4:09 pm on June 5, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member rocknbil is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 28, 2004
posts:7999
votes: 0


oops. There was a script-specific condition I eliminated. :-/

Well, I hope it works for you - That should be the right place, and it works in my scripts, but this one is a bit convoluted and I may have it wrong. You can see if the form vars are actually populating by dropping in a content-type, print, and exit, like so:


sub parse_form {
my ($self) = @_;
$self->{FormConfig} = { map {$_=>''} $self->configuration_form_fields };
$self->{Field_Order} = [];
$self->{Form} = {};
print "content-type: text/html\n\n";
foreach my $p ($self->cgi_object->param()) {
print "key: $p val: $self->{FormConfig}{$p} <br>\n";
..........
}
exit 0;

content type BEFORE the foreach, exit after, and in the loop print both the key and value.

4:02 am on June 6, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


Rocknbil;
I added your latest codes and have to report that no variables were contained in any input field, even though I filled in the form completely. I guess that means we are inserting your codes into the wrong spot, or they are being overwritten by a subroutine further down the script.

I'll keep at it until I get it to work. Then I'll report back so others can gain from this journey.
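A possible explanation, judging only from the code quoted in this thread: parse_form sets every FormConfig value to '' just before the loop, and the POD says non-configuration inputs such as Comments are stored in Form rather than FormConfig. So $self->{FormConfig}{$p} has nothing in it to test at that point. The sketch below checks the submitted value instead. It uses a tiny stand-in object (FakeCGI, invented here) so that it runs on its own; in FormMail the equivalent call would presumably be $self->cgi_object->param($p), which is an assumption based on the loop shown above and has not been tested against FormMail 3.14c.

```perl
use strict;
use warnings;

# Stand-in for the CGI object: param() lists the field names,
# param($name) returns the submitted value of one field.
package FakeCGI;

sub new {
    my ($class, %fields) = @_;
    my $self = { %fields };
    return bless $self, $class;
}

sub param {
    my ($self, $name) = @_;
    return $self->{$name} if defined $name;
    return sort keys %$self;
}

package main;

# Returns the first field whose submitted value looks like link spam, or undef.
sub first_spam_param {
    my ($cgi) = @_;
    for my $p ($cgi->param()) {
        my $value = $cgi->param($p);
        next unless defined $value;
        return $p if $value =~ /\[\s*(?:url|link)|<\s*a\s*href/i;
    }
    return;
}

# Invented sample submission.
my $cgi = FakeCGI->new(realname => 'Test', comments => '[url=http://www.example.com/]x[/url]');
my $bad = first_spam_param($cgi);
print "would refuse: $bad\n" if defined $bad;
```

The same substitution in the debugging loop (printing the value returned by param($p) rather than the FormConfig entry) should show whether the form variables are arriving at all.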

9:29 am on July 11, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 9, 2003
posts:503
votes: 0


I was so hoping this would work. I did manage to get it so there were no errors but it was still letting through all forms with http, www etc.

Never knew this would be so difficult...

2:00 pm on July 11, 2008 (gmt 0)

Full Member

10+ Year Member

joined:May 5, 2003
posts:319
votes: 0


simonuk;
I have not been able to get this validation routine to work in NMS FormMail. I guess we might have to write validation codes specific to the way the London Perl Mongers have coded their work.

In the meanwhile, here is what I have come up with as a stop-gap measure.

I created a JavaScript include file that uses a function to write a line of text containing a form input field with a checkbox and a field name. This ID has been designated as a "required field." People with JavaScript enabled will see the line of text and will check the box to agree to my brief terms before submitting, or their submission will fail. Visitors or bots without JavaScript will never see that input field and since it is required, their submission will fail.

If a JavaScript enabled spammer does type or paste in comments that include spam words, or links and URLs, the JavaScript validation routine will strip out all of the text in the comments area and other fields being validated. The form warns submitters of this both above and below the comments area.

Since implementing that simple measure I have not had one single spam submission from the contact page in question. That's probably because it isn't worth "their" while to manually examine my form page for JavaScript includes, to send spam they now realize nobody will see or post. Those with JavaScript enabled who insert spam links will see them disappear and be replaced with a sentence notifying them that the terms of submission forbid those items. This is also a deterrent.

I still hope to insert validation into the Perl Script itself. This would allow submissions from people who block JavaScript for their own security, via browser add-ons.

[edited by: phranque at 7:03 am (utc) on July 14, 2008]
[edit reason] see WebmasterWorld Mission Statement [webmasterworld.com] [/edit]