Does anybody on this (Perl) forum have experience with modifying NMS FormMail to include a section that detects unwanted tags and refuses to process the submission until they are removed by the submitter? As it is now they get a success page when bypassing my JavaScript controls and I have to manually delete the email containing the spam submission.
Here are examples of the type of textarea comments I want to filter out, or deny submissions to:
<a href="http://www.example.com/">Spam Description</a>
[url=http://www.example.com/]Spam Description[/url]
[url="http://www.example.com/"]Spam Description[/url]
[link=http://www.example.com/]Spam Description[/link]
Can these items be detected in FormMail and the submission refused, with a message stating we don't allow links in the comments field?
[edited by: Wizcrafts at 7:41 pm (utc) on May 16, 2008]
I will hazard a guess. Would it go into the "required" fields section, where "realname" and "email" are checked, along with user defined required fields?
1. The sample provided has an ERROR: it is missing an ending quote and semicolon, which will generate a 500:
if ($text =~ /<a href=|\[url=|\[link=|\.info|\.cn|\.ru/) {
    print "please do not post HTML or bb-type links in 'Message' field.";
    exit 0;
}
2. The software on this message board munges up the logical OR. It is supposed to be a full vertical pipe, the character you get when you hold the shift key and press the backslash on your keyboard: |. The broken-bar character ¦ will not work as alternation and can give you a 500 error if used as presented, so replace it with | wherever the board displays it. Note also that the dots in front of info, cn and ru need a backslash, otherwise they match any character.
3. You always, always ALWAYS need to print a content-type header BEFORE printing anything else:
print "content-type: text/html\n\n";
Printing anything at all before a content-type header has been sent will always cause a 500 error.
Note TWO newlines. If this filter comes before the content-type, then simply add it to the "if." Heck, go ahead and do that anyway, makes it portable and will only be output if the regexp is found:
if ($text =~ /<a href=|\[url=|\[link=|\.info|\.cn|\.ru/) {
    print "content-type: text/html\n\n";
    print "please do not post HTML or bb-type links in \"Message\" field.";
    exit 0;
}
Another caveat about this approach: I have seen encoded attempts in my logs, that is, instead of [, %5b:
%5b%3dhttp://spamboy.com....
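One way to cope with encoded attempts like this is to decode the value first and then run a single filter over the result. The following is only a minimal sketch, assuming the value is still percent-encoded when it reaches the filter; the helper names decode_percent and looks_like_link are made up for illustration and are not part of any FormMail script:

```perl
use strict;
use warnings;

# Hypothetical helper: turn %XX escapes back into characters.
# Assumes the value has not already been decoded by the form parser.
sub decode_percent {
    my ($value) = @_;
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $value;
}

# Hypothetical helper: true if the decoded value contains an HTML
# anchor tag or a BB-style [url / [link tag.
sub looks_like_link {
    my ($value) = @_;
    my $decoded = decode_percent($value);
    return $decoded =~ /<\s*a\s+href|\[\s*(?:url|link)\b/i ? 1 : 0;
}

for my $sample ('%5Burl%3Dhttp://www.example.com/%5DSpam%5B/url%5D',
                '[link=http://www.example.com/]Spam[/link]',
                'Just a normal comment.') {
    printf "%-50s %s\n", $sample, looks_like_link($sample) ? 'refuse' : 'accept';
}
```

Decoding before matching means the plain and the encoded form of a tag are caught by the same pattern, instead of keeping one pattern per encoding.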
Furthermore, this only checks one field ($text). Spamming from the command line or with a 'bot can put this into ANY field. You need to apply it to all input fields accepted by your program.
Here is the routine I use; it will trap ALL input variables containing nasty crap. In addition, there is NO REASON for any form input to contain the following:
to: <- as in "to:spam_target@spam-me.com"
cc/bcc: This is super-plus-bad mojo. If they can stick a bcc header in your mail, it will send out thousands of emails and you get . . . one. I know - "my forms have no BCC" - but they send an octal newline in the subject field and poof - they create their own BCC.
content-type: This is an attempt to create a multi-part MIME mail using your form processor. Basically, it lets your original message fly but adds a second mail to it that does the dirty work.
I use it a little differently, doing logging and what not, but it will work for this. This assumes all input is in the hash %data, change %data to the variable you use for input:
foreach my $v (keys %data) {
    if (
        ($data{$v} =~ /b*cc\s*:/i) or                  # cc: or bcc: header
        ($data{$v} =~ /to\s*:/i) or                    # to: header
        ($data{$v} =~ /content\-type/i) or             # MIME injection
        ($data{$v} =~ /\[\s*URL.*\]*/i) or             # BB-style [url
        ($data{$v} =~ /\[\s*LINK.*\]*/i) or            # BB-style [link
        ($data{$v} =~ /\%5B\s*URL.*(\%5D)*/i) or       # encoded [url
        ($data{$v} =~ /\%5B\s*LINK.*(\%5D)*/i) or      # encoded [link
        ($data{$v} =~ /\<\s*a\s*href.*\>*/i) or        # HTML anchor
        ($data{$v} =~ /\%3C\s*a\s*href.*(\%3E)*/i)     # encoded HTML anchor
    ) {
        print "content-type: text/html\n\n";
        print "No email for you. Action logged.";
        exit 0;
    }
}
Note also I've craftily used "or" instead of the logical ||. They do the same job here, although "or" has lower precedence.
This code is not tested AS PRESENTED HERE. If it errors on you, post back and I'll have a look. :-) It's a copy/paste from one of my proggies and should work. Also, because it generates its own content-type, you should be able to put it anywhere in the program after the read/parse. You will recognize read/parse by this:
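The newline trick mentioned under cc/bcc can also be caught directly, by refusing any header-bound value that contains a line break. This is only a sketch: the helper name has_header_break and the sample field names are assumptions, and the check is meant for fields that end up in mail headers (subject, email, name), not for a comments textarea where line breaks are legitimate:

```perl
use strict;
use warnings;

# Hypothetical helper: true if a value destined for a mail header contains
# a carriage return or line feed, either raw or percent-encoded.
sub has_header_break {
    my ($value) = @_;
    return $value =~ /[\r\n]|%0[AD]/i ? 1 : 0;
}

# Stand-in for the parsed form input (made-up sample data).
my %data = (
    subject => "Hello\nBcc: victim\@example.com",
    email   => 'visitor@example.com',
);

for my $field (sort keys %data) {
    print "$field: ", (has_header_break($data{$field}) ? 'refuse' : 'ok'), "\n";
}
```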
%data = &readParse;
Where "readParse" is the name of the subroutine that un-encodes the input and stores it in %data.
I'd add .info, .ru and .cn to that regex.
IMO this is basically useless - anything they submit is likely to be fake anyway, and you may wind up disallowing a legitimate contact.
ASIDE: this goes against one of my basic self-imposed precepts. Instead of ONLY ALLOWING what you expect, it attempts to DISALLOW bad data, which is a never-ending tail chase trying to stay ahead of hackers. In the case of textual input I don't see another way, but the list of patterns is fortified by years of logging ALL DATA input by forms. If anyone has better ideas, I'm all ears.
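For the structured fields, the "only allow what you expect" precept could look something like the sketch below. The field names realname and email come from earlier in this thread; the patterns, the %allowed table and the rejected_fields helper are illustrative assumptions, and free-text fields such as comments would still need the disallow approach:

```perl
use strict;
use warnings;

# Hypothetical allow-list: one pattern per expected field.
# The patterns are illustrative and deliberately strict.
my %allowed = (
    realname => qr/\A[\p{L} .'-]{1,60}\z/,
    email    => qr/\A[A-Za-z0-9._%+-]+\@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\z/,
);

# Returns the names of fields that are unexpected or fail their pattern.
sub rejected_fields {
    my (%input) = @_;
    my @bad = grep {
        !exists $allowed{$_} || $input{$_} !~ $allowed{$_}
    } sort keys %input;
    return @bad;
}

my @bad = rejected_fields(
    realname => 'Jane Doe',
    email    => "jane\@example.com\nBcc: x\@example.com",
);
print "Rejected: @bad\n";
```

Anything that is not on the list, or does not look the way the field is supposed to look, is refused without having to guess what the next attack string will be.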
I'll report back here after I (1) find where to insert the subroutine, and (2) test it.
I already know how this forum substitutes a broken pipe for a solid one, from posting on the Apache Server and Search Engine Identification forums. Thanks for pointing out the missing " and correcting the rest of the sample code.
=item parse_form ()

Parses the HTML form, storing the results in various fields in the
C<FormMail> object, as follows:

=over

=item C<FormConfig>

A hash holding the values of the configuration inputs, such as
C<recipient> and C<subject>.

=item C<Form>

A hash holding the values of inputs other than configuration inputs.

(snip)

=cut

sub parse_form {
    my ($self) = @_;

    $self->{FormConfig} = { map {$_=>''} $self->configuration_form_fields };
    $self->{Field_Order} = [];
    $self->{Form} = {};
I've read the WebmasterWorld TOS and I think I am permitted to include the URL to the open source NMS FormMail [nms-cgi.sourceforge.net] script, in case anybody wants to examine the script for a point of entry and validation edits.
[edited by: Wizcrafts at 7:30 pm (utc) on June 3, 2008]
Okay, I've had a look. I find this script a bit of an overkill for the task (it's 2004), but this should work. UNTESTED!
Take the WHOLE sub parse_form and COPY it. Then rename the original:
sub parse_form_old {
.....
}
Preserving the original in case you mess something up.
Now paste the copied sub in. It can go anywhere as long as it's OUTSIDE any other sub. Continuing the above example:
sub parse_form_old {
.....
}
sub parse_form {
.....
}
Your new sub should look like this. A copy/paste should work:
sub parse_form {
my ($self) = @_;
$self->{FormConfig} = { map {$_=>''} $self->configuration_form_fields };
$self->{Field_Order} = [];
$self->{Form} = {};
foreach my $p ($self->cgi_object->param()) {
## added code
if (
($self->{FormConfig}{$p} =~ /b*cc\s*:/i) or
($self->{FormConfig}{$p} =~ /to\s*:/i) or
($self->{FormConfig}{$p} =~ /content\-type/i) or
($self->{FormConfig}{$p} =~ /\[\s*URL.*\]*/i) or
($self->{FormConfig}{$p} =~ /\[\s*LINK.*\]*/i) or
($self->{FormConfig}{$p} =~ /\%5B\s*URL.*(\%5D)*/i) or
($self->{FormConfig}{$p} =~ /\%5B\s*LINK.*(\%5D)*/i) or
($self->{FormConfig}{$p} =~ /\<\s*a\s*href.*\>*/i)) or
($self->{FormConfig}{$p} =~ /\%3C\s*a\s*href.*(\%3E)*/i)
) {
print "content-type: text/html\n\n";
print "No email for you. Action logged.";
last;
exit 0;
}
## end added code
if (exists $self->{FormConfig}{$p}) {
$self->parse_config_form_input($p);
}
else {
$self->parse_nonconfig_form_input($p);
}
}
$self->substitute_forced_config_values;
$self->expand_list_config_items;
$self->sort_field_order;
$self->remove_blank_fields;
}
The last; should come out. It breaks out of the foreach loop as soon as the condition is encountered, which means the exit 0; after it is never reached and the script carries on after printing the message. It's just . . a habit . . .
While this should work, in a brief look-over of the script a more graceful method would be to use one of the many print methods throughout this program instead of my print-and-exit. For example, there's no logging action here even though the message says so. :-)
Although it's pretty thorough, I can't figure out why they don't have methods of just adding disallowed strings or characters to the config, they've done just about everything else. Maybe it does, just don't have time to find it.
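A config-style list of disallowed strings could be sketched roughly as follows. Nothing in this thread establishes that NMS FormMail has such a setting, so @disallowed_strings and first_disallowed are invented names, and the entries are just the examples discussed above:

```perl
use strict;
use warnings;

# Hypothetical configuration list (not an actual NMS FormMail setting).
my @disallowed_strings = ('<a href', '[url', '[link', '%5Burl', '%5Blink', '%3Ca');

# Build one case-insensitive pattern, escaping regex metacharacters so
# each entry is matched literally.
my $alternatives  = join '|', map { quotemeta($_) } @disallowed_strings;
my $disallowed_re = qr/(?:$alternatives)/i;

# Returns the first disallowed string found in the value, or undef if clean.
sub first_disallowed {
    my ($value) = @_;
    return $value =~ /($disallowed_re)/ ? $1 : undef;
}

my $hit = first_disallowed('See [URL=http://www.example.com/]this[/URL]');
print defined $hit ? "Refused: contains '${hit}'\n" : "Accepted\n";
```

Using quotemeta keeps characters such as [ and . from being read as regex syntax, so new entries can be added to the list as plain strings.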
Again, I haven't tested this - make back up copies and see if it flies for you.
As regards the overkill in the code, you can thank the London Perl Mongers and their founder, Dave Cross, who is an active member of the Perl community. They went out of their way to make FormMail much more secure than Matt Wright's versions.
Later.
($self->{FormConfig}{$p} =~ /\<\s*a\s*href.*\>*/i)) or
Corrected code line:
($self->{FormConfig}{$p} =~ /\<\s*a\s*href.*\>*/i) or
Unfortunately, the code failed to block a test submission loaded with HTML and BB code tags, so I am continuing to work on it. The test comments were submitted instead of being refused.
:ResumeTesting
:Wiz
[edited by: Wizcrafts at 4:43 am (utc) on June 5, 2008]
Well, I hope it works for you - That should be the right place, and it works in my scripts, but this one is a bit convoluted and I may have it wrong. You can see if the form vars are actually populating by dropping in a content-type, print, and exit, like so:
sub parse_form {
my ($self) = @_;
$self->{FormConfig} = { map {$_=>''} $self->configuration_form_fields };
$self->{Field_Order} = [];
$self->{Form} = {};
print "content-type: text/html\n\n";
foreach my $p ($self->cgi_object->param()) {
print "key: $p val: $self->{FormConfig}{$p} <br>\n";
..........
}
exit 0;
That is: content-type BEFORE the foreach, exit after, and in the loop print both the key and the value.
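One possible reason the added check never fires, judging only from the code shown in this thread: parse_form has just set every $self->{FormConfig}{$p} to an empty string, and non-configuration fields such as the comments textarea have no FormConfig entry at all, so the patterns are being tested against empty or undefined values. The sketch below illustrates the difference with plain hashes standing in for the FormMail object and the CGI parameters; %form_config, %submitted and value_is_spam are made-up names. Inside parse_form the submitted value would presumably come from $self->cgi_object->param($p), which is an untested assumption:

```perl
use strict;
use warnings;

# Stand-ins for the real objects (hypothetical data, not FormMail internals).
my %form_config = (recipient => '', subject => '');   # freshly initialised, all empty
my %submitted   = (
    subject  => 'Hello',
    comments => '[url=http://www.example.com/]Spam Description[/url]',
);

# Hypothetical helper: true if a value contains an HTML or BB-style link,
# plain or percent-encoded. Undefined values are treated as clean.
sub value_is_spam {
    my ($value) = @_;
    return 0 unless defined $value;
    return $value =~ /<\s*a\s+href|\[\s*(?:url|link)\b|%5B\s*(?:url|link)|%3C\s*a/i ? 1 : 0;
}

for my $p (sort keys %submitted) {
    my $via_config = value_is_spam($form_config{$p});   # what the added code looked at
    my $via_input  = value_is_spam($submitted{$p});     # the value actually submitted
    print "$p: config check=$via_config, input check=$via_input\n";
}
```

If the debug print in parse_form shows empty values for every key, that would be consistent with this explanation.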
I'll keep at it until I get it to work. Then I'll report back so others can gain from this journey.
In the meanwhile, here is what I have come up with as a stop-gap measure.
I created a JavaScript include file that uses a function to write a line of text containing a form input field with a checkbox and a field name. This ID has been designated as a "required field." People with JavaScript enabled will see the line of text and will check the box to agree to my brief terms before submitting, or their submission will fail. Visitors or bots without JavaScript will never see that input field and since it is required, their submission will fail.
If a JavaScript enabled spammer does type or paste in comments that include spam words, or links and URLs, the JavaScript validation routine will strip out all of the text in the comments area and other fields being validated. The form warns submitters of this both above and below the comments area.
Since implementing that simple measure I have not had one single spam submission from the contact page in question. That's probably because it isn't worth "their" while to manually examine my form page for JavaScript includes, to send spam they now realize nobody will see or post. Those with JavaScript enabled who insert spam links will see them disappear and be replaced with a sentence notifying them that the terms of submission forbid those items. This is also a deterrent.
I still hope to insert validation into the Perl Script itself. This would allow submissions from people who block JavaScript for their own security, via browser add-ons.