The test guestbook gets some spam that I want to stop. We already installed a bot-trap, and I am working on some other things to try to stop it and make the site a bit more secure.
Can somebody help me with writing good syntax for the cgi script?
The html form has several fields including:
field1 <input TYPE="text" NAME="url" SIZE="40" />
field2 <textarea name=comments COLS=60 ROWS=4></textarea>
field1 is a trap test field that should be left empty but is made attractive for spambots
field2 is the standard comment field, but our users are warned not to mention a website there
What I want to write in the guest.cgi file is a line of code such that:
- if field1 has ANY input (is not empty)
OR
- if a URL is entered in field2 (in any form: url, link, http:/ or www...)
the script will stop and return an "internal server error" message.
I read a lot about it, but could not find an example of how to write this.
Any help would be greatly appreciated.
Robo
print "Comments: $data{'comments'}<br>\n";
So, if you have the correct hash, somewhere near the top of your script,
if ($data{'url'} ne '') {
print "content-type: text/html\n\n";
print "Unsuspected input detected, program terminated.";
exit 0;
}
This will not return a server error (which will add unnecessary junk to your error logs) but it will exit the program. Mind you, this is not a "catch all" and if spammers figure out not to populate url, they'll get around it. Dig around here, lots of info on form abuse.
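The question also asked about links in the comment field. A minimal sketch of how both checks could be combined, assuming the form data has already been parsed into a hash (%data, as above). The names spam_reason and reject are made up for illustration, and the URL pattern is only a rough guess at what counts as a link, so expect to tune it.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a reason string if the submission looks
# like spam, or an empty string if it passes. $input is a reference to
# the parsed form hash.
sub spam_reason {
    my ($input) = @_;

    # Trap field: any content at all means something filled it in.
    my $trap = defined $input->{'url'} ? $input->{'url'} : '';
    return 'trap field filled' if $trap ne '';

    # Comment field: reject anything that looks like a link
    # (http:/, https:/ or www., in upper or lower case).
    my $comments = defined $input->{'comments'} ? $input->{'comments'} : '';
    return 'link in comments' if $comments =~ m{https?:/|www\.}i;

    return '';
}

# Send a valid header and a short message, then stop.
sub reject {
    print "Content-type: text/html\n\n";
    print "Unexpected input detected, program terminated.";
    exit 0;
}

# Usage near the top of the CGI script, before any file is opened:
#   reject() if spam_reason(\%data) ne '';
```

Keeping the test in its own sub makes it easy to try out on sample input before wiring it into the live script.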
I tried your suggestion
if ($INPUT{'url'} ne '') {
print "content-type: text/html\n\n";
print "Unsuspected input detected, program terminated.";
exit 0;
}
According to the CGI script checker that is not correct syntax. I could not work out what was wrong; it either gave a downright error message regardless of what you filled in, or let everything through. Using your info and other info from this forum, I finally changed the file as follows:
Old
foreach $line (@lines) {
if ($line =~ /<!--begin-->/) {
New
foreach $line (@lines) {
if ($INPUT{'url'}) {
exit;
}
if ($INPUT{'website'}) {
exit;
}
if ($line =~ /<!--begin-->/) {
If in the new file a blank field is filled, that returns an Internal Server Error, and nobody is wiser why. The fields are not hidden but in plain sight, with at the beginning of the form a remark not to fill those fields. Unfortunately, I do not get info on who was hammering at my door through the file, but I worry about that later.
As extra, I remade the guestbook.htm and sign.htm in what.php and where.php, and renamed the functional guest.cgi script as well. I left copies of the old files with the old names, also linked but not visible, on the site.
Hopefully, the bots will spend their time on the old guestbook (which nobody ever sees) and not on the protected one.
Question: I also have a bot-trap installed. Why do these bots not fall into the trap but go for the guestbook? Very smart ones?
Thanks for the suggestions; at least I have something that blocks some spam.
If I fill the fake fields in the guestbook, nothing comes through and it gives a server error. However, when I then hit the "back" button in Firefox and re-submit the data, the browser shows the normal repeat of the submitted data, so it seems the spammer gets through this time. On top of that, my guestbook file is wiped, showing a file size of zero, and the website shows no page!?
Any idea?
adb64, what kind of script is handling the guestbook on the server side?
I tried your suggestion
if ($INPUT{'url'} ne '') {
print "content-type: text/html\n\n";
print "Unsuspected input detected, program terminated.";
exit 0;
}
i think you need to use that value in string (not scalar) context for that test to work:
if ("$INPUT{'url'}" ne '') {
...
the server error is probably because you did the exit without writing a valid html header.
check your server error log for clues.
you haven't explained what the 'website' parameter is or should be or what is in @lines and there isn't enough information to figure out the latest issue.
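A small sketch to see how the two kinds of test actually behave for typical form values. The name compare_tests is made up for illustration; the point is only to show which values each test treats as "filled in".

```perl
use strict;
use warnings;

# Compare the two ways of testing a form value.
# Returns a two-element list: (boolean test, string test), each 1 or 0.
sub compare_tests {
    my ($value) = @_;
    # Equivalent of: if ($value)
    my $bool = $value ? 1 : 0;
    # Equivalent of: if ($value ne ''), guarded so undef gives no warning
    my $string = (defined $value && $value ne '') ? 1 : 0;
    return ($bool, $string);
}

# Try an empty field, a lone zero, a typical spam value and a missing field.
for my $v ('', '0', 'www.example.com', undef) {
    my ($bool, $string) = compare_tests($v);
    my $shown = defined $v ? "'$v'" : 'undef';
    print "$shown: if(\$v) -> $bool, if(\$v ne '') -> $string\n";
}
```

The two tests only disagree for a lone "0", which is false as a boolean but is not an empty string.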
My guestbook is completely serverside in PHP, no client side scripting is used.
I've split it up in two scripts, a guestbook engine script that handles basic guestbook functions like reading the guestbook records from a MySQL database and writing new messages to the database. Also basic administrator functions like editing and deleting messages are done there.
A second script acts like a template and handles all guestbook specific things. This means that in principle I can add multiple guestbooks, all using the same guestbook engine, but each can have its own look and feel and behavior.
From this guestbook specific script the functions from the guestbook engine are called.
Things like bot and bad language detection and interfacing with the visitor tracking system are guestbook specific.
i think you need to use that value in string (not scalar) context for that test to work:
if ("$INPUT{'url'}" ne '') {
...
Not quite clear how to do that properly.
Below is the CGI error log, which was reset before the test. The form and guestbook cleared and functional. Submitted the form with "leave blank" fields filled, this is the resulting error log (without the fields filled, it works ok).
----
%% [Thu Mar 20 11:42:11 2008] POST /cgi-bin/remarks.cgi HTTP/1.1
%% 500 /big/dom/xcmydomain/cgi-bin/remarks.cgi
%request
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept-Encoding: gzip,deflate
Accept-Language: en-us,en;q=0.5
Connection: keep-alive
Content-Length: 174
Content-Type: application/x-www-form-urlencoded
Host: www.mydomain.org
Keep-Alive: 300
Referer: [mydomain.org...]
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
name=robert&email=info%40mydomain.org&url=www.mydomain2.org&city=&state=&country=USA&howfound=&comments=test+spam+clear+log&alias=Robo&website=httpwww.mydomain2.org&pubcom=
%response
----
The guestbook file to which the message should be written is wiped empty, but the script was not abandoned completely, because the webmaster address did get an email notification that I had submitted something.
Below are the relevant script parts from the very start of the script. To keep this text short, it gives only the section start and end. Standard command lines are deleted because they were unchanged and functional, as are empty lines. The file was run through a CGI checker and shows no errors.
I hope this gives better insight. I tested removing the two culprit statements (if ... exit) and everything is back to normal, including spam coming in. How can these two extra lines force the script to delete all content of guestbook.php?
-------
$mailprog = '/usr/lib/sendmail';
$getdate = "/bin/date";
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
foreach $pair (@pairs) {
($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
$value =~ s/<!--(.|\n)*-->//g;
$value =~ s/<([^>]|\n)*>//g;
$INPUT{$name} = $value;
}
$date = `$getdate +"%A, %B %d, %Y at %T (%Z)"`;
chop($date);
&noname unless $INPUT{'name'};
open (FILE,"$guestpath");
@lines = <FILE>;
close(FILE);
open (GUEST,">$guestpath");
foreach $line (@lines) {
.....these two lines make everything go crazy...
if ($INPUT{'url'}) {
exit;
}
if ($INPUT{'website'}) {
exit;
}
.....kick these two lines out and everything goes back to normal
if ($line =~ /<!--begin-->/) {
print GUEST ("<!--begin-->\n");
-----------LINES DELETED-----------
} else {
print GUEST ("$line");
}
}
close (GUEST);
open (MAIL, "|$mailprog $youmail");
if ($INPUT{'email'}) {
-----------LINES DELETED-----------
print MAIL ("-$date\n");
close (MAIL);
&htmlafter;
# Error Messages
sub noname {
-----------LINES DELETED-----------
exit;
}
# Print Follow Up HTML
sub htmlafter {
print ("Content-Type: text/html\n\n");
print ("<html><head><title>Thank You</title></head>\n");
-----------LINES DELETED-----------
print ("-$date<hr>");
print ("<a href=\"$guestlocation\">Back to Homepage</a>. \n");
print ("If you do not see your addition, hit RELOAD<br>\n");
exit;
}
Anybody any wiser?
This script executes inline, meaning it starts from the top and works down. When you get to this line,
&noname unless $INPUT{'name'};
It says, "if there is no value for "name" go to the subroutine noname and execute it."
sub noname {
-----------LINES DELETED-----------
exit;
}
I don't know what you deleted, but this will always generate a server error. This is because your content type header has not been generated. You have to always print something out with a content-type when exiting or you will get a server error. So you need this:
sub noname {
print "content-type: text/html\n\n";
print "no name was provided.";
exit 0;
}
Now, as to this part:
if ($INPUT{'url'}) {
exit;
}
if ($INPUT{'website'}) {
exit;
}
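Since "url" is a field in your form, the key will always exist in %INPUT; when the field is left empty it is just a blank string. :-) A blank string is false in Perl, so "if ($INPUT{'url'})" fires only when something was entered*. The real problem is where the test sits: by that point the script has already run open (GUEST,">$guestpath"), which empties the guestbook file, so exiting inside the loop before the lines are written back leaves the file wiped. Second, once it gets inside that "if" it will cause an internal server error for the reasons mentioned above, which is an unnecessary clogging of your error logs.
Following the principles above, I would remove the quoted lines above and patch it as follows, placing the check right after the name check and before the guestbook file is opened for writing: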
&noname unless $INPUT{'name'};
if (($INPUT{'url'} ne '') or ($INPUT{'website'} ne '')) {
&bot_fields;
}
sub bot_fields {
print "content-type: text/html\n\n";
print "Unexpected input, exiting program";
exit 0;
}
There are a lot of other things wrong with this script, but that will patch it up.
* There are conditions that make this a false statement depending on how the parsing is handled, so "if ($variable)" may indeed be fine here. In any case, "if ($variable ne '')" would be reliable and should fix this.
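To illustrate why the position of the check matters, here is a minimal sketch, not the original script: add_entry is a made-up helper, and the HTML it writes is only a placeholder. It assumes the guestbook file contains a <!--begin--> marker line as in the script above. Opening a file with ">" empties it immediately, so any exit or return has to happen before that point.

```perl
use strict;
use warnings;

# Hypothetical helper: add an entry to the guestbook file, but only
# after the trap fields have been checked. Returns 1 if written, 0 if
# the submission was rejected (the file is then left untouched).
sub add_entry {
    my ($path, $input) = @_;

    # Check first: nothing has been opened for writing yet.
    for my $field ('url', 'website') {
        my $v = $input->{$field};
        return 0 if defined $v && $v ne '';
    }

    # Read the current contents.
    open(my $in, '<', $path) or die "Cannot read $path: $!";
    my @lines = <$in>;
    close($in);

    # Opening with '>' empties the file, so from here on the loop
    # must run to completion.
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    for my $line (@lines) {
        print $out $line;
        # Placeholder markup for the new entry, written after the marker.
        print $out "<b>$input->{'comments'}</b><br>\n"
            if $line =~ /<!--begin-->/;
    }
    close($out);
    return 1;
}
```

In the CGI script itself the rejected case would print a header and a short message before exiting, as in bot_fields above.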
As to the other remark, I was a bit too enthusiastic with deleting in the previous post to save space; that probably gave the impression it was a crappy script. At the end is that section written in full.
A related question.
In an earlier thread from 2006 by Johhnie (http://www.webmasterworld.com/webmaster/3160394.htm), he mentions a mechanism which prevents the user from posting a message within 10 seconds of loading the page. That sounds like a good idea, most people need at least a few minutes to read and fill a form.
Unfortunately, that thread was very short and the code for that trick was not mentioned. I could not find a follow-up. Before I try to find out more: with today's smart bots, does that solution still make sense?
Thanks for all the replies till now
Mcduff
Full non-name section:
sub noname {
print ("Content-type: text/html\n\n");
print ("<html><head><title>No Name</title></head>\n");
print ("<body><h1>You Didn't Leave Your Name...</h1>\n");
print ("You didn't add your name so your entry to the guestbook was not added.\n");
print ("Please add your name below.<br>\n");
print ("<FORM method=\"post\" action=\"$cgilocation\">\n");
print ("Your Name:<input type=text name=\"name\" size=30><br>\n");
print ("<input type=hidden name=\"email\" value=\"$INPUT{'email'}\">\n");
print ("<input type=hidden name=\"city\" value=\"$INPUT{'city'}\">\n");
print ("<input type=hidden name=\"state\" value=\"$INPUT{'state'}\">\n");
print ("<input type=hidden name=\"country\" value=\"$INPUT{'country'}\">\n");
print ("<input type=hidden name=\"howfound\" value=\"$INPUT{'howfound'}\">\n");
print ("<input type=hidden name=\"comments\" value=\"$INPUT{'comments'}\">\n");
print ("<input type=hidden name=\"alias\" value=\"$INPUT{'alias'}\">\n");
print ("<input type=hidden name=\"pubcom\" value=\"$INPUT{'pubcom'}\">\n");
print ("<input type=\"submit\" VALUE=\"Sign Guestbook\"><hr>\n");
print ("</body></html>\n");
exit;
}
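On the 10-second question above: the code from that thread is not known, but one way such a check could work is sketched here. It assumes the form page is generated by a script, so that a hidden field can carry the time the page was served plus a keyed hash that stops a bot from simply inventing an older time. The names make_stamp, stamp_ok and $SECRET are made up for illustration; Digest::SHA ships with current Perl versions.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Assumption: a secret known only to the server; change this value.
my $SECRET = 'change-me';

# Build the value for a hidden form field when the form page is served.
sub make_stamp {
    my ($time) = @_;
    return $time . ':' . hmac_sha256_hex($time, $SECRET);
}

# Returns 1 if the stamp is genuine and at least $min_seconds old,
# otherwise 0 (missing, malformed, forged or too fresh).
sub stamp_ok {
    my ($stamp, $now, $min_seconds) = @_;
    return 0 unless defined $stamp && $stamp =~ /^(\d+):([0-9a-f]{64})$/;
    my ($time, $mac) = ($1, $2);
    return 0 unless $mac eq hmac_sha256_hex($time, $SECRET);
    return ($now - $time >= $min_seconds) ? 1 : 0;
}

# Usage when serving the form:   make_stamp(time)
# Usage when handling the post:  stamp_ok($INPUT{'stamp'}, time, 10)
```

A bot that waits before posting, or replays a stamp it fetched earlier, still gets through, so this is one more hurdle rather than a complete fix.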