
Perl Server Side CGI Scripting Forum

    
url ban in guestbook
Need help writing a few simple lines in CGI script
Robo




msg:3605243
 1:47 pm on Mar 19, 2008 (gmt 0)

I work with two non-profits and made our websites, including a simple guestbook (an HTML form and guest.cgi for handling it). I am neither a designer nor a programmer, but reasonably good at copying and pasting and at combining snippets of information and script into something that works.

The test guestbook gets some spam that I want to stop. We already installed a bot-trap, and I am busy trying some other things to stop the spam and make the site a bit more secure.

Can somebody help me with writing good syntax for the cgi script?

The html form has several fields including:
field1&nbsp;<input TYPE="text" NAME="url" SIZE="40" />
field2&nbsp;<textarea name=comments COLS=60 ROWS=4></textarea>

field1 is a trap test field that should be left empty but is made attractive for spambots
field2 is the standard comment field, but our users are warned not to mention a website there

What I want to write in the guest.cgi file is a line of code that does the following:

- if field1 has ANY input (is not empty)
OR
- if a URL is entered in field2 (in any form: url, link, http:/ or www...)

the script will stop and return an "internal server error" message.

I read a lot about it, but could not find an example of how to write this.

Any help would be greatly appreciated.
Robo

 

rocknbil




msg:3605440
 4:36 pm on Mar 19, 2008 (gmt 0)

You will find the input variables are stored in some hash. In many free scripts, this is %data or %form. Look for references to valid fields, you should see something like

print "Comments: $data{'comments'}<br>\n";

So, if you have the correct hash, somewhere near the top of your script,

if ($data{'url'} ne '') {
print "content-type: text/html\n\n";
print "Unsuspected input detected, program terminated.";
exit 0;
}

This will not return a server error (which will add unnecessary junk to your error logs) but it will exit the program. Mind you, this is not a "catch all" and if spammers figure out not to populate url, they'll get around it. Dig around here, lots of info on form abuse.
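Both conditions from the original question (trap field filled, or a link in the comments) can be sketched together. This is only a minimal sketch, not a drop-in patch: the helper name is_bot_submission and the URL pattern are made up for illustration, the form values are assumed to sit in a hash passed by reference, and the pattern is deliberately loose, so it will also reject some innocent text.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when a submission should be rejected.
#   url      - the trap field that humans are told to leave empty
#   comments - the real comment field, where links are not allowed
sub is_bot_submission {
    my ($form) = @_;

    # Trap field: any content at all is suspicious. Testing length()
    # also catches a lone "0", which a plain boolean test would miss.
    my $trap = defined $form->{url} ? $form->{url} : '';
    return 1 if length $trap;

    # Comment field: reject anything that looks like a link
    # (http:/, https:/, www., an <a ...> tag or a [url] tag).
    my $comments = defined $form->{comments} ? $form->{comments} : '';
    return 1 if $comments =~ m{https?:/|www\.|<a\s|\[url}i;

    return 0;
}

# Possible use near the top of the script, before any file is opened:
# if (is_bot_submission(\%data)) {
#     print "content-type: text/html\n\n";
#     print "Unexpected input, program terminated.";
#     exit 0;
# }
```

Keeping the test in one small function makes it easy to tighten or loosen the pattern later without touching the rest of the script.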

Robo




msg:3605679
 7:29 pm on Mar 19, 2008 (gmt 0)

It is an adapted Dream Catchers guestbook version 2, and it uses %INPUT.

I tried your suggestion

if ($INPUT{'url'} ne '') {
print "content-type: text/html\n\n";
print "Unsuspected input detected, program terminated.";
exit 0;
}

According to the CGI script checker that is not correct syntax. I could not work out what was wrong: it either gave an outright error message regardless of what was filled in, or let everything through. Using your info and other info from this forum, I finally changed the file as follows:

Old
foreach $line (@lines) {
if ($line =~ /<!--begin-->/) {

New
foreach $line (@lines) {
if ($INPUT{'url'}) {
exit;
}
if ($INPUT{'website'}) {
exit;
}
if ($line =~ /<!--begin-->/) {

In the new file, if one of the fields that should stay blank is filled in, the script returns an Internal Server Error, and nobody is the wiser as to why. The fields are not hidden but in plain sight, with a remark at the beginning of the form not to fill them in. Unfortunately, I do not get any info through the file on who was hammering at my door, but I will worry about that later.

As an extra, I remade guestbook.htm and sign.htm as what.php and where.php, and renamed the functional guest.cgi script as well. I left copies of the old files under the old names on the site, still linked but not visible.

Hopefully, the bots will spend their time on the old guestbook (which nobody ever sees) and not on the protected one.

Question: I also have a bot-trap installed. Why do these bots not fall into the trap but go for the guestbook? Very smart ones?

Thanks for the suggestions; at least I have something that blocks some spam.

adb64




msg:3605753
 8:41 pm on Mar 19, 2008 (gmt 0)

In the past I've made my own guestbook script (in PHP) and also saw bots entering spam messages, so I took some measures to prevent that and I must say they work very well. I haven't seen any spam anymore in the past few months.
What I did to ban those bots from my guestbook is the following:
  • Have a robots metatag "noindex, nofollow" on each guestbook page so it can't be found in search engines.
  • I've linked the guestbook with my visitors tracking scripts and the guestbook can never be the first page by which a 'visitor' (read bot) enters my site. If it does it is presented with a 403. Human visitors will hardly ever fall in this trap as, because of the first point, they won't enter the site from a search engine.
  • Per session a 'visitor' can only enter one message in the guestbook.
  • A message may contain at most 3 URLs; when it has more, the message is rejected and no new message can be entered during the same session. This of course also works against human spammers.
  • I've installed a filter for bad language and words. If a message contains one of those words, it is forced to be a hidden message, which is reviewed before being made visible to the public.
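The "at most 3 URLs" rule from the list above could be sketched roughly like this. It is written in Perl to match the rest of this thread and is not adb64's actual PHP code; the helper names count_urls and too_many_urls and the pattern itself are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count things in a message that look like links.
sub count_urls {
    my ($text) = @_;
    # The empty-list assignment forces list context, so the match
    # returns every hit and $count receives the number of hits.
    my $count = () = $text =~ m{(?:https?://|www\.)\S+}gi;
    return $count;
}

# Hypothetical policy check: true when the message exceeds the limit.
sub too_many_urls {
    my ($text, $max) = @_;
    $max = 3 unless defined $max;
    return count_urls($text) > $max ? 1 : 0;
}
```

A call such as too_many_urls($message) could then decide whether to reject the entry; the session bookkeeping described in the list is left out here.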

Robo




msg:3605801
 9:52 pm on Mar 19, 2008 (gmt 0)

Now I have a strange problem coming up.

If I fill in the fake fields in the guestbook, nothing comes through and it gives a server error. However, when I then hit the "back" button in Firefox and re-submit the data, the browser shows the normal repeat of the submitted data, so the spammer seemingly gets through this time. On top of that, my guestbook file is wiped, showing a file size of zero, and the website shows no page!?

Any idea?

adb64, what kind of script is handling the guestbook on the server side?

phranque




msg:3606007
 3:06 am on Mar 20, 2008 (gmt 0)

welcome to WebmasterWorld [webmasterworld.com], Robo!

I tried your suggestion

if ($INPUT{'url'} ne '') {
print "content-type: text/html\n\n";
print "Unsuspected input detected, program terminated.";
exit 0;
}

i think you need to use that value in string (not scalar) context for that test to work:
if ("$INPUT{'url'}" ne '') {
...

the server error is probably because you did the exit without writing a valid html header.
check your server error log for clues.
you haven't explained what the 'website' parameter is or should be or what is in @lines and there isn't enough information to figure out the latest issue.

adb64




msg:3606226
 10:55 am on Mar 20, 2008 (gmt 0)

Robo,

My guestbook is completely serverside in PHP, no client side scripting is used.
I've split it up in two scripts, a guestbook engine script that handles basic guestbook functions like reading the guestbook records from a MySQL database and writing new messages to the database. Also basic administrator functions like editing and deleting messages are done there.

A second script acts like a template and handles all guestbook specific things. This means that in principle I can add multiple guestbooks, all using the same guestbook engine, but each can have its own look and feel and behavior.
From this guestbook specific script the functions from the guestbook engine are called.
Things like bot and bad language detection and interfacing with the visitor tracking system are guestbook specific.

Robo




msg:3606553
 5:35 pm on Mar 20, 2008 (gmt 0)

Quote from Phranque (thanks for the welcome)

i think you need to use that value in string (not scalar) context for that test to work:
if ("$INPUT{'url'}" ne '') {
...

Not quite clear how to do that properly.

Below is the CGI error log, which was reset before the test. The form and guestbook were cleared and functional. I submitted the form with the "leave blank" fields filled; this is the resulting error log (without those fields filled, it works OK).

----
%% [Thu Mar 20 11:42:11 2008] POST /cgi-bin/remarks.cgi HTTP/1.1
%% 500 /big/dom/xcmydomain/cgi-bin/remarks.cgi
%request
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept-Encoding: gzip,deflate
Accept-Language: en-us,en;q=0.5
Connection: keep-alive
Content-Length: 174
Content-Type: application/x-www-form-urlencoded
Host: www.mydomain.org
Keep-Alive: 300
Referer: [mydomain.org...]
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12

name=robert&email=info%40mydomain.org&url=www.mydomain2.org&city=&state=&country=USA&howfound=&comments=test+spam+clear+log&alias=Robo&website=httpwww.mydomain2.org&pubcom=
%response
----

The guestbook file to which the message should be written is wiped empty, but the script is not abandoned completely, because the webmaster address did get an email notification that I had submitted something.

Below are the relevant script parts from the very start of the script. To keep this text short, only the start and end of each section are given. Standard command lines were deleted because they were unchanged and functional, as were empty lines. The file was run through a CGI checker and shows no errors.

I hope this gives a better insight. I tested with the two culprit statements (if...exit) removed, and everything is back to normal, including spam coming in. How can these two extra lines force the script to delete all content of guestbook.php?

-------
$mailprog = '/usr/lib/sendmail';
$getdate = "/bin/date";

read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
foreach $pair (@pairs) {
($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
$value =~ s/<!--(.|\n)*-->//g;
$value =~ s/<([^>]|\n)*>//g;
$INPUT{$name} = $value;
}

$date = `$getdate +"%A, %B %d, %Y at %T (%Z)"`;
chop($date);
&noname unless $INPUT{'name'};

open (FILE,"$guestpath");
@lines = <FILE>;
close(FILE);

open (GUEST,">$guestpath");

foreach $line (@lines) {
.....these two lines make everything go crazy...
if ($INPUT{'url'}) {
exit;
}
if ($INPUT{'website'}) {
exit;
}
.....kick these two lines out and everything goes back to normal
if ($line =~ /<!--begin-->/) {
print GUEST ("<!--begin-->\n");
-----------LINES DELETED-----------
} else {
print GUEST ("$line");
}
}

close (GUEST);

open (MAIL, "|$mailprog $youmail");

if ($INPUT{'email'}) {
-----------LINES DELETED-----------
print MAIL ("-$date\n");
close (MAIL);

&htmlafter;

# Error Messages

sub noname {
-----------LINES DELETED-----------
exit;
}

# Print Follow Up HTML

sub htmlafter {
print ("Content-Type: text/html\n\n");
print ("<html><head><title>Thank You</title></head>\n");
-----------LINES DELETED-----------
print ("-$date<hr>");
print ("<a href=\"$guestlocation\">Back to Homepage</a>. \n");
print ("If you do not see your addition, hit RELOAD<br>\n");
exit;
}

Anybody any wiser?
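One likely explanation for the wiped file, assuming the excerpt above reflects the order in which the real script runs: the guestbook is opened with ">" before the loop starts, and opening a file that way empties it immediately. If the script then exits inside the loop, nothing is ever written back. The sketch below reproduces only that effect with a throwaway temp file; the file contents are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a stand-in for the guestbook file with two lines in it.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "<!--begin-->\n", "an earlier entry\n";
close $fh;

# Read the old contents into memory, as the script does.
open my $in, '<', $path or die "cannot read $path: $!";
my @lines = <$in>;
close $in;

# Opening with '>' truncates the file right away ...
open my $out, '>', $path or die "cannot write $path: $!";

# ... so if the script stops at this point (as the "exit" inside the
# loop does), the old entries in @lines are never written back.
close $out;

my $size_after = (stat $path)[7];   # 0: the file is now empty
print "lines read: ", scalar(@lines), ", size after reopen: $size_after\n";
```

This would also be consistent with the resubmission appearing to go through: once the file is empty, @lines is empty, so the loop body (and with it the trap test) never runs. Testing the trap fields before the file is opened for writing avoids both effects.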

rocknbil




msg:3607499
 6:49 pm on Mar 21, 2008 (gmt 0)

Well, I still say my original solution would work. Look at this.

This script executes inline, meaning it starts from the top and works down. When you get to this line,

&noname unless $INPUT{'name'};

It says, "if there is no value for "name" go to the subroutine noname and execute it."

sub noname {
-----------LINES DELETED-----------
exit;
}

I don't know what you deleted, but this will always generate a server error. This is because your content type header has not been generated. You have to always print something out with a content-type when exiting or you will get a server error. So you need this:

sub noname {
print "content-type: text/html\n\n";
print "no name was provided.";
exit 0;
}

Now, as to this part:

if ($INPUT{'url'}) {
exit;
}
if ($INPUT{'website'}) {
exit;
}

Since "url" is a field in your form, it will always exist in %INPUT, even when it's just a blank string. :-) So "if ($INPUT{'url'})" is not testing whether the field exists, only whether its value is true (*), which is easy to misread. Second, once it gets inside that if it will cause an internal server error for the reasons mentioned above, which is an unnecessary clogging of your error logs.

Following the principles above, I would remove the quoted lines above and patch it as follows:

&noname unless $INPUT{'name'};
if (($INPUT{'url'} ne '') or ($INPUT{'website'} ne '')) {
&bot_fields;
}

sub bot_fields {
print "content-type: text/html\n\n";
print "Unexpected input, exiting program";
exit 0;
}

There are a lot of other things wrong with this script, but that will patch it up.

* There are conditions that make this a false statement depending on how the parsing is handled, so "if ($variable)" may indeed be fine here. In any case, "if ($variable ne '')" would be reliable and should fix this.
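To make the footnote concrete: Perl treats an empty string and the string "0" as false, so the plain test and the "ne ''" test only disagree on a lone "0". A small sketch comparing the two on a few sample values (the helper names plain_test and string_test are made up for this illustration):

```perl
use strict;
use warnings;

# The two ways of testing a form value discussed above.
sub plain_test  { my ($v) = @_; return $v ? 1 : 0; }        # if ($v)
sub string_test { my ($v) = @_; return $v ne '' ? 1 : 0; }  # if ($v ne '')

# Print one row per sample value showing what each test decides.
for my $value ('', '0', ' ', 'www.example.org') {
    printf "%-18s if (\$v): %d   if (\$v ne ''): %d\n",
        "'$value'", plain_test($value), string_test($value);
}
```

Neither test cares whether the field is hidden or visible; both look only at the submitted value.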

Robo




msg:3608505
 2:42 pm on Mar 23, 2008 (gmt 0)

Rocknbil, thanks for the explanations and corrections. I added what you said, and now it seems to work OK. I will test it for a few days and see what happens.

As to the other remark, I was a bit too enthusiastic with deleting in the previous post to save space; that probably gave the impression it was a crappy script. That section is written out in full at the end.

A related question.
In an earlier thread from 2006 by Johhnie (http://www.webmasterworld.com/webmaster/3160394.htm), he mentions a mechanism which prevents the user from posting a message within 10 seconds of loading the page. That sounds like a good idea; most people need at least a few minutes to read and fill in a form.

Unfortunately, that thread was very short and the code for that trick was not mentioned. I could not find a follow-up. Before I try to find out more: with today’s smart bots, does that solution still make sense?
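For reference, one way such a delay check is sometimes built (not necessarily what that earlier thread used) is a signed timestamp in a hidden field. The sketch below assumes the form page itself is generated by a script so the token can be inserted; the names make_token, token_ok and $SECRET are made up, and the 10-second limit is taken from the description above. It would only stop bots that post immediately; a bot that waits before submitting would still pass.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Assumption: a secret known only to the server. Replace with your own.
my $SECRET = 'change-me';

# Build the value for a hidden form field when the form page is generated.
# The signature stops a bot from simply inventing an older timestamp.
sub make_token {
    my ($issued_at) = @_;
    return $issued_at . ':' . hmac_sha256_hex($issued_at, $SECRET);
}

# Check the token on submit: it must be untampered and old enough.
sub token_ok {
    my ($token, $now, $min_seconds) = @_;
    return 0 unless defined $token && $token =~ /\A(\d+):([0-9a-f]{64})\z/;
    my ($issued_at, $mac) = ($1, $2);
    return 0 unless $mac eq hmac_sha256_hex($issued_at, $SECRET);
    return ($now - $issued_at) >= $min_seconds ? 1 : 0;
}
```

The form page would print make_token(time) into a hidden field, and the CGI script would call token_ok($INPUT{'token'}, time, 10) before accepting the entry; the field name "token" is again only an example.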

Thanks for all the replies till now
Mcduff

Full noname section:

sub noname {

print ("Content-type: text/html\n\n");
print ("<html><head><title>No Name</title></head>\n");
print ("<body><h1>You Didn't Leave Your Name...</h1>\n");
print ("You didn't add your name so your entry to the guestbook was not added.\n");
print ("Please add your name below.<br>\n");
print ("<FORM method=\"post\" action=\"$cgilocation\">\n");
print ("Your Name:<input type=text name=\"name\" size=30><br>\n");
print ("<input type=hidden name=\"email\" value=\"$INPUT{'email'}\">\n");
print ("<input type=hidden name=\"city\" value=\"$INPUT{'city'}\">\n");
print ("<input type=hidden name=\"state\" value=\"$INPUT{'state'}\">\n");
print ("<input type=hidden name=\"country\" value=\"$INPUT{'country'}\">\n");
print ("<input type=hidden name=\"howfound\" value=\"$INPUT{'howfound'}\">\n");
print ("<input type=hidden name=\"comments\" value=\"$INPUT{'comments'}\">\n");
print ("<input type=hidden name=\"alias\" value=\"$INPUT{'alias'}\">\n");
print ("<input type=hidden name=\"pubcom\" value=\"$INPUT{'pubcom'}\">\n");
print ("<input type=\"submit\" VALUE=\"Sign Guestbook\"><hr>\n");
print ("</body></html>\n");

exit;

}

