
Preventing guest book problem
Webbot spamming my guest book?
mom2x2x2
3:47 pm on Oct 25, 2005 (gmt 0)

Help - I'm just starting out working on a web site using FrontPage. Not sure if this is the right terminology, but I think a webbot or crawler found my website and keeps leaving long lists of "junk" overseas websites in my guest book. Is there something I can change on the guest book form to stop this? Thanks for any help!

 

wickydoodah
5:32 pm on Oct 25, 2005 (gmt 0)

As a "former" user of FrontPage guest books, I learned just how insecure they really are. The sad fact is that anyone (or any webbot) can easily spam FP guest books since MS never provided for any security provisions in it. The reality is that anyone can spam any FP guest book just by knowing the name of your guestlog file and it's location, and they can get that through a Google search and looking at your page's source code. No amount of Javascript or image verification scripts will stop it either. I'm afraid the only real solution you have is to find another guest book script with built-in security.

Hope that helps...

JAB Creations
7:07 am on Oct 26, 2005 (gmt 0)

Welcome to WebmasterWorld!

Easy...
(I'll explain some newbie basics, since it's your first post and I'm not sure how much you know.)

There are a few things you can do...

1.) The easiest way...
The problem you have is caused by computers set to automatically look for static URLs. That means regardless of WHAT domain you have, the path to the guestbook will be the same...

For example...
www.example1.com/cgi-bin/guestbook.cgi
www.example2.com/cgi-bin/guestbook.cgi
www.example3.com/cgi-bin/guestbook.cgi

So what I suggest as the easiest and quickest way of defeating these spammers is to just RENAME the script file! Add a dash between guest and book, for example, or change it to gues1tbook.cgi... your choice! This should throw off most bots!

WHY?! If you LIVE with your nose in access logs like I do, you'll QUICKLY understand that spammers typically try to spam as much as possible as quickly as possible. They won't crawl your entire site looking for something (like a guest book or contact page).
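If you want to take it a step further, you could even leave a decoy at the old well-known path that just logs and refuses. A rough sketch in PHP (the file name and the 410 response are my assumptions for illustration, not something FrontPage gives you):

<?php
// Hypothetical decoy left at the old, well-known guestbook path after
// the real guest book has been renamed. Normal visitors follow your
// site's links to the new name and never land here; anything that
// requests this file guessed the path.
error_log('guestbook decoy hit from ' . $_SERVER['REMOTE_ADDR']); // note the IP in the server error log
header('HTTP/1.0 410 Gone'); // tell the client this resource is gone for good
die();
?>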

Slightly more advanced spammers have a two-robot setup. The first bot crawls a search engine (Google/MSN/Yahoo, for example) looking for hot pages (anything they consider to have a high chance of containing email addresses). That bot compiles a list of URLs.

The second bot receives this list of URLs to crawl, which cuts down on the time it takes to find the email addresses. This bot will now request only the files the first bot thinks email addresses are on.

How do I know this?

When I first started adapting to spammers, I noticed only ONE hit on certain hot files on my server. I know that a HUMAN who is NOT up to bad things would also have (passively) requested the JavaScript, CSS, and other files linked from the guestbook, and ALSO have visited other pages first. The spammer sticks out like a sore thumb!
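One cheap way to act on that observation: a real person posting to your guest book arrives from the guest book page itself, so their browser normally sends a Referer header pointing at YOUR site, while a fire-and-forget bot often sends none. The header can be forged or stripped, so treat this as a heuristic only; here's a sketch, with www.example.com standing in for your own domain:

<?php
// Heuristic: reject guest book submissions that don't claim to come
// from our own site. Referer can be forged or absent for legitimate
// reasons, so this only filters the laziest bots.
$ref = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
if (strpos($ref, 'www.example.com') === false) {
    header('HTTP/1.0 403 Forbidden');
    die();
}
?>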

2.) Don't open doors for strangers!

A user-agent string is what your browser sends every time it requests a file. Here is my user-agent...

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051025 Firefox/1.5

Think of it like my browser knocking on your server's door and saying, "Hi, I'm Firefox 1.5 and I'm here to pick up hotchick.gif for John."

Now sometimes the user-agent string is messed around with or doesn't even exist! In your access log it would look something like this...

(Notice some information is Xed out...)
2xx.2xx.1xx.1xx - - [02/Aug/20xx:2x:1x:2x +0#*$!] "GET /cgi-bin/guestbook.pl HTTP/1.1" 200 2060 "-" "-"

Here is a simple question...

If I knock on your door, you ask "who is it?", and I fail to reply, would you open your door? 10 bucks says no!

So here is the simple idea... if someone requests something without telling you who they are, you can (if you use PHP) refuse to answer and send the request a forbidden response instead!

<?php $UA=getenv("HTTP_USER_AGENT"); if ($UA==""){header("HTTP/1.0 403");die();}?>

That snippet stops PHP cold and sends a 403 with a completely blank body to the mystery visitor.
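One gotcha: that line has to sit at the very top of the file, before any HTML or even a stray blank line, because PHP's header() only works if nothing has been output yet.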

Spammers will catch on, of course, but the ultimate victory will go to the content provider. Everything I expect spammers to attempt in order to get around these methods I can still EASILY counter, while still allowing normal humans to access the hot files with little or no resistance.

3.) Traps for bad bots!

You can set up traps for bad robots. For example, good robots will always follow the www.yourdomain.com/robots.txt file. You can tell bots NOT to crawl a certain place, and then put an invisible link (one humans won't see) pointing to that location. A script can then detect the bot crawling the trap and add it to the deny list of the Apache web server (or other web server software). Apache, btw, is software that makes a computer into a web server.
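Here is a rough sketch of such a trap in PHP; the /trap/ path and the deny-list file name are made-up for illustration:

<?php
// trap/index.php -- hypothetical bot trap.
// robots.txt would contain:
//   User-agent: *
//   Disallow: /trap/
// An invisible link somewhere on the site points at /trap/, so only
// robots that ignore robots.txt should ever request this file.
$ip = $_SERVER['REMOTE_ADDR'];
// Record the offender; a separate script (or a manual pass) can turn
// this list into "deny from" lines in your Apache config.
file_put_contents('/home/you/badbots.txt', $ip . "\n", FILE_APPEND | LOCK_EX);
header('HTTP/1.0 403 Forbidden');
die();
?>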

Hope this helps, and remember: we ultimately have more cards up our sleeves than the scum of the pound! ;)

John

