How do I stop spam on email forms?

         

Mel3020

4:37 am on Jan 5, 2009 (gmt 0)

10+ Year Member



I have read some threads on this forum with ideas about how to stop the constant flow of spam emails through contact/guestbook forms, but the problem is that I don't understand a lot of it. I am very new to web design and I just don't understand some of the terminology or how to implement some of the ideas given. I'm using Expression Web to design my website and I created a 'contact us' form using the form control feature. Can anyone please give me advice on how to prevent spammers from invading my inbox in simple terms? I would be very, very grateful!

incrediBILL

1:42 am on Jan 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



chances are no one is going to bother coming up with an independent set of rules for your site.

My site is 100% custom and someone not only made a rule for my site, but kept adjusting it as I made adjustments to my code to avoid their adaptations, and I watched them hacking at the site as I tracked each attempt by IP.

First I required cookies and the spam stopped.

They started accepting cookies.

Then I required the proper page referrer.

They started sending the proper page referrer.

Then I put in an initially simplistic hidden field with a random number generated for each form impression in each session.

They started reading the hidden field content and submitting it.

I moved the field content of the hidden field into javascript and filled it in on submit.

They picked the field content out of the javascript and submitted it.

Then I added a formula that creates hidden field content based on what's actually typed using the keyboard events.

Apparently that got too complicated for them at this time and the spam stopped 2 years ago.
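
The hidden-field step in the list above could look roughly like this on the server. This is only a minimal Node.js sketch, not the code actually used on that site: the function names `issueFormToken` and `checkFormToken` are hypothetical, and a plain object stands in for whatever session store the site has.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: call when rendering the form. Stores a fresh random
// token in the visitor's server-side session and returns it for the hidden field.
function issueFormToken(session) {
  const token = crypto.randomBytes(16).toString("hex");
  session.formToken = token;
  return token;
}

// Hypothetical helper: call when the form is posted. The token is single-use,
// so a replayed submission fails even if the bot copied the hidden field once.
function checkFormToken(session, submitted) {
  const expected = session.formToken;
  delete session.formToken; // consume the token whether or not it matched
  return typeof expected === "string" && submitted === expected;
}

// Usage with a plain object standing in for the session store (an assumption).
const demoSession = {};
const demoToken = issueFormToken(demoSession);
console.log(checkFormToken(demoSession, demoToken)); // true: first submission
console.log(checkFormToken(demoSession, demoToken)); // false: replay of the same token
```

As the list says, a bot that reads the hidden field from the page still gets through once, so this only stops blind or replayed posts.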

MWpro

5:25 am on Jan 12, 2009 (gmt 0)

10+ Year Member



I just don't know how to make my form 'kick out' any email that doesn't answer the question correctly.

I don't think you understand. The screening is done before the server will even get to the mail function.

Let's say your question is: what is the sum of 2 + 6?

(I don't know what programming language you are using so I'll keep it general)

if(user submits 8){
do the form mailing stuff;
}

This way, the form will only ever mail if they get the question correct. If they don't, nothing happens, no mail is sent.
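
For anyone who wants something closer to real code, the same idea might look like this in JavaScript. It is a sketch under assumptions: `fields` stands for the posted form values, `sendMail` for whatever mail routine the site already has, and the field names `answer` and `message` are made up for the example.

```javascript
// Hypothetical server-side handler logic. `fields` stands for the posted form
// values and `sendMail` for whatever mail routine the site already uses.
function handleContactForm(fields, sendMail) {
  const answer = String(fields.answer ?? "").trim();
  if (answer !== "8") {
    return false; // wrong or missing answer: nothing is mailed
  }
  sendMail(fields.message);
  return true;
}

// Usage with a stand-in mailer that just records what would be sent.
const outbox = [];
const mailer = (message) => outbox.push(message);
handleContactForm({ answer: "8", message: "Hello" }, mailer);
handleContactForm({ answer: "10", message: "Buy pills" }, mailer);
console.log(outbox.length); // 1: only the correctly answered form was mailed
```

The important part is that the check runs on the server, before the mail function is ever reached.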

For vBulletin forums, there are several plugins that work like a charm. I use two. The first makes the user choose the one picture out of four that fits the prompt. For example, it says "click the soccer ball" and then shows images of a soccer ball, a cloud, Bart Simpson, and George Bush.

The second one is a custom question plugin that allows you to choose 10 custom questions and randomizes them.

These two plugins work like a charm.

[edited by: MWpro at 5:27 am (utc) on Jan. 12, 2009]

kapow

7:27 pm on Jan 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



IncrediBILL
Then I added a formula that creates hidden field content based on what's actually typed using the keyboard events.
That sounds interesting - can you say a bit more about how to do that? Is it based on the idea that a robot doesn't use a keyboard, so there are no keyboard events?

incrediBILL

2:52 am on Jan 13, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Here's a simple sample of how I use javascript to defeat bots.

Note that the HTML form has no destination URL of where to submit the form. The URL is embedded in the GoURL function, and the path is specific to the current session only, so a bot's form-submitting code cannot know the destination in advance. If you aren't using PHP and need a static path, you can do that, but it's easier to crack.

Then there's a hidden field in the form called "humancheck" which is preloaded with a value that GoURL replaces with a specific session value. The trick here is that you can dynamically require or eliminate the humancheck field, so the bot has no clue whether your script is looking for the default value shown in the form or the replacement value (if present) in GoURL.

The countkeys function below is very primitive and counts every keystroke, backspace, delete, even the click on the submit button. For that value to be valid it merely needs to exceed the length of whatever was typed into the "data" field. You can pass the event information to this function and make a more sophisticated keystroke checksum, or make several variants that are used at random; the possibilities are endless.

So far this simple code and the accompanying server-side checks have kept the spammers away for a long time.


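// Called by the Submit button: fills in the per-session values, then posts the form.
function GoURL(form) {
    countkeys(form); // the click on Submit counts as one more event
    form.action = "http://www.example.com/form1234"; // per-session destination URL
    form.humancheck.value = "sessionvalue-54321"; // per-session replacement value
    form.submit();
}

// Adds one to the hidden keycount field; wired to onkeypress on the text input.
function countkeys(form) {
    form.keycount.value = Number(form.keycount.value) + 1;
}


<form method="post" name="myform">
    <input type="text" name="data" size="50" onkeypress="countkeys(this.form)">
    <input type="submit" onclick="GoURL(this.form); return false;" name="SUBMIT" value="Submit">
    <input type="hidden" name="keycount" value="0">
    <input type="hidden" name="humancheck" value="InitialDataValue-01234">
</form>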

It may look easy enough to defeat but I said it's a simple sample.

Run your javascript through an obfuscator and it's not easy to identify values and defeat your code.

Use several variations on the "countkeys" function and serve them up dynamically so the spambot doesn't have a chance of knowing what's coming.

Randomly change field names so the spambot doesn't know what to submit where.

Your user will know where to type based on the text, and your server will know which field is which based on the session.

Just keep it moving as moving targets are harder to hit.
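
The server-side half of the sample is not shown in the post, so the following is only a guess at its shape, written in JavaScript for consistency with the client code. The function name `looksHuman` and the way the expected session value is passed in are assumptions; the rule itself (keycount must exceed the length of the "data" field, and humancheck must hold the session value) comes from the description above.

```javascript
// Hypothetical server-side counterpart to the form above. `fields` holds the
// posted values; `expectedToken` is the per-session value the server wrote
// into GoURL when it rendered the page.
function looksHuman(fields, expectedToken) {
  const keycount = Number.parseInt(fields.keycount, 10);
  const typed = String(fields.data ?? "");
  // JavaScript ran: the placeholder was replaced with the session value.
  const tokenOk = fields.humancheck === expectedToken;
  // Keystrokes plus the submit click must exceed the length of the text.
  const keysOk = Number.isInteger(keycount) && keycount > typed.length;
  return tokenOk && keysOk;
}

const sessionToken = "sessionvalue-54321";
// A visitor who typed five characters and clicked Submit.
console.log(looksHuman({ data: "hello", keycount: "6", humancheck: sessionToken }, sessionToken)); // true
// A bot that posted the form exactly as it appeared in the HTML.
console.log(looksHuman({ data: "spam spam", keycount: "0", humancheck: "InitialDataValue-01234" }, sessionToken)); // false
```

One caveat with this simple rule: a visitor who pastes text presses fewer keys than there are characters and would fail it, so it is probably safer to treat a failure as one signal among several rather than as proof of a bot.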

Thylacine

8:04 am on Jan 13, 2009 (gmt 0)

10+ Year Member



I agree that using several of the approaches mentioned in this thread will be very effective. I've cut spam registrations from dozens per day to nearly zero on several forums I run using a variety of techniques.

I've had the best luck using form fields that are subsequently analyzed on the server using PHP or another server-side scripting language (as opposed to using JavaScript with client-side validation). My favorite approach is to hide a few fields in the form using CSS (lots of ways to do this), and then give those fields names that the spam bots would like, say, "email" or "address" or "message." Since the hidden fields aren't seen by human visitors, they aren't filled out. Since all the right code is there for the spam bots, they fill them out. (The bots generally aren't smart enough to analyze an external CSS file to figure out whether or not a field actually shows up visually.) Upon submission, a PHP routine in a separate file (one that can't be accessed directly except through the form submission) checks to see if anything is in the non-visible fields. If anything is there, it aborts the submission and, instead, redirects the spam bot off to some dead-end IP address.
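
The post describes doing this check in PHP; as a rough illustration of the same logic, here is a sketch in JavaScript. The decoy field names and the `real_msg` name for the visible field are invented for the example.

```javascript
// Hypothetical honeypot check. The field names below are examples of decoy
// inputs hidden with CSS; the real visible fields would use other names.
const DECOY_FIELDS = ["email", "address"];

// True when any decoy field came back with content, which a human
// visitor could not have typed because the field was never visible.
function tripsHoneypot(fields) {
  return DECOY_FIELDS.some((name) => String(fields[name] ?? "").trim() !== "");
}

console.log(tripsHoneypot({ email: "", address: "", real_msg: "Hi there" })); // false
console.log(tripsHoneypot({ email: "bot@example.com", address: "", real_msg: "spam" })); // true
```

When the check returns true, the server would drop the submission instead of mailing it.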

Another thing to consider is obfuscating the links to the form by not giving them names that spam bots would look for. For example, if the URL to your form was, say, whatever.com/contact.html, well, a spam bot could easily identify it as it crawled through your site. You might also want to use HTML entities in the words associated with those links to spell words like "e-mail" or "contact" or other words that bots might key in on as links associated with forms. All browsers can translate the HTML entities into actual letters, but most spam bots seem stumped by them.
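
To show what the HTML-entity trick means in practice, here is a small JavaScript sketch that converts link text into numeric character references. The function name is made up, and whether a given bot is actually fooled by this is not something the sketch can promise.

```javascript
// Turns every character into a decimal numeric character reference,
// e.g. "c" becomes "&#99;". Browsers render the result as normal text.
function toNumericEntities(text) {
  return Array.from(text, (ch) => `&#${ch.codePointAt(0)};`).join("");
}

console.log(toNumericEntities("contact"));
// &#99;&#111;&#110;&#116;&#97;&#99;&#116;
```

The output would be pasted into the page in place of the plain word, for example as the visible text of the link that leads to the form.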

Lots of bots skip right past actually filling out the forms and dealing with any subsequent JavaScript validation. Instead, they analyze the form, then submit their garbage directly to the script that generates whatever process is next, say, generating an e-mail. This, depending on how you do it, makes JavaScript validation much less useful, since they simply bypass the client-side RegEx validation. This is why a server-side scripting language might be better for the validation or analysis. I'd also be sure to not include this behind-the-scenes analysis or validation in the file that holds the HTML form itself. Instead, place this script in a separate file that can't be read directly, so it can't be analyzed by the bot and subsequently bypassed with a direct injection.

When it comes right down to it, there are all kinds of giveaway behaviors and artifacts that can be used to identify a bot as opposed to a human. It's just a matter of writing a script (preferably server executed) that identifies these giveaway behaviors and disallows them. Human spammers are another problem, though, but they're much less numerous.

incrediBILL

8:50 am on Jan 13, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This, depending on how you do it, makes JavaScript validation much less useful, since they simply bypass the client-side RegEx validation.

That's perfect, because my server-side validation works best when they skip the javascript in the first place.

Bots typically don't run javascript unless the bot is driving an actual browser API which is too slow to make mass spamming feasible, which is why the client side code helps verify the backend server-side results.

It's all combined to make sure you're 99% positive it's a bot and not a false positive.

[edited by: incrediBILL at 8:51 am (utc) on Jan. 13, 2009]

jeffatrackaid

10:55 pm on Jan 14, 2009 (gmt 0)

10+ Year Member



Server side validation is very reliable. While captcha is a pain, it can be highly effective when implemented properly.

The other solution I've seen is using a trivia question with an image. Show four images and ask, which one is a football? This too can be beaten by a determined spammer.

gosu

8:47 am on Jan 15, 2009 (gmt 0)

10+ Year Member



What about e-mail confirmations? Does no one use them?
If spammers are forced to open an account before they can publish junk on your site, and this requires e-mail confirmation, what happens when a spammer gets banned? He needs a new e-mail account, a new site registration, and a new confirmation, and eventually he gets banned again just hours after posting spam. I am sure the same guy won't return again, because it won't be worth the effort.

grelmar

6:55 am on Jan 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If spammers are forced to open an account before they can publish junk on your site, and this requires e-mail confirmation, what happens when a spammer gets banned? He needs a new e-mail account, a new site registration, and a new confirmation, and eventually he gets banned again just hours after posting spam. I am sure the same guy won't return again, because it won't be worth the effort.

It's actually quite easy to automate the registration process and email confirmation. If you have control of a compromised server, it's also easy to generate new email addresses in an automated manner.

gosu

9:04 am on Jan 16, 2009 (gmt 0)

10+ Year Member



grelmar, that is why my system supports banning individual e-mail addresses or entire e-mail servers :)))
Anyway, banning is a time-consuming operation; it is better to rely on the other methods and use bans only if they fail.

incrediBILL

11:17 am on Jan 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you have control of a compromised server

Some of us block access to our sites from data centers to avoid being abused by compromised servers, so that method won't work with everyone, just those with low security.
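
As a rough idea of what such a block could involve, here is a JavaScript sketch that tests an IPv4 address against a list of CIDR ranges. The ranges shown are documentation placeholders reserved by RFC 5737, not real data center addresses, and all the function names are invented for the example; it does no input validation and ignores IPv6.

```javascript
// Converts a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// True when `ip` falls inside `cidr` (for example "192.0.2.0/24").
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const size = 2 ** (32 - Number(bitsText)); // number of addresses in the block
  const start = Math.floor(ipv4ToInt(base) / size) * size;
  const value = ipv4ToInt(ip);
  return value >= start && value < start + size;
}

// Placeholder ranges reserved for documentation (RFC 5737), NOT real data centers.
const BLOCKED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"];

function isBlocked(ip) {
  return BLOCKED_RANGES.some((range) => inCidr(ip, range));
}

console.log(isBlocked("198.51.100.37")); // true: inside the second placeholder range
console.log(isBlocked("203.0.113.9")); // false: not in either range
```

The sketch uses plain arithmetic instead of bitwise operators because JavaScript's bitwise operators work on signed 32-bit values, which gets awkward for addresses above 127.255.255.255.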

whoisgregg

1:35 am on Jan 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



block access to our sites from data centers

Wow, for some reason that never occurred to me. I wouldn't have described myself as having "low security" either. I guess this just shows there are always new tricks to learn. :)

I've already done a bit of searching but all the SERPs are polluted with information about search engine data centers. Could you point me to a list of data center IPs? Or, if that's not how you go about it, could you share your methodology for determining if an IP is in a data center?

Thanks, incrediBILL!

This 72 message thread spans 3 pages.