I read somewhere that HTTP_REFERER is not 100% reliable. I guess my solution isn't either, but it works very well for our purposes. Here's what I do...
1. Leave the action attribute blank in the HTML <form> tag. Then, when the form is submitted, use JavaScript to populate the action attribute. jQuery makes this easy.
<form id="nospam" action=""></form>
This goes a long way to thwart crawlers looking for form processing scripts. It won't stop a curious spammer willing to look at your source code. The jQuery to handle this might look something like this...
// Fill in the form's action only at the moment it is submitted
$(document).ready(function () {
    $('form#nospam').submit(function () {
        $(this).attr('action', '/path/to/form/processing/script');
        return true; // let the submission go ahead
    });
});
The drawback is that users who don't have JavaScript enabled can't submit the form. Also, this usually only helps when it is implemented before a form goes live. If spam bots already have the URL of your form processor, you'll need more protection. Read on...
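If you'd rather not pull in jQuery just for this, the same idea can be sketched in plain JavaScript. This is only a rough equivalent, not a drop-in replacement: `armForm` is a hypothetical helper name I made up, and the path is the same placeholder used above.

```javascript
// Hypothetical helper: sets the form's action only when it is submitted.
// `form` is a <form> element (or anything with the same two methods).
function armForm(form, actionUrl) {
  form.addEventListener('submit', function () {
    // The handler runs before the browser sends the request,
    // so the action is in place by the time the form is posted.
    form.setAttribute('action', actionUrl);
  });
}

// Wire it up in the browser once the DOM is ready.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    var form = document.getElementById('nospam');
    if (form) {
      armForm(form, '/path/to/form/processing/script');
    }
  });
}
```

Keeping the logic in a small function means you can reuse it for several forms by calling it once per form with a different URL.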
2. Use PHP to generate a hash on the page where the form is, include it as a hidden form element, then check it when the form is submitted. For example, if my form is on form.php and the processor is check.php, I would do this...
// Before the form is output on form.php
session_start();
$hash = md5(date(str_shuffle('aAbBCcDdEeFf...')));
$_SESSION['form_hash'][md5('/path/to/check.php')] = $hash;
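Replace '/path/to/check.php' with exactly what $_SERVER['REQUEST_URI'] will contain when the form is submitted to the processing script. That value includes any query string, so the two must match character for character or the lookup will fail. Now, insert this hash into the form as a hidden field.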
<input type="hidden" name="hash" value="<?php echo $hash; ?>" />
When the form is submitted, check the hash against what you generated...
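session_start();
$key = md5($_SERVER['REQUEST_URI']);
// Read the stored hash; null if this visitor never loaded the form
$hash = isset($_SESSION['form_hash'][$key]) ? $_SESSION['form_hash'][$key] : null;
// You MUST unset the hash so that they only get one try
unset($_SESSION['form_hash'][$key]);
// Require both values to exist; otherwise two missing values would compare as equal
if ($hash !== null && isset($_POST['hash']) && $hash === $_POST['hash']) {
    // Process the submission...
}
else {
    // Send them somewhere else
}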
This approach stops spam robots from posting directly to your form processor: they never loaded the form, so no hash is stored in their session. Whoever wants to submit your form must actually visit the page first. It does not stop someone who loads the form and manually submits spam, or a bot that requests the form page first and keeps its session cookie.
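To make the "one try" behaviour easier to see, here is the same issue-then-consume flow sketched in JavaScript (Node) rather than PHP. Treat it as a model of the logic only: `issueToken` and `consumeToken` are hypothetical names, the plain `session` object stands in for $_SESSION, and I'm assuming a token built from random bytes instead of the md5() call above.

```javascript
const crypto = require('crypto');

// Hypothetical helper: create a token for one form processor and remember it.
function issueToken(session, processorUri) {
  const token = crypto.randomBytes(16).toString('hex');
  if (!session.formHash) {
    session.formHash = new Map();
  }
  session.formHash.set(processorUri, token);
  return token; // this value goes into the hidden "hash" field
}

// Hypothetical helper: check a submitted token and use it up.
function consumeToken(session, requestUri, submitted) {
  const stored = session.formHash ? session.formHash.get(requestUri) : undefined;
  // Remove the token before comparing, so every token allows a single attempt.
  if (session.formHash) {
    session.formHash.delete(requestUri);
  }
  // A missing value on either side is always a failure.
  if (typeof stored !== 'string' || typeof submitted !== 'string') {
    return false;
  }
  const a = Buffer.from(stored);
  const b = Buffer.from(submitted);
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

Calling `consumeToken` twice with the same token returns true the first time and false the second, and a session that never called `issueToken` always gets false. That is the property that blocks direct posts to the processor.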
I have had several clients come in with automated spam problems, and I usually implement both. I change the URL of the form processor and use a blank action attribute, then add the URL-based hash to make sure that the user actually loads the form before submitting it.