| 7:59 am on Jan 5, 2009 (gmt 0)|
With a trivia captcha. Don't use an image, because they're not accessible and they're annoying for your users. You simply need to add a unique question to your forms, and check for the correct answer.
| 1:11 pm on Jan 5, 2009 (gmt 0)|
Thanks! That was the idea I liked best. I know how to add a trivia question to my form and I know to keep it simple so that every 'real' person can answer it; I just don't know how to make my form 'kick out' any email that doesn't answer the question correctly. It's easy for me to tell the difference between spam and real form submissions, but I'm extremely tired of having to open up 100 emails a day to find the one or two that are real. I don't understand how to do this on the server side of things I guess.
| 2:05 pm on Jan 5, 2009 (gmt 0)|
Or require registration by your users.
| 4:37 pm on Jan 5, 2009 (gmt 0)|
These can be gotten around quite easily. Server-side scripting is going to be needed if you want to implement a solution with a high chance of keeping spam out.
| 4:46 pm on Jan 5, 2009 (gmt 0)|
|These can be gotten around quite easily. |
I suppose anything can be circumvented easily depending on the implementation.
Requiring registration with a valid e-mail has helped us squash any spam-bots.... but we are now left with human-spammers... and that takes a good set of moderators to combat... ;-)
| 10:59 pm on Jan 5, 2009 (gmt 0)|
I manage a vBulletin board with a CAPTCHA. It was broken quite easily. That I stopped by adding, and regularly changing, the "custom registration field" ("what is one plus six?"). I left it at that because this is someone else's programming and I am not really interested in hacking it up, as it would all be overwritten with the next upgrade.
These are all temporary measures, ones that seem to make it stop for a time . . . but they keep coming back. They're relentless, because they're robot programs.
The only surefire way is to get your hands dirty: dig into and learn the programming, log all data, learn what they are doing. Filter out anything but expected data, and then on that expected data, hit them where their money comes from. Look at the nature of the spam: it's very likely mostly link drops or specific words (you know, little blue pills or teen girls . . . ). On your filtered data, filter yet again.
Eventually they'll stop because you'll be more trouble than they're worth.
form abuse thread [webmasterworld.com]
| 3:19 pm on Jan 6, 2009 (gmt 0)|
|That I stopped by adding, and regularly changing, the "custom registration field" |
I have asked my programmer about a solution for this issue. He wants to keep trying to implement a database of questions and have them randomly placed on the form each time the page loads. When someone gets the form they have to answer the question which will be validated against the database of answers. Personally I think this is too complicated.
Would it work to simply use the same question, e.g. 3+5=8, every time? It would seem that a spam bot would be foiled by this because it requires the bot to add the numbers together and put in the right answer. If it was the same question in a hidden field for all of my forms that should be easy to implement, but would it work?
| 5:45 pm on Jan 6, 2009 (gmt 0)|
Well, the bots will either learn, or manually be altered, to "get" the right answer. When the bots began using this board to test their CAPTCHA-breaker, I added the trivia Q and they were back at it in a week. So either the programming adjusts automatically, or the spammer figures out it's not working, investigates, and changes the values of their bot input for that field.
By changing it frequently they eventually give up because it's too much trouble. Like anything in life, be a pain in the arse and the problem will go away. :-)
So I think your programmer has a good idea, but it doesn't need to be complicated. Just centralize the page where you enter the question and correct answer, have all your form processors point to that resource, or do a random against a database like he suggested.
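A minimal sketch of that centralized question/answer resource, in Python. The question pool, the ids (q1, q2, q3) and the function names are all made up for illustration; the pool could just as well be a database table. The form would carry the question id in a hidden field alongside the visitor's answer, and every form processor would call the same check.

```python
import random

# Hypothetical question pool kept in ONE place so every form processor
# can share it. Each id maps to (question text, set of accepted answers).
QUESTIONS = {
    "q1": ("What is one plus six?", {"7", "seven"}),
    "q2": ("What colour is grass?", {"green"}),
    "q3": ("How many days are in a week?", {"7", "seven"}),
}


def pick_question(rng=random):
    """Return (question_id, question_text) to render in the form."""
    qid = rng.choice(sorted(QUESTIONS))
    return qid, QUESTIONS[qid][0]


def is_correct(qid, answer):
    """Check a submitted answer; unknown ids and missing answers fail."""
    entry = QUESTIONS.get(qid)
    if entry is None or answer is None:
        return False
    return answer.strip().lower() in entry[1]
```

The processing script would only send the email when is_correct(...) returns True. Changing the questions then means editing one dictionary rather than every form.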
All of this, again, is really a moot point if you get at the root of the problem: filter the data, remove the motivation. Random trivia fields, CAPTCHAs, random field names, empty hidden fields - these are all "road blocks" but they don't address the core issue: removing the elements that provide what the spammers hope to achieve.
If there's a chance they can get their links or pron phrases through, they will keep trying. Once you take away the candy, they won't want it any more.
| 6:53 pm on Jan 7, 2009 (gmt 0)|
|If there's a chance they can get their links or pron phrases through, they will keep trying. Once you take away the candy, they won't want it any more. |
However what about non-spammers that may want to send you a URL link? I have forms on my site that are targeted to businesses and I would legitimately ask for the company URL to their web site. How do I stop the spammer with a filter, but let the legit guy through?
| 8:46 pm on Jan 7, 2009 (gmt 0)|
I agree with the part about using a question and not some stupid image. Nothing people hate more than trying to figure out an image and having to keep trying because it's too damn scrambled for them to read.
| 10:02 pm on Jan 7, 2009 (gmt 0)|
|I agree with the part about using a question and not some stupid image. Nothing people hate more than trying to figure out an image and having to keep trying because it's too damn scrambled for them to read. |
Totally agree. I can't stand the warped letters that you can barely read. I end up leaving the site after I fail to enter the stupid sequence correctly 2-3 times in a row... it is just not worth it. And I don't go back to those sites...
| 4:21 pm on Jan 8, 2009 (gmt 0)|
|However what about non-spammers that may want to send you a URL link?....How do I stop the spammer with a filter, but let the legit guy through? |
A little more difficult, but not as bad. :-) Suffice it to say that depending on how you set up your forms, the nature of a legitimate link is very different from a bot link drop.
[edited by: phranque at 9:54 pm (utc) on Jan. 9, 2009]
[edit reason] edited at author's request [/edit]
| 5:25 pm on Jan 8, 2009 (gmt 0)|
I favour filtering. The only spam my contact forms attract includes link=" and/or href=". A simple regular expression in a configuration file and it's gone.
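Something along these lines, sketched in Python. The two patterns are the ones named in the post; matching case-insensitively and tolerating spaces around the equals sign are extra assumptions, and the function name is made up.

```python
import re

# Matches link=" or href=" as described above. Case-insensitivity and
# optional whitespace around "=" are assumptions added for robustness.
SPAM_PATTERN = re.compile(r'(?:link|href)\s*=\s*"', re.IGNORECASE)


def looks_like_spam(message):
    """Return True when the message contains link=" or href=" markup."""
    return bool(SPAM_PATTERN.search(message))
```

Note that a plain URL typed into a field (http://example.com) does not match, so a business giving its web address still gets through; only markup-style link drops are caught.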
| 6:27 pm on Jan 8, 2009 (gmt 0)|
If you elect captcha as your solution, there are several good hosted captcha services out there. Let them worry about maintaining a database of what works and what doesn't. A few of them are free.
| 6:28 pm on Jan 8, 2009 (gmt 0)|
Hidden fields work like a charm.
A hidden field is added to the contact form, and while processing the form the script needs to check to see if the hidden field is filled out during the submission process. If it is, then you have a bot that's filling out the form and you can junk that submission.
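As a rough Python sketch of the server-side half: the field name "website" is an assumption (any name a bot is likely to fill in will do), and form stands for whatever dictionary of submitted fields your framework hands you.

```python
def is_bot_submission(form, honeypot_field="website"):
    """Return True when the hidden honeypot field came back filled in.

    Real visitors never see the field, so their browsers submit it empty.
    `honeypot_field` is a hypothetical name chosen for this sketch.
    """
    value = form.get(honeypot_field, "")
    return bool(value.strip())
```

If this returns True the script can junk the submission instead of emailing it.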
| 6:57 pm on Jan 8, 2009 (gmt 0)|
Most bots don't load the form page each time, they just load it once, parse the field names and destination URL then access that directly. So another solution is to create a one-time use key when the user accesses the form page that you pass in a hidden form field to the processing script.
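A minimal sketch of such a one-time key in Python. Keeping issued keys in an in-memory dictionary and the 15 minute lifetime are both assumptions; a real site would more likely use the session or a database table. The names issue_token and redeem_token are made up.

```python
import secrets
import time

# token -> expiry timestamp. An in-memory store is an assumption made to
# keep the sketch self-contained; it does not survive a restart.
_issued = {}


def issue_token(ttl_seconds=900, now=None):
    """Create a key when the form page is served; put it in a hidden field."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(16)
    _issued[token] = now + ttl_seconds
    return token


def redeem_token(token, now=None):
    """Accept the key once, and only before it expires."""
    now = time.time() if now is None else now
    expiry = _issued.pop(token, None)  # pop() is what makes it single-use
    return expiry is not None and now <= expiry
```

A bot that parsed the form once and keeps posting directly to the processing script will be replaying a key that has already been used, so every submission after the first is rejected.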
| 7:24 pm on Jan 8, 2009 (gmt 0)|
All my forms use a technique similar to this:
email <input name="email">
<input type="hidden" name="human">
(this is an over simplified example)
I get the server side script to ignore the form if the hidden field is still blank.
This method works really well for me.
If you still get spam, then change "y" with different random numbers/characters and test that they get returned - This works 100% for me.
[edited by: Seb7 at 7:26 pm (utc) on Jan. 8, 2009]
| 8:54 pm on Jan 8, 2009 (gmt 0)|
oops, just noticed a slight error
email <input name="email">
<input type="hidden" name="human">
| 8:59 pm on Jan 8, 2009 (gmt 0)|
CAPTCHAs work a bit.
Banning certain countries where you only get spam from helps too, but I'm not advocating discrimination based on geolocation; there is way too much of that already. Still, if it's me against a sweatshop full of people spamming manually, I'm banning, as that's the only thing that'll stop them.
What also works is to detect HTML links and/or BBcode [URL] and drop the message in the bit bucket on the back-end (making the front-end seem to work works best, as they can't figure out exactly what you dislike). Similarly, popular misspellings of medication etc. can be added as blacklisted words that silently drop the message instead of processing it.
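Roughly like this, as a Python sketch. The blacklist entries are placeholders, the patterns only cover the simplest forms of an HTML anchor and a BBcode [url] tag, and handle_message is a made-up name; treat all of it as assumptions.

```python
import re

# Simplest forms of an HTML anchor and a BBcode [url] / [url=...] tag.
LINK_PATTERN = re.compile(r'<a\s[^>]*href|\[url[=\]]', re.IGNORECASE)

# Placeholder entries; in practice this would hold the misspellings
# you actually see in your own spam.
BLACKLIST = {"examplepill", "ex4mplepill"}

THANKS = "Thanks, your message has been sent."


def handle_message(text):
    """Return (deliver, response_shown_to_sender).

    The response is identical whether or not the message is dropped, so
    the sender cannot tell what triggered the filter.
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    blocked = bool(LINK_PATTERN.search(text)) or bool(words & BLACKLIST)
    return (not blocked, THANKS)
```

The caller only forwards the message when the first element is True, but always shows the same thank-you page.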
And oh, yes the higher the PR in the toolbar the more of that crap you get and the more persistent they are.
| 8:59 pm on Jan 8, 2009 (gmt 0)|
Just send an additional variable with your form. If its content doesn't match what you want there, you don't send the e-mail.
| 9:04 pm on Jan 8, 2009 (gmt 0)|
The following changes typically eliminate form spam.
2. Use JS to verify an actual human typed at the keyboard with keyboard events which create a checksum of typed data added to a hidden field.
1. Validate the data checksum sent via the hidden form field.
2. Reject all submissions using GET, accept only POST submissions
3. Accept only submissions where the referer (the HTTP Referer header) is your domain name, reject all others
4. Only accept submissions from valid browser user agents Opera, FF, MSIE, etc., rejecting things from curl, java, perl, etc.
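The three request-level checks in the list above (POST only, referer, user agent) might look something like this in Python. The host name is a placeholder, the blocked agent hints are just the three tools named in the list, and headers is assumed to be a plain dictionary with canonical header names. All of these values are supplied by the client, so this raises the bar rather than proving anything.

```python
from urllib.parse import urlparse

ALLOWED_HOST = "www.example.com"  # placeholder for your own domain
BLOCKED_AGENT_HINTS = ("curl", "java", "perl")  # tools named in the list


def accept_request(method, headers):
    """Apply the POST-only, referer and user-agent checks in order."""
    if method.upper() != "POST":
        return False
    # urlparse("") has hostname None, so a missing Referer is rejected too.
    referer_host = urlparse(headers.get("Referer", "")).hostname
    if referer_host != ALLOWED_HOST:
        return False
    agent = headers.get("User-Agent", "").lower()
    if not agent or any(hint in agent for hint in BLOCKED_AGENT_HINTS):
        return False
    return True
```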
[edited by: incrediBILL at 9:08 pm (utc) on Jan. 8, 2009]
| 9:18 pm on Jan 8, 2009 (gmt 0)|
I've always found a combination of things works best, however a well done CAPTCHA is still nearly foolproof all by itself.
Hidden fields are cheap and easy, so why not.
Blacklists are annoying to maintain but they are one of the only things that works against manual spam.
Note that mass-distributed CAPTCHA solutions (i.e. ones that come with popular software) are nearly useless; you need one that's custom.
|Don't use an image, because they're not accessible and they're annoying for your users. |
This isn't exactly true. Many people provide an audio alternative when using image CAPTCHAs.
|Totally agree. I can't stand the warped letters that you can barely read. |
Warped letters are not the only form of alphanumeric CAPTCHA. In fact considering their popularity they are probably the wrong choice. There are plenty of other ways to obfuscate without making the letters unreadable.
If you're going to use a trivia CAPTCHA, use a large number of questions. As mentioned, bot users most definitely do take the time to input custom answers for individual sites.
CAPTCHA hate is a testimony to its effectiveness. Nothing else works well enough to be as annoyingly ubiquitous :-)
[edited by: IanKelley at 9:20 pm (utc) on Jan. 8, 2009]
| 9:35 pm on Jan 8, 2009 (gmt 0)|
Most web sites already use JS for menus and everything else, so using it to monitor keyboard activity is par for the course. When done properly, it is randomized so that only the current session uses the current solution, and is therefore next to impossible to circumvent.
| 9:45 pm on Jan 8, 2009 (gmt 0)|
Some people still turn JS off, and some mobile users still don't have it... although yes it's getting less and less feasible to go without JS.
You're leaving out the second part... It's not as difficult as people would like to think to write a bot that can read JS. Somehow JS reading bots have yet to become common but they most certainly will at some point.
| 11:08 pm on Jan 8, 2009 (gmt 0)|
|Some people still turn JS off |
They can't use most sites' navigation, nor Flash; can't worry about those luddites.
|some mobile users still don't have it |
You can allow them based on the type of device, using the UA and HTTP headers to determine it's a mobile device.
The devil is in the details.
| 11:23 pm on Jan 8, 2009 (gmt 0)|
|You can allow them based on the type of device, using the UA and HTTP headers to determine it's a mobile device. |
If I was running spam bots, reading this would have been an epiphany.
| 11:24 pm on Jan 8, 2009 (gmt 0)|
Well, it seems that every time we arrive at this subject, once every 3 months or so (whether it makes it to the home page or not), we usually end up in disagreement on the basis of Simple Plain Innocent Users that run their browsers with JS disabled and cookies turned off.
Yes, they are important; yes, the customer is always right; but in reality, the number of guest book posts (I mean valid guest book posts, when you run a legit business and do everything you must to make the Shoe Thrower (recently the Plumber) happy) vs. scum: what is the number? Take a big GIG for example, WebmasterWorld, yes, HERE. No Cookie = No Milk. Sorry, mate, the COW passed away a week ago!
Then comes the tide that screams, "I have to accept the cookie to post my beloved opinion?" Hmmmm. Meanwhile, to watch FOX or SCIFI on the net, you are required to allow a Doubleclick.net cookie + ads (one of them, I don't remember which one to be exact), but you get my point, right?... Think about it…
We are all aware of the bots, and the bots are aware of the sites that they could target; you got a DOODOO = they know about it... Look at your logs, look at your visitors' behavior; if you have content posted to your guest book that is negative maybe you should shut down the site, and if it is positive learn from it.
Don’t have one, never did, but the fake ones to thwart the scraper bots and spammers, I am sure of the community in your web neighborhoods that has issues that surfaced so far….
If you come to my store, the only thing that has alcohol in it is eggnog; that is as much booze as you could get. Bots pounding a CAPTCHA image should be logged; more than 3 attempts to access the file per IP is not cool. If you have more than 50 IPs trying to pound your CAPTCHA within a short period of time, you have yourself a much bigger problem, such as a DDoS. Learn from your visitors, make adjustments for the site.
Not Logging is Futile…
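A small Python sketch of that kind of per-IP logging. The limit of 3 attempts comes from the post; the 10 minute window, the class name and its methods are assumptions. Timestamps are passed in explicitly so the log can be fed from real requests or from old log files.

```python
from collections import defaultdict, deque


class AttemptLog:
    """Track CAPTCHA requests per IP inside a sliding time window."""

    def __init__(self, max_attempts=3, window_seconds=600):
        self.max_attempts = max_attempts
        self.window = window_seconds  # window length is an assumption
        self._hits = defaultdict(deque)

    def record(self, ip, now):
        """Log one attempt; return True if this IP is now over the limit."""
        hits = self._hits[ip]
        hits.append(now)
        # Forget attempts that have fallen out of the window.
        while hits and now - hits[0] > self.window:
            hits.popleft()
        return len(hits) > self.max_attempts

    def active_ips(self, now):
        """Count IPs seen within the window, e.g. to compare against 50."""
        return sum(
            1 for hits in self._hits.values()
            if hits and now - hits[-1] <= self.window
        )
```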
| 11:36 pm on Jan 8, 2009 (gmt 0)|
|Take a big GIG for example, WebmasterWorld, yes, HERE. No Cookie = No Milk. Sorry, mate, the COW passed away a week ago! |
Funny, but also true and reasonable (along with the "Friends of Ned" related statement by incrediBILL).
Accommodating variety on the net (e.g., browsers, devices, display variants, etc.) is one thing, but there comes a time when certain behavior patterns are going to equal a far more limited Internet, as no one will (or functionally can) accommodate that behavior.
| 12:22 am on Jan 9, 2009 (gmt 0)|
|If I was running spam bots, reading this would have been an epiphany |
No, because I didn't tell you what I look for whatsoever.
Many mobile devices use a mobile service so you can easily exclude most mobile devices that come from generic services to avoid spoofing.
More details and the devil is in there...