Forum Moderators: phranque
These can be gotten around quite easily. Server-side scripting is going to be needed if you want to implement a solution with a high chance of keeping spam out.
These can be gotten around quite easily.
I suppose anything can be circumvented easily depending on the implementation.
Requiring registration with a valid e-mail has helped us squash any spam-bots.... but we are now left with human-spammers... and that takes a good set of moderators to combat... ;-)
These are all temporary measures, ones that seem to make it stop for a time . . . but they keep coming back. They're relentless, because they're robot programs.
The only surefire way is to get your hands dirty: dig into and learn the programming, log all data, and learn what they are doing. Filter out anything but expected data, and then on that expected data, hit them where their money comes from. Look at the nature of the spam: it's very likely mostly link drops or specific words (you know, little blue pills or teen girls . . . ). On your filtered data, filter yet again.
Eventually they'll stop because you'll be more trouble than they're worth.
form abuse thread [webmasterworld.com]
That I stopped by adding, and regularly changing, the "custom registration field"
I have asked my programmer about a solution for this issue. He wants to keep trying to implement a database of questions and have them randomly placed on the form each time the page loads. When someone gets the form they have to answer the question which will be validated against the database of answers. Personally I think this is too complicated.
Would it work to simply use the same question, e.g. 3+5=8, and use that every time? It would seem that a spam bot would be foiled by this, because it would have to add the numbers together and put in the right answer. If it was the same question in a hidden field for all of my forms, that should be easy to implement, but would it work?
By changing it frequently they eventually give up because it's too much trouble, like anything in life, be a pain in arse and the problem will go away. :-)
So I think your programmer has a good idea, but it doesn't need to be complicated. Just centralize the page where you enter the question and correct answer, have all your form processors point to that resource, or do a random pick against a database like he suggested.
All of this, again, is really a moot point if you get at the root of the problem: filter the data, remove the motivation. Random trivia fields, CAPTCHAs, random field names, empty hidden fields: these are all "road blocks", but they don't address the core issue, which is removing the elements that provide what the spammers hope to achieve.
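For what it's worth, here is a rough sketch of the "centralized question and answer" idea in server-side JavaScript (Node). The question list, the ids, and the function names `pickQuestion` and `checkAnswer` are all made up for illustration; a real setup would probably keep the questions in a database and remember the chosen id in the session rather than trusting the form.

```javascript
// Hypothetical question pool; in practice this could live in a database table.
const QUESTIONS = [
  { id: "q1", text: "What is 3 + 5?", answers: ["8", "eight"] },
  { id: "q2", text: "What colour is the sky on a clear day?", answers: ["blue"] },
  { id: "q3", text: "How many days are in a week?", answers: ["7", "seven"] },
];

// Pick a random question to show on the form.
// The random source is a parameter only so the choice can be tested.
function pickQuestion(random = Math.random) {
  return QUESTIONS[Math.floor(random() * QUESTIONS.length)];
}

// Validate the visitor's answer against the stored answers for that question id.
function checkAnswer(id, submitted) {
  const question = QUESTIONS.find((q) => q.id === id);
  if (!question || typeof submitted !== "string") return false;
  return question.answers.includes(submitted.trim().toLowerCase());
}
```

Every form processor can call the same `checkAnswer`, so changing the questions means editing one place only.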
If there's a chance they can get their links or pron phrases through, they will keep trying. Once you take away the candy, they won't want it any more.
However what about non-spammers that may want to send you a URL link? I have forms on my site that are targeted to businesses and I would legitimately ask for the company URL to their web site. How do I stop the spammer with a filter, but let the legit guy through?
I agree with the part about using a question and not some stupid image. Nothing people hate more than trying to figure out an image and having to keep trying because it's too damn scrambled for them to read.
Totally agree. I can't stand the warped letters that you can barely read. I end up leaving the site after I fail to enter the stupid sequence correctly 2-3 times in a row... it is just not worth it. And I don't go back to those sites...
However what about non-spammers that may want to send you a URL link?....How do I stop the spammer with a filter, but let the legit guy through?
A little more difficult, but not as bad. :-) Suffice it to say that, depending on how you set up your forms, the nature of a legitimate link is very different from a bot link drop.
[edited by: phranque at 9:54 pm (utc) on Jan. 9, 2009]
[edit reason] edited at author's request [/edit]
All my forms use a technique similar to this:
<form name="form">
email <input name="email">
<!-- stays blank unless the script below runs -->
<input type="hidden" name="human">
</form>
<script>
// set the hidden field's value; assigning to form.human directly would not change the input
document.forms["form"].elements["human"].value = "y";
</script>
(this is an oversimplified example)
I get the server-side script to ignore the form if the hidden field is still blank.
This method works really well for me.
If you still get spam, then replace "y" with different random numbers/characters and test that they get returned - This works 100% for me.
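A minimal sketch of the "random value that must come back" variant, written as server-side JavaScript (Node). This is only an illustration: the function names `issueToken` and `isHumanFieldValid` are invented, and the in-memory set is an assumption standing in for sessions or a database.

```javascript
const { randomBytes } = require("crypto");

// In-memory store of issued values (assumption: a real site would use sessions or a database).
const issuedTokens = new Set();

// Call when rendering the page; the page's script writes this value into the hidden "human" field.
function issueToken() {
  const token = randomBytes(8).toString("hex");
  issuedTokens.add(token);
  return token;
}

// Call in the form processor; accept only a value we issued, and only once.
function isHumanFieldValid(submittedValue) {
  if (typeof submittedValue !== "string" || submittedValue === "") return false;
  // Set.prototype.delete returns true only if the value was present.
  return issuedTokens.delete(submittedValue);
}
```

A blank field, a guessed value such as "y", or a replayed value all fail the check.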
[edited by: Seb7 at 7:26 pm (utc) on Jan. 8, 2009]
Banning certain countries where you only get spam from helps too, but I'm not advocating discrimination based on geolocation; there is way too much of that already. Still, if it's me against a sweatshop full of people spamming manually, I'm banning, as that's the only thing that'll stop them.
Javascript: since I myself surf without javascript turned on I'm not fond of that, and it too can be beaten by human spammers.
What also works is to detect HTML links and/or BBcode [URL] tags and drop them in the bit bucket on the back-end (making the front-end seem to work works best, as they can't figure out exactly what you dislike). Similarly, popular misspellings of medication etc. can be added as blacklisted words that silently drop the message instead of processing it.
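Something along these lines, as a rough server-side JavaScript sketch. The blacklist entries, the regular expression, and the names `shouldSilentlyDrop` and `handleSubmission` are assumptions for illustration; your own logs should tell you what to look for.

```javascript
// Hypothetical blacklist; the real terms would come from your own logs.
const BLACKLIST = ["viagra", "v1agra", "cialis"];

// Matches HTML anchors, BBcode [url] tags, and bare http(s) links.
const LINK_PATTERN = /<a\s[^>]*href|\[url[=\]]|https?:\/\//i;

// True when the message contains a link or a blacklisted word.
function shouldSilentlyDrop(message) {
  const text = String(message).toLowerCase();
  if (LINK_PATTERN.test(text)) return true;
  return BLACKLIST.some((word) => text.includes(word));
}

// The reply is the same whether or not the message was delivered,
// so the sender cannot tell what was disliked.
function handleSubmission(message, deliver) {
  if (!shouldSilentlyDrop(message)) deliver(message);
  return "Thank you, your message has been sent.";
}
```

Note that this drops every link, so a form that legitimately asks for a company URL would need that field checked separately.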
And oh, yes the higher the PR in the toolbar the more of that crap you get and the more persistent they are.
Client Side
1. Bury the entire form in obfuscated javascript [javascriptobfuscator.com] so the form-seeking bots can't find it in the first place.
2. Use JS to verify an actual human typed at the keyboard with keyboard events which create a checksum of typed data added to a hidden field.
Server Side
1. Validate the data checksum sent via the hidden form field.
2. Reject all submissions using GET; accept only POST submissions.
3. Accept only submissions where the referer is your domain name; reject all others.
4. Only accept submissions from valid browser user agents Opera, FF, MSIE, etc., rejecting things from curl, java, perl, etc.
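Server-side items 2 to 4 could be sketched roughly like this in JavaScript (Node). The domain `example.com`, the hint lists, the function name `passesRequestChecks`, and the shape of the `req` object (a `method` plus lower-cased `headers`) are assumptions for illustration only.

```javascript
// Assumptions: "example.com" stands in for your own domain, and `req` is a plain
// object shaped like { method, headers } with lower-cased header names.
const OWN_HOST = "example.com";
const BROWSER_HINTS = ["mozilla", "opera"]; // Firefox and MSIE both identify as Mozilla
const TOOL_HINTS = ["curl", "java", "perl"];

function passesRequestChecks(req) {
  // Item 2: POST only.
  if (req.method !== "POST") return false;

  // Item 3: the Referer must point at our own domain (or a subdomain of it).
  let refererHost;
  try {
    refererHost = new URL(req.headers["referer"] || "").hostname;
  } catch {
    return false; // missing or malformed Referer
  }
  if (refererHost !== OWN_HOST && !refererHost.endsWith("." + OWN_HOST)) return false;

  // Item 4: reject known tools, then require something that looks like a browser.
  const agent = (req.headers["user-agent"] || "").toLowerCase();
  if (TOOL_HINTS.some((hint) => agent.includes(hint))) return false;
  return BROWSER_HINTS.some((hint) => agent.includes(hint));
}
```

Both headers are supplied by the client, so this only weeds out bots that don't bother to set them.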
[edited by: incrediBILL at 9:08 pm (utc) on Jan. 8, 2009]
Javascript is good for some situations but on the other hand why use something that could discriminate against legit users and can be circumvented by bots.
Hidden fields are cheap and easy, so why not.
Blacklists are annoying to maintain but they are one of the only things that works against manual spam.
Note that mass-distributed CAPTCHA solutions (i.e. ones that come with popular software) are nearly useless; you need one that's custom.
Don't use an image, because they're not accessible and they're annoying for your users.
This isn't exactly true. Many people provide an audio alternative when using image CAPTCHAs.
Totally agree. I can't stand the warped letters that you can barely read.
Warped letters are not the only form of alphanumeric CAPTCHA. In fact considering their popularity they are probably the wrong choice. There are plenty of other ways to obfuscate without making the letters unreadable.
If you're going to use a trivia CAPTCHA, use a large number of questions, as mentioned, bot users most definitely do take the time to input custom answers for individual sites.
CAPTCHA hate is a testimony to its effectiveness. Nothing else works well enough to be as annoyingly ubiquitous :-)
[edit: typo]
[edited by: IanKelley at 9:20 pm (utc) on Jan. 8, 2009]
Javascript is good for some situations but on the other hand why use something that could discriminate against legit users and can be circumvented by bots.
How does Javascript discriminate against legit users any more than a captcha?
Most web sites already use JS for menus and everything else, so using it to monitor keyboard activity is par for the course. When done properly, it is randomized so that only the current session uses the current solution, which makes it next to impossible to circumvent.
You're leaving out the second part... It's not as difficult as people would like to think to write a bot that can read JS. Somehow JS reading bots have yet to become common but they most certainly will at some point.
Yes, they are important, and yes, the customer is always right, but in reality, what is the number of valid guest book posts (I mean valid posts, when you run a legit business and do everything you must to make the Shoe Thrower (recently the Plumber) happy) vs. scum? Take a big GIG for example, WebmasterWorld, yes, HERE. No Cookie = No Milk. Sorry, mate, the COW passed away a week ago!
Then comes the tide that screams, "I have to accept the cookie to post my beloved opinion?" Hmmmm. Meanwhile, to watch FOX or SCIFI on the net, you are required to allow a Doubleclick.net cookie plus ads (one of them, I don't remember which one exactly), but you get my point, right?... Think about it…
We are all aware of the bots, and the bots are aware of the sites that they could target: you got a DOODOO = they know about it... Look at your logs, look at your visitors' behavior. If the content posted to your Guest Book is negative, maybe you should shut down the site, and if it is positive, learn from it.
Forums:
Don’t have one, never did, but the fake ones to thwart the scraper bots and spammers, I am sure of the community in your web neighborhoods that has issues that surfaced so far….
If you come to my store, the only thing that has alcohol in it is Eggnog; that is as much booze as you could get. Bots pounding the Captcha image should be logged; more than 3 attempts to access the file per IP is not cool. If you have more than 50 IPs trying to pound your Captcha within a short period of time, you have yourself a much bigger problem, such as DDOS. Learn from your visitors, make adjustments for the site.
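The logging idea above could look something like this in server-side JavaScript (Node). The thresholds of 3 attempts and 50 IPs come from the post; the 10-minute window, the in-memory log, and the name `recordCaptchaHit` are assumptions for illustration.

```javascript
// Thresholds from the post above; the 10-minute window is an assumption.
const MAX_ATTEMPTS_PER_IP = 3;
const MAX_DISTINCT_IPS = 50;
const WINDOW_MS = 10 * 60 * 1000;

// In-memory log of { ip, time } entries (a real site would log to a file or database).
const attempts = [];

// Record one request for the Captcha image and report what the recent log shows.
function recordCaptchaHit(ip, now = Date.now()) {
  attempts.push({ ip, time: now });
  const recent = attempts.filter((a) => now - a.time <= WINDOW_MS);
  const fromThisIp = recent.filter((a) => a.ip === ip).length;
  const distinctIps = new Set(recent.map((a) => a.ip)).size;
  return {
    blockIp: fromThisIp > MAX_ATTEMPTS_PER_IP, // more than 3 attempts from this IP
    possibleFlood: distinctIps > MAX_DISTINCT_IPS, // more than 50 IPs in the window
  };
}
```

The log is never trimmed in this sketch, so a long-running process would want to prune old entries.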
Not Logging is Futile…
Blend27
Take a big GIG for example, WebmasterWorld, yes, HERE. No Cookie = No Milk. Sorry, mate, the COW passed away a week ago!
Funny, but also true and reasonable (along with the "Friends of Ned" related statement by incrediBILL).
Accommodating variety on the net (e.g., browsers, devices, display variants, etc.) is one thing, but there comes a time when certain behavior patterns are going to equal a far more limited Internet, as no one will (or can functionally) accommodate that behavior.
-Commerce
If I was running spam bots, reading this would have been an epiphany
No, because I didn't tell you what I look for whatsoever.
Many mobile devices use a mobile service so you can easily exclude most mobile devices that come from generic services to avoid spoofing.
More details and the devil is in there...