Forum Moderators: phranque
Created by Luis von Ahn at Carnegie Mellon University in Pittsburgh, the Recaptcha project scoops up words that optical character reading software has marked as unreadable by computers. In some documents, where ink has faded and paper has yellowed, the character reading software can flag up to 20% of words as indecipherable.
The hard-to-read words are then farmed out to the many thousands of sites that have signed up to be Recaptcha partners.
The worst captcha? Google. I can never get them right.
I'd like to see captcha systems that accomplish other simple turingesque tasks as well.
I think recaptcha solves the problem of digitizing books better than it solves the problem of dealing with spam.
Question:
where are the books that are being digitized? Who is getting all this free labour?
Answer:
[Recaptcha] has just helped to complete the conversion of the entire archive of the New York Times from 1908 into digital form.
That's what they've finished though. Who knows what they're doing now.
By the way, there are off-the-shelf plugins for several popular CMS. If you're using CMS or blog software, there's a good chance there's a recaptcha plugin. To find out if one exists for your platform, see
[recaptcha.net...]
I think it's because they just published the paper in Science (one of the two leading journals in the sciences, the other being Nature). Typically when something with popular appeal (spam, books) makes it into Science or Nature, there's a flurry of media attention on the topic.
The article was accepted by Science on Aug 5 and published on Aug. 14. The sudden new media attention dates from then.
I think it works great. No, it's not case sensitive, and if a word is illegible it can be omitted.
yeah, I got that so far ... but I wasn't able to make out either word, multiple times in a row. Don't know, maybe it was broken at that time, fired off an angry email to mozilla for using such a load of crap, but as you could expect, no reply.
The worst captcha? Google. I can never get them right.
Funny, that's one of the few I really find bearable. They're usually set up like words (i.e. not five vowels in a row) and readability is ok.
Then think again about whether you'd want this kind of lock being attached to millions of websites.
Although recaptcha can be a great inconvenience to the vision-impaired, they do receive a benefit from its widespread adoption. Vast amounts of historical material (e.g., old newspapers) are becoming available for the first time to the vision-impaired, as humans decipher the words that computers can't read.
Also, webwitch asked who is getting their stuff translated. Right now that's the Internet Archive (aka the Wayback Machine) and the New York Times.
But recaptcha says it's "accessible to blind users" so it must be so.
Based on my tests, though, those blind users must be amazing, because I closed my eyes and tried to decipher their "blind accessible" puzzle and I have no frickin idea what to do. You have to be kidding. It is so unfriendly to the visually impaired it's crazy.
I have a feeling though that if you play it backwards you get satanic messages;-)
- we had users complain that they didn't know what to do, even though the captcha had instructions right above it, saying "type the words below into this box"
- some other users complained that they were just confused by it, meaning they wondered if the words had something to do with their account
- others did not know whether they should type a space between the words (actually, it doesn't matter, you can omit the space and recaptcha still works)
- some people presented with recaptcha were overcome by "form fear" - the fear that what they've entered is not correct, and that they will be penalized in some horrid permanent way, like their account application will be refused and they'll have to wait 2 weeks before trying again.
As mentioned here and in other threads, there are better ways to deter bots than presenting the user with a Turing test. However the scenarios described above were only a problem for a very small percentage of only the most naive users. We continue to use recaptcha because most people know what it is, and it offers decent protection against automated account creation.
However the scenarios described above were only a problem for a very small percentage of only the most naive users.
OK, I got hung up on it and I'm far from naive and far from blind...
Can you say "JUST COULDN'T READ IT"?
After 5 or 6 tries eventually got through but I wouldn't have tried that hard for something less interesting. I'm thinking recaptcha will probably stop me in the future as very little is worth that level of frustration.
We continue to use recaptcha because most people know what it is, and it offers decent protection against automated account creation.
Most people know the short captcha. This is the first time I've ever seen the longer version, and it's WAY more annoying, not worth the struggle to get past it.
You can put up random pictures of dogs and cats with a drop down list "WHAT'S IN THE PICTURE" with the words "DOG", "CAT", "TREE", "HOUSE", etc. and it's still a captcha but completely user friendly, except to the vision impaired obviously, and we don't have to strain to read that garbage.
Actually Bill, why would you have a drop down list? Why not just use paired photos. In other words, two different photos of different dogs, two different photos of different cats.
In lieu of a submit button, you have four pictures. Click the right one and you're in. Of course, the bots have a 1/4 chance of getting it right too, so it may not help much. Still, combined with other measures it might get rid of most of the badness (e.g. if you log IPs on failed submits and then those IPs get put in an approval queue).
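For anyone who wants to play with the idea, here's a rough sketch of the four-picture approach with the failed-submit IP logging bolted on. Everything in it is made up for illustration: the label list, the `MAX_FAILURES` threshold, the in-memory stores and the function names are all assumptions, and a real site would serve actual image files and keep the counters in a database.

```python
import random
from collections import Counter

# Assumed picture labels; a real setup would map each label to image files.
LABELS = ["dog", "cat", "tree", "house"]
# Assumed threshold: wrong clicks allowed before an IP needs manual approval.
MAX_FAILURES = 3

failed_submits = Counter()  # IP address -> number of wrong clicks so far
approval_queue = set()      # IPs whose submissions now wait for a human


def new_challenge(rng: random.Random) -> dict:
    """Pick the picture the user must click and shuffle the display order."""
    options = LABELS[:]
    rng.shuffle(options)
    return {"target": rng.choice(options), "options": options}


def check_answer(challenge: dict, clicked: str, ip: str) -> bool:
    """Accept a correct click; log wrong clicks and queue repeat offenders."""
    if ip in approval_queue:
        return False  # already flagged, so nothing gets through automatically
    if clicked == challenge["target"]:
        return True
    failed_submits[ip] += 1
    if failed_submits[ip] >= MAX_FAILURES:
        approval_queue.add(ip)
    return False


def blind_guess_pass_rate(choices: int, attempts: int) -> float:
    """Chance that a bot clicking at random gets through at least once."""
    return 1 - (1 - 1 / choices) ** attempts
```

With four pictures a blind guesser gets through a quarter of the time on one try and roughly 58% of the time within three tries, so the failure logging is doing most of the work here, not the pictures.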
In any case, for actual spam control, I've had the best luck with proof-of-work systems [en.wikipedia.org]. Generally, JavaScript is required, and if the user has JS off you have to present a CAPTCHA, but for those with JS on there's no inconvenience. And it seems to stop the comment/reg spammers pretty well for now.
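To give an idea of what a proof-of-work check looks like, here's a minimal hashcash-style sketch. It is not the code I actually run: the function names, the SHA-256 choice, the `challenge:nonce` format and the difficulty value are all assumptions picked for illustration. In practice the `solve` loop would run as JavaScript in the visitor's browser and only `make_challenge` and `verify` would live on the server; both sides are shown in Python here just to keep it in one piece.

```python
import hashlib
import secrets


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # A non-zero byte contributes its own leading zeros, then we stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def make_challenge() -> str:
    """Server side: issue a random, single-use challenge string."""
    return secrets.token_hex(16)


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce that meets the difficulty."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce
```

The point of the design is the asymmetry: the client needs about 2**difficulty hashes on average to find a nonce, while the server needs exactly one to check it. That cost is invisible for one person posting one comment but adds up fast for a bot submitting thousands of forms.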
This ridiculous captcha was made even better because everything was horribly distorted like so many captchas are, which made identifying the minute differences between cat drawings impossible because you couldn't even see the cats sometimes.
I easily make errors 50% of the time because I can't actually make them both out, and it's a real aggravation.
I can see myself going to a lot of sites LESS if this takes off.
Sounds a lot like rapidshare or similar. Ran into that one too, but in my case it was still "little dogs and cats" sitting somewhere on the letters. No way to distinguish them...