
Forum Library, Charter, Moderators: phranque

Webmaster General Forum

This 35 message thread spans 2 pages (this is page 2).
vBulletin Issues Warning that reCAPTCHA Cracked
Brett_Tabke




msg:4251669
 5:47 am on Jan 11, 2011 (gmt 0)

It has become apparent from our customers and customers of other BB Systems that there is a targeted effort being made to spam forums world-wide. Unfortunately as part of that effort it appears that ReCaptcha may have been cracked as per this page:
[vbulletin.com...]



Despite denials from Google, a security researcher continues to assert that the Search King’s reCAPTCHA system for protecting Web sites from spammers can be successfully exploited by Internet junk mail panderers.

Researcher Jonathan Wilkins published a paper recently that included an analysis of reCAPTCHA’s security. In automated attacks he conducted against the system, he reported he had an alarming success rate of 17.5 percent.
[allspammedup.com...]


Unlike most CAPTCHA systems, Google’s uses images with two words. That’s because Google uses reCAPTCHA for two purposes. Like other CAPTCHA systems, it’s designed to frustrate spammers, but it’s also incorporated into Google’s efforts to digitize books. When a word in a book scan can’t be recognized by Google’s OCR software, it’s sent to the reCAPTCHA pool. So when a person enters a reCAPTCHA phrase into a form, Google can discover what its OCR program couldn’t, without having to hire human editors to review scanning results.

 

Brett_Tabke




msg:4252377
 2:42 pm on Jan 12, 2011 (gmt 0)

> how can the challenge be compared to the response?

Fuzzy logic:
- reCaptcha only uses real words found in a dictionary.
- Google compares the answer to a real word.
- They use two words, and most times only one must match.
- One of the words is known to have a very high percentage of correct human answers (easy to read).
- One of the words is not known. It is there to track answers.
- Any match can be off by 1 character.
- OCR errors are often exchanges of similar-looking characters (1 for l, o for 0, or I for l). You can compare them and assume that 1ike is actually like.
- Semantic comparison. Syntax analysis would be a good verification: plug the word back into the sentence it came from and see if it is accurate. [1]
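
To make the "off by 1 character" and character-exchange points concrete, here is a rough Python sketch. It is not how Google does it; the function names and the confusion table are my own assumptions, and the table only covers the three exchanges listed above.

```python
# Hypothetical confusion table: only the exchanges named in the list above.
CONFUSABLE = str.maketrans({"1": "l", "0": "o", "I": "l"})


def normalize(word: str) -> str:
    """Fold look-alike characters together, then lowercase."""
    return word.translate(CONFUSABLE).lower()


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one insert, delete or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        # Same length: allow one substituted character at position i.
        return a[i + 1:] == b[i + 1:]
    # b is one longer: allow one extra character in b at position i.
    return a[i:] == b[i + 1:]


def fuzzy_match(answer: str, expected: str) -> bool:
    """Accept an answer that is at most one edit away after normalizing."""
    return within_one_edit(normalize(answer), normalize(expected))
```

With this, fuzzy_match("1ike", "like") and fuzzy_match("lik", "like") both pass, while fuzzy_match("love", "like") does not.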

After the word is shown to XX number of people, a comparison of the answers is done. If 91% think the word is Y and the rest of the answers don't match or are ambiguous, then it is a safe bet that the word is Y and that it is a real word. If they get a bad word, they can send it to a human editor for editing.
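
The voting step could look something like the sketch below. This is only an illustration of the idea: the 91% threshold is the figure used above, while min_votes stands in for the unspecified "XX number of people" and is a made-up value.

```python
from collections import Counter


def consensus(answers, min_votes=10, threshold=0.91):
    """Return the agreed transcription, or None if there is no consensus.

    min_votes is a hypothetical stand-in for "XX number of people".
    """
    # Ignore empty answers and compare case-insensitively.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough people have seen the word yet
    word, count = Counter(cleaned).most_common(1)[0]
    if count / len(cleaned) >= threshold:
        return word
    return None  # no clear winner; a candidate for human review
```

For example, 19 answers of "morning" and one of "moming" gives 95% agreement, so consensus() returns "morning". An 8 to 2 split returns None.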

[1] Aside: think about all the text that the Google machine has seen through the book scanning system. Think about how that could be poured back into, oh say, a search engine. Semantic analysis, quality of verbiage, human vs. machine generated... whew. That will bake your noodle for a while.

Gibble




msg:4252387
 3:01 pm on Jan 12, 2011 (gmt 0)

Hey, if Google's OCR software can't read it, and we type it in during the verification process, how can the challenge be compared to the response?


That's why there are two words. One they know the answer to, and that is the one that is required to get through. The other they don't, and that is the one the OCR can't solve. Once enough people type the same thing in as an answer, the letters of that word become known.
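
In code, the two-word idea is roughly this. It is a simplified sketch, not the real reCAPTCHA logic: check_challenge and its parameters are hypothetical names, and only counting the unknown-word answer when the control word was right is my own assumption.

```python
def check_challenge(known_word, known_answer, unknown_answer, votes):
    """Pass or fail on the control word only; record the other answer.

    votes is a plain list collecting guesses for the word the OCR
    could not read (hypothetical storage, for illustration only).
    """
    passed = known_answer.strip().lower() == known_word.lower()
    if passed:
        # Assumption: only trust the guess if the control word was right.
        votes.append(unknown_answer.strip().lower())
    return passed
```

So a user who types the control word correctly gets through and adds one vote for the unread word. A wrong control word fails and adds nothing.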

civgroup




msg:4252554
 7:40 pm on Jan 12, 2011 (gmt 0)

I can confirm this as well. For almost two weeks now we have been inundated with Russian spam posts on our message board. Thankfully we have another security layer to combat it, so the spammers are unaware that their attempts are still futile, but the reCaptcha layer has been breached.

Tri0n




msg:4257541
 9:08 pm on Jan 24, 2011 (gmt 0)

Hands Down Best Way To Get Rid Of It All! Forget the small fish (not entirely yet) and go for the Big Fish. It's the big fish that hire these people to spam, so make them pay for it!

Fines should be handed out to the owner of the target address for every link. So they have to pay for every link, multiplied by the number of times it is listed in a mail, multiplied by the number of mails sent.

If the target owner is UNKNOWN, or hidden by a DOMAIN Host that is hiding their identity, then the fines should be imposed on the DOMAIN Host that is hosting and hiding its clients and so allowing them to do this.

In one day that can be millions of dollars in fines. I doubt any of these idiot companies will keep paying people to spam their junk, if they can even remain in business after being fined by the Federal Trade Commission or the relevant supervisory body in the country that hosts the website.

As of today, I've seen 2 SPAM hits get through my reCaptcha, and all the links in them were bogus, pointing to sites that don't exist. What that tells me is that this is a developer or guinea pig testing a new reCaptcha crack.


Simple Solution: Kill The Source!

Status_203




msg:4257758
 9:28 am on Jan 25, 2011 (gmt 0)

Hands Down Best Way To Get Rid Of It All! Forget the small fish (not entirely yet) and go for the Big Fish. It's the big fish that hire these people to spam, so make them pay for it!

Fines should be handed out to the owner of the target address for every link. So they have to pay for every link, multiplied by the number of times it is listed in a mail, multiplied by the number of mails sent.


Then big fish will pay for links to little fish and get little fish bankrupted by the fines.

(and then possibly take over the newly promoted domain as well)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved