Pure Java OCR

Looking for free/cheap pure-Java text extraction from images.

DamonHD

5:15 pm on Sep 2, 2005 (gmt 0)

Hi,

In my multimedia gallery, some of the image exhibits contain text that might be helpful to extract and index for searches (internal and external).

To that end I've been looking for a while for a pure-Java OCR package that is not too expensive or slow, but can do a half-reasonable job (it does not have to be anything like perfect!) of extracting mainly-English text in a mixture of layouts and orientations from photographs (e.g. of road signs, posters, etc).

Does any such thing exist? I cannot see it if so... There might be an academic/student/masters project that I've missed as I bet this is still something of a research topic!

PLEASE DO NOT drop URLs or product names into the thread or it will be nuked (SM me if you have such details!), but I am interested in general experiences if any.

Rgds

Damon

DamonHD

12:10 pm on Sep 5, 2005 (gmt 0)

Anyone?

Lord Majestic

12:23 pm on Sep 5, 2005 (gmt 0)

Never heard of one written in Java -- it's just not the kind of language this kind of complex software is written in. Why does it HAVE to be Java? Have you considered just interfacing with other standalone OCR software?

DamonHD

10:28 am on Sep 6, 2005 (gmt 0)

Hi,

Yes, it more-or-less has to be Java because the site is distributed in a WAR and runs on three different platforms (at once), i.e. Windows/Intel, Linux/Intel and Solaris/SPARC.

Now a Java interface to equivalent native libraries (e.g. DLLs, shared objects, etc) for the appropriate platforms would be OK, though less safe and more of a pain to manage.

At a pinch I *could* extract the information just once on the master (Solaris) server, but that would be less good for various reasons.

Performance and accuracy are less critical than safety (i.e. I can't have the OCR component bring the whole site down).

But in any case, thanks for the input so far!

Rgds

Damon
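
For anyone weighing up the "interface with standalone OCR software" route against the safety concern above, here is a minimal sketch of how that could look from Java. It is only an illustration under assumptions: the OCR command is hypothetical (no particular product is implied), it is assumed to take the image path as its last argument and to print the recognised text to standard output, and a reasonably modern JDK (11+) is assumed. The class and method names (OcrSandbox, extractText) are made up for the example. The point is that the OCR work runs in a separate OS process with a timeout, so a crash or hang in the tool should not take the web application down with it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs an external OCR tool in its own process. */
final class OcrSandbox {

    private OcrSandbox() { }

    /**
     * Runs the given (hypothetical) OCR command on one image.
     * Returns the tool's standard output, or Optional.empty() if the tool
     * is missing, exits non-zero, produces unreadable output, or runs
     * longer than the timeout.
     */
    static Optional<String> extractText(List<String> ocrCommand,
                                        Path image,
                                        long timeoutSeconds) {
        Path out = null;
        Process process = null;
        try {
            // Capture output in a temp file so a chatty tool cannot block
            // on a full pipe while we wait for it to finish.
            out = Files.createTempFile("ocr-", ".txt");

            // Assumption: the tool accepts the image path as its last argument.
            List<String> command = new ArrayList<>(ocrCommand);
            command.add(image.toString());

            ProcessBuilder pb = new ProcessBuilder(command);
            pb.redirectOutput(out.toFile());
            pb.redirectError(ProcessBuilder.Redirect.DISCARD);
            process = pb.start();

            // Kill the tool if it hangs rather than tying up a server thread.
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly();
                return Optional.empty();
            }
            if (process.exitValue() != 0) {
                return Optional.empty();
            }
            return Optional.of(
                    Files.readString(out, StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            // Tool not installed on this platform, or output not readable.
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            if (process != null) {
                process.destroyForcibly();
            }
            return Optional.empty();
        } finally {
            if (out != null) {
                try {
                    Files.deleteIfExists(out);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temp file.
                }
            }
        }
    }
}
```

A caller would pass whatever command line suits each platform, for example OcrSandbox.extractText(List.of("some-ocr-tool"), imagePath, 30), and simply skip indexing the text when the result is empty. The trade-off is the management pain already mentioned: a suitable executable still has to be installed on each of the three platforms.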
