
Content, Writing and Copyright Forum

    
Converting old magazines to digital format and protecting them
motorhaven




msg:4546603
 9:57 pm on Feb 18, 2013 (gmt 0)

I have a client who publishes a print magazine. They want to move everything to a digital format, while still producing the printed version. Subscribers to the magazine will have access to back issues online for free.

There are two parts to this project I have to tackle with them:

1. Converting old print copies to a digital format. Some of the back issues exist in print only. What's the best method for getting these into some sort of industry-standard format (PDF or whatever) while keeping the text as actual text instead of images (assuming some sort of built-in OCR)?

2. Protecting the content behind a web login is a no-brainer. However, we also need to protect the content files from being pirated, and this includes copy/pasting from the content viewed inside a browser as well.

How do I proceed? Anyone here have any experience with digital rights management who can give some insight?

FYI... I am not hired to do this project in the sense that I will not be scanning and converting all this content myself. I am a contractor who handles all their web server management and its content, and they want my recommendations and guidance as far as what direction to take, what software or services we should use, etc.

 

g1smd




msg:4546610
 10:40 pm on Feb 18, 2013 (gmt 0)

OCR still isn't 100% reliable. If you want to do a decent job, devote some man hours to proof reading it and fixing problems.

I've recently been reading some 1960s journals redone as PDF and it was painful to say the least.

motorhaven




msg:4546618
 11:03 pm on Feb 18, 2013 (gmt 0)

Anything OCR'd will be proof-read and corrected, just not by me.

Leosghost




msg:4546620
 11:31 pm on Feb 18, 2013 (gmt 0)

DRM that defeats print screen* / copy paste etc is expensive..basically involves either java applets ..or special browsers ( for PDFs )..or specialised ebook wrappers ..

*Very few actually defeat "print screen"..even though many claim to do so..and all are platform ( OS ) dependent ..ie. they are only available and effective on windows boxen..

motorhaven




msg:4546624
 11:47 pm on Feb 18, 2013 (gmt 0)

My client doesn't have a problem with forking out the bucks for an expensive application, especially since at one point they were looking at $3000 per issue (quote from one company).

They need something less expensive than that, so forking out thousands for a software solution isn't a problem, so long as it saves them money on the per issue basis.

The proof reading of OCR'd text, we already have that lined up fairly inexpensively. Big issue here is what platform this should be OCR'd into and how to deliver it.

Also, the client said up front that they realize nothing is going to stop copying 100%; they simply want to stop the common copying, to minimize its appearance elsewhere on the web and also casual sharing of it with others.

I should have noted this additional information in my first post. Apologies... :-)

Leosghost




msg:4546630
 12:20 am on Feb 19, 2013 (gmt 0)

Ok ..so you are looking at 3 which I know first hand that work ( there may be others / more, but these are the ones that I know defeat printscreen )..

Copysafe..
or locklizard..
or..ebookeditpro..( I run an old version of the ebookeditpro..works well for CHM type ebooks..stops print screen on all win versions..but I do not know if it still is available..their site has said "update in progress" forever.. ) ..I have heard that there is no "support" now..and that some buyers in the last two or three years have had problems either reaching "support" or receiving product paid for..YMMV

I also run some copysafe products..they work well ( they use their own browser for "secure PDFs"..the PDF reader browser is free and can be auto bundled with your secure PDFs made using the paid for product ) ..This would be my suggestion ..

Locklizard is waaay more money..I've used it ..works.. ..may well be overkill / overspend for what you need though..

I have no association with any of them..

HTH :)

lucy24




msg:4546639
 1:16 am on Feb 19, 2013 (gmt 0)

Big issue here is what platform this should be OCR'd into

Doesn't have to be platform-specific. Proofreading can even be done online. Plain text (.txt) is also platform-independent. Just make sure everyone is using the same file encoding or you will end up in a horrendous mess.
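A minimal sketch of how that encoding check could be automated before proofreaders start passing files around. It assumes the team has agreed on UTF-8 and that the transcriptions live as `.txt` files in one folder; the function names and the folder layout are hypothetical, not something any particular proofreading site provides.

```python
from pathlib import Path


def decodes_cleanly(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if the raw bytes are valid in the agreed encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


def find_mismatched_files(folder, encoding="utf-8", pattern="*.txt"):
    """List the files in `folder` that were NOT saved in the agreed encoding."""
    return sorted(
        str(path)
        for path in Path(folder).glob(pattern)
        if not decodes_cleanly(path.read_bytes(), encoding)
    )
```

This only works when the agreed encoding is a strict one such as UTF-8. A single-byte encoding like Latin-1 accepts any byte sequence, so it can never be used to prove a file is wrong.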

The proof reading of OCR'd text, we already have that lined up fairly inexpensively.

:: peering into crystal ball ::

Filipino readers who are accustomed to Roman script but don't know English, so they will be matching character-for-character without the brain silently correcting typos and/or scannos.

g1smd




msg:4546640
 1:28 am on Feb 19, 2013 (gmt 0)

One journal that I read from time to time always has to issue a couple of corrections and notes in the following paper issue.

In the electronic version, someone takes the time either to apply those corrections or at the very least insert the note about what is wrong, right next to where the error is.

motorhaven




msg:4546776
 2:58 pm on Feb 19, 2013 (gmt 0)

Doesn't have to be platform-specific. Proofreading can even be done online. Plain text (.txt) is also platform-independent. Just make sure everyone is using the same file encoding or you will end up in a horrendous mess.


Since it's a magazine, we need a package which can scan the documents into an editable form while leaving the images intact and in the same page location as the original. Plain text files won't do.


:: peering into crystal ball ::

Filipino readers who are accustomed to Roman script but don't know English, so they will be matching character-for-character without the brain silently correcting typos and/or scannos.


That's a poor assumption. The proof-reading is being handled by some local USA based college students.

lucy24




msg:4546928
 10:51 pm on Feb 19, 2013 (gmt 0)

Since it's a magazine, we need a package which can scan the documents into an editable form while leaving the images intact and in the same page location as the original.

And...? Look at some of the existing web-based proofreading sites like runeberg.

The proof-reading is being handled by some local USA based college students.

You'd have done better with non-English speakers. (Appearances to the contrary, US college students do not qualify.) Native speakers are the most likely not to notice simple errors like line-break duplication, or standard scannos like u:n or o:e:c.
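For what it's worth, part of this can be pre-screened mechanically before a human reads anything. The sketch below is an illustration only: it flags words that are not in a word list but become a known word when one of the confusable letters mentioned above (u:n, o:e:c) is swapped, and it flags a word repeated back to back, as happens with line-break duplication. The `known_words` set is an assumption; in practice it would come from whatever dictionary file is to hand.

```python
import re

# Letter confusions named above: u<->n, and o, e, c for one another.
CONFUSABLE = {"u": "n", "n": "u", "o": "ec", "e": "oc", "c": "oe"}


def scanno_candidates(word, known_words):
    """Known words reachable from `word` by swapping one confusable letter."""
    found = set()
    for i, ch in enumerate(word):
        for alt in CONFUSABLE.get(ch, ""):
            candidate = word[:i] + alt + word[i + 1:]
            if candidate in known_words:
                found.add(candidate)
    return sorted(found)


def flag_scannos(text, known_words):
    """Return (word, suggestions) pairs for unknown words one swap from a known word."""
    flagged = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word not in known_words:
            suggestions = scanno_candidates(word, known_words)
            if suggestions:
                flagged.append((word, suggestions))
    return flagged


def duplicated_words(text):
    """Find a word repeated back to back, including across a line break."""
    pattern = re.compile(r"\b(\w+)\s+\1\b", flags=re.IGNORECASE)
    return [match.group(1).lower() for match in pattern.finditer(text)]
```

A screen like this cannot catch a scanno that happens to produce another real word, and it will flag legitimate repeats such as "that that", so it narrows the proofreader's job rather than replacing it.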

wilderness




msg:4587105
 12:26 pm on Jun 24, 2013 (gmt 0)

mh,
Four months have passed since this thread was active.
Do you have a solution in place?

motorhaven




msg:4587127
 1:34 pm on Jun 24, 2013 (gmt 0)

They decided to use the issuu web site for their existing digital files for the magazine. For the previous non-digital versions, rather than OCR, they plan to simply scan them in and present them as images behind a user login.

It's certainly not optimal imho, because I believe rights management isn't there, just a simple login. If the user is sophisticated, there are tools to download the SWF to their local computer, and then they can republish it anywhere. But they said they haven't seen any republishing of their stuff in the several years they've used it for other items. One of the things on my plate is to set up an account for them with a duplicate content service to try to keep an eye on things after the transition. Not the best, but it's what will fit in the budget.

What basically happened is the more information I gave them the more their faces fell as they realized there is no cheap way to get so much content online (nearly 20 years worth) in a very secure environment.

wilderness




msg:4587276
 9:46 pm on Jun 24, 2013 (gmt 0)

there is no cheap way to get so much content online (nearly 20 years worth) in a very secure environment.


I've been digitizing previously published widget magazines for nearly fifteen years.

I do have some key articles online.

Unfortunately with 24,000+ articles (OCR'd) and 30,000+ images I've not found a secure method either.

Expiring PDFs of combined images and OCR (despite their large size and the time and expense of creating them) seem to be the only alternative for restrictions.
