Converting old magazines to digital format and protecting them

     
9:57 pm on Feb 18, 2013 (gmt 0)

10+ Year Member



I have a client who publishes a print magazine. They want to move everything to a digital format, while still producing the printed version. Subscribers to the magazine will have access to back issues online for free.

There are two parts to this project I have to tackle with them:

1. Converting old print copies to a digital format. Some of the back issues exist in print only. What's the best method for getting these into some sort of industry standard format (PDF or whatever) while keeping the text as actual text instead of images (assuming some sort of built-in OCR)?

2. Protecting the content behind a web login is a no brainer. However, we also need to protect the content files from being pirated, and this includes copy/pasting from the content viewed inside a browser as well.

How do I proceed? Anyone here have any experience with digital rights management who can give some insight?

FYI... I am not hired to do this project in the sense that I will not be scanning and converting all this content myself. I am a contractor who handles all their web server management and its content, and they want my recommendations and guidance so far as what direction to take, what software or services we should use, etc.
10:40 pm on Feb 18, 2013 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



OCR still isn't 100% reliable. If you want to do a decent job, devote some man hours to proof reading it and fixing problems.

I've recently been reading some 1960s journals redone as PDF and it was painful to say the least.
11:03 pm on Feb 18, 2013 (gmt 0)

10+ Year Member



Anything OCR'd will be proof-read and corrected, just not by me.
11:31 pm on Feb 18, 2013 (gmt 0)

WebmasterWorld Senior Member leosghost is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



DRM that defeats print screen* / copy paste etc is expensive..basically involves either java applets ..or special browsers ( for PDFs )..or specialised ebook wrappers ..

*Very few actually defeat "print screen"..even though many claim to do so..and all are platform ( OS ) dependent ..ie. they are only available and effective on windows boxen..
11:47 pm on Feb 18, 2013 (gmt 0)

10+ Year Member



My client doesn't have a problem with forking out the bucks for an expensive application, especially since at one point they were looking at $3000 per issue (quote from one company).

They need something less expensive than that, so forking out thousands for a software solution isn't a problem, so long as it saves them money on the per issue basis.

The proof reading of OCR'd text, we already have that lined up fairly inexpensively. Big issue here is what platform this should be OCR'd into and how to deliver it.

Also, client said up front they realize nothing is 100% going to stop copying, they simply want to stop the common copying to minimize its appearance elsewhere on the web and also casual sharing with others.

I should have noted this additional information in my first post. Apologies... :-)
12:20 am on Feb 19, 2013 (gmt 0)

WebmasterWorld Senior Member leosghost is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Ok ..so you are looking at 3 which I know first hand that work ( there may be others / more, but these are the ones that I know defeat printscreen )..

Copysafe..
or locklizard..
or..ebookeditpro..( I run an old version of the ebookeditpro..works well for CHM type ebooks..stops print screen on all win versions..but I do not know if it still is available..their site has said "update in progress" forever.. ) ..I have heard that there is no "support" now..and that some buyers in the last two or three years have had problems either reaching "support" or receiving product paid for..YMMV

I also run some copysafe products..they work well ( they use their own browser for "secure PDFs"..the PDF reader browser is free and can be auto bundled with your secure PDFs made using the paid for product ) ..This would be my suggestion ..

Locklizard is waaay more money..I've used it ..works.. ..may well be overkill / overspend for what you need though..

I have no association with any of them..

HTH :)
1:16 am on Feb 19, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Big issue here is what platform this should be OCR'd into

Doesn't have to be platform-specific. Proofreading can even be done online. Plain text (.txt) is also platform-independent. Just make sure everyone is using the same file encoding or you will end up in a horrendous mess.
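As a rough illustration of the encoding point: a minimal Python sketch, assuming the team has agreed on UTF-8 as the shared encoding (that choice, and the function name `find_non_utf8`, are assumptions for this example, not something stated in the thread). It reports any proofread text file that does not decode cleanly, so a stray file saved in another encoding gets caught before it is merged.

```python
from pathlib import Path


def find_non_utf8(paths):
    """Return (path, byte offset) for each file that is not valid UTF-8.

    UTF-8 is an assumed house encoding for this sketch; swap in whatever
    encoding the proofreading team actually agreed on.
    """
    problems = []
    for path in paths:
        data = Path(path).read_bytes()
        try:
            data.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the offset of the first byte that failed to decode
            problems.append((str(path), err.start))
    return problems
```

Running this over every returned file before accepting it is cheap, and the byte offset makes it easy to find the offending character.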

The proof reading of OCR'd text, we already have that lined up fairly inexpensively.

:: peering into crystal ball ::

Filipino readers who are accustomed to Roman script but don't know English, so they will be matching character-for-character without the brain silently correcting typos and/or scannos.
1:28 am on Feb 19, 2013 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



One journal that I read from time to time always has to issue a couple of corrections and notes in the following paper issue.

In the electronic version, someone takes the time either to apply those corrections or at the very least insert the note about what is wrong, right next to where the error is.
2:58 pm on Feb 19, 2013 (gmt 0)

10+ Year Member



Doesn't have to be platform-specific. Proofreading can even be done online. Plain text (.txt) is also platform-independent. Just make sure everyone is using the same file encoding or you will end up in a horrendous mess.


Since it's a magazine we need a package which can scan the documents into an editable form while leaving the images intact and in the same page location as the original. Plain text files won't do.


:: peering into crystal ball ::

Filipino readers who are accustomed to Roman script but don't know English, so they will be matching character-for-character without the brain silently correcting typos and/or scannos.


That's a poor assumption. The proof-reading is being handled by some local USA based college students.
10:51 pm on Feb 19, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Since it's a magazine we need a package which can scan the documents into an editable form while leaving the images intact and in the same page location as the original.

And...? Look at some of the existing web-based proofreading sites like runeberg.

The proof-reading is being handled by some local USA based college students.

You'd have done better with non-English speakers. (Appearances to the contrary, US college students do not qualify.) Native speakers are the most likely not to notice simple errors like line-break duplication, or standard scannos like u:n or o:e:c.
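The two error types mentioned above are mechanical enough that a script can pre-flag them for the human proofreaders. What follows is only a sketch under stated assumptions: the word list is whatever dictionary you supply, the confusion table covers just the u:n and o:e:c swaps named in this post, and all function names are made up for the example.

```python
import re

# Letter confusions named in the post: u/n and o/e/c.
CONFUSIONS = {"u": "n", "n": "u", "o": "ec", "e": "oc", "c": "oe"}


def scanno_candidates(word, known_words):
    """Return known words reachable by swapping one confusable letter."""
    found = set()
    for i, ch in enumerate(word):
        for alt in CONFUSIONS.get(ch, ""):
            candidate = word[:i] + alt + word[i + 1:]
            if candidate in known_words:
                found.add(candidate)
    return sorted(found)


def flag_scannos(text, known_words):
    """Map each unknown word to the known words it may be a scanno of."""
    flagged = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        if word not in known_words:
            candidates = scanno_candidates(word, known_words)
            if candidates:
                flagged[word] = candidates
    return flagged


def find_line_break_duplicates(text):
    """Return (line number, word) where a line ends with the word
    that the next line starts with."""
    lines = text.splitlines()
    hits = []
    for n in range(len(lines) - 1):
        end = lines[n].split()
        start = lines[n + 1].split()
        if end and start and end[-1].lower() == start[0].lower():
            hits.append((n + 1, end[-1]))
    return hits
```

Note the limitation: this only catches scannos that produce a non-word. A scanno that happens to form another real word still needs a human eye.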
12:26 pm on Jun 24, 2013 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



mh,
Four months have passed since this thread was active.
Do you have a solution in place?
1:34 pm on Jun 24, 2013 (gmt 0)

10+ Year Member



They decided to use the issuu web site for their existing digital files for the magazine. For the previous non-digital versions rather than OCR they plan to simply scan them in and present them as images behind a user login.

It's certainly not optimal imho because I believe rights management isn't there - just a simple login. If the user is sophisticated, there are tools to download the SWF to their local computer, and then they can republish it anywhere. But they said they haven't seen any republishing of their stuff in the several years they've used it for other items. One of the things on my plate is to set up an account for them with a duplicate content service to try to keep an eye on things after the transition. Not the best, but it's what will fit in the budget.

What basically happened is the more information I gave them the more their faces fell as they realized there is no cheap way to get so much content online (nearly 20 years worth) in a very secure environment.
9:46 pm on Jun 24, 2013 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



there is no cheap way to get so much content online (nearly 20 years worth) in a very secure environment.


I've been digitizing previously published widget magazines for nearly fifteen years.

I do have some key articles online.

Unfortunately with 24,000+ articles (OCR'd) and 30,000+ images I've not found a secure method either.

Expiring PDFs (despite their large size and the time and expense of creating them) that combine images and OCR text seem to be the only alternative for restrictions.
 
