There is a youth activity web site that posts the results of competitions as PDF files. The PDFs include tables that show all the scores from each judge for each competing unit at each event.
I am using a series of processes and custom-written programs to capture those individual scores and add them to a database so that we can do ad hoc queries and various reports based on the data. The organization that owns the web site has elected not to collect the score data in any database. The web site does have a standard copyright notice.
Based on my understanding of the Fair Use rules for copyrights, I believe that what I am doing is not a copyright violation, but that whole area is so confused that I thought I would seek opinions from others more knowledgeable about the topic.
Here are the steps I follow to extract the scores and add them to my database. (Note: this is all not-for-profit. The resulting database is used by others in the activity, without charge to anyone.)
- Open the PDF file using Adobe Acrobat and highlight the scores table.
- Copy/Paste that into a text editor, replace the spaces between the scores with commas, and save as a CSV file
- Fix the data as needed (if the unit name included a space that was replaced by a comma, put back the space, etc.)
- Use DTS in SQL Server 2000 to import the data in the CSV file into a temporary table
- Run a custom program to "clean-up" the data (make judge names and unit names consistent with the database)
- Run another custom program to transform the data format to match what I need in the database
- Add the data to the database table.
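For anyone curious, the comma-replacement and hand-fixing steps above could probably be folded into one small script. What follows is only a rough sketch, not the programs I actually run: it assumes each pasted row is a unit name followed by a fixed number of numeric scores, and the function names (`parse_score_line`, `rows_to_csv`) and the sample unit names are made up for illustration.

```python
import csv
import io


def parse_score_line(line, num_scores):
    """Split one pasted table row into (unit name, list of scores).

    Assumption: the last `num_scores` whitespace-separated tokens are the
    scores, and everything before them is the unit name, which may itself
    contain spaces.
    """
    tokens = line.split()
    if len(tokens) <= num_scores:
        raise ValueError(f"expected a name plus {num_scores} scores: {line!r}")
    name = " ".join(tokens[:-num_scores])
    scores = [float(t) for t in tokens[-num_scores:]]
    return name, scores


def rows_to_csv(lines, num_scores):
    """Convert pasted rows to CSV text, skipping blank lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in lines:
        if not line.strip():
            continue
        name, scores = parse_score_line(line, num_scores)
        writer.writerow([name, *scores])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample rows: a unit name and three judges' scores.
    pasted = [
        "North Side Band 18.5 19.0 17.75",
        "Eagles 16.0 15.5 17.0",
    ]
    print(rows_to_csv(pasted, num_scores=3), end="")
```

Splitting from the right means a space inside a unit name never gets turned into a comma in the first place, so there is nothing to put back by hand. It would still break if a table had a varying number of score columns per row.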
Comments and/or opinions are appreciated.
It's like the difference between presenting a recognizable duplicate of a page from "Guinness Book of World Records", on the one hand, and, on the other, saying something like "I could not believe how far this dude could blow a piece of spaghetti out of his nose! Nineteen centimeters! But ya gotta wonder: how on earth did he get started? Just exactly what does this dude do in his basement when nobody is looking?!?"
The former would be a violation; the latter would be a lawful use of raw data.
Eliz.
If you use another study's results/stats, but do it as a part of your own writing rather than a quote (A recent study found that 87% of people who listen to rock music also eat chocolate.), cite your source. Not only for their sake, but for yours; especially when you're getting into things such as statistics, you don't want a reader to think, "Yeah, sez who?" You want to tell them who sez, so they can either accept it as credible or go look up the source themselves.
If you quote the other study, make it a direct quotation: (Anderson and Jones reported, "Eighty-seven (87%) of the 100 research subjects who had reported listening to rock music also admitted on the private questionnaire to eating chocolate, as opposed to only 25 (50%) of the 50-person non-rock-music-listening control group (p<0.01).") You also want to tell us who Anderson and Jones are so we know if we should listen to them or not - that is, cite your source. [Note to any statisticians reading this: I did not actually work out that p value, so it might be wrong. ;-) ]
So far, so good. But when you want to present the results in the same way the original author(s) did - which usually means a table, graph, or other illustration - every journal I know of requires that you send them proof that you've received permission from the copyright holder (usually the journal that first published the study) to re-use it. For most journals, this is even required when you want to re-use something in a previous article that you wrote, since most agreements require the author to sign copyright over to the publisher. A few journals have gotten smart enough to save themselves the tons of paperwork needed to give permission to authors to re-use their own stuff, and are granting that right back to the authors in the copyright agreement (and, yes, I've had to send that agreement to other journals to prove that my boss has permission to re-use his own graph).
So, yeah, what I said... ;-)
Facts: no copyright
How facts are presented: copyright