

Converting PDF into HTML

   
12:13 pm on Oct 25, 2010 (gmt 0)

5+ Year Member



Hi,

What is the best way to get table-based data from a PDF converted into an HTML table? I used Acrobat to export as HTML, but this did not work well (too many unnecessary SPANs, and the data is mixed up).

I saved the data as .csv and was hoping to use that in combination with regex to wrap table cell tags around the data. Maybe I gave up too soon.
10:30 am on Nov 17, 2010 (gmt 0)

5+ Year Member



Hi SilverLining, I was tackling this same problem yesterday (specifically tables too) and gave up (I don't have Acrobat though).

If you managed to get a CSV of the table data formatted correctly, it should indeed be possible to process this into an HTML table.

The easiest way would be to copy the data from the CSV and paste it into Word, then view the .doc with the Google Docs "view as HTML" option and copy the source code.
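If you would rather script the CSV route than go through Word, here is a minimal sketch in Python. It assumes the CSV is already clean (one table row per line) and that the first row holds the column headings. The function name csv_to_html_table and its header parameter are made up for illustration. It uses a CSV parser rather than regex, since regex tends to trip over quoted fields that contain commas.

```python
import csv
import html
import io


def csv_to_html_table(csv_text, header=True):
    """Convert CSV text into an HTML table string.

    Hypothetical helper for illustration: assumes one table row per CSV
    line and, when header is True, that the first row is the heading row.
    """
    # Parse with the csv module so quoted fields containing commas survive;
    # blank lines are skipped.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]

    lines = ["<table>"]
    for i, row in enumerate(rows):
        tag = "th" if header and i == 0 else "td"
        # Escape &, < and > so the cell text cannot break the markup.
        cells = "".join(f"<{tag}>{html.escape(cell)}</{tag}>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = 'Product,Price\n"Widget, large",10\nGadget,5'
    print(csv_to_html_table(sample))
```

To run it on a real file, read the file into a string first and pass that in. If your data has no heading row, call it with header=False so every row uses td cells.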
 
