Msg#: 3487077 posted 9:51 am on Oct 25, 2007 (gmt 0)
Hi all - I am new to posting on Webmaster World although I have been browsing the forums for years!
I am hoping someone can help me.
This is the scenario: I use a form with multipart/form-data to upload text documents (probably Word, but I do not want to restrict it to just .doc) and add the data to the database. The Word file is saved to the database in a field of type BLOB, but when I view the content it has a load of extra characters in the header and footer and does not keep its formatting.
I am presuming this has something to do with charsets, but I am not sure where to start if so, or whether there is anything I can do on upload to strip the extraneous characters out.
I have read many things about converting Word to HTML, and pretty much all of it involves going into Word and saving as HTML, or importing a Word doc into Dreamweaver. These are not options due to the nature of the website. The file/character manipulation MUST be done on the fly.
Msg#: 3487077 posted 5:42 pm on Oct 25, 2007 (gmt 0)
there is no plugin available for viewing word files within a browser. therefore if you are sending that content type, it must necessarily be downloaded for viewing purposes in ms word. you can try to see if there is a way to convert the word document to some usable text on upload to the server or while serving the document to the browser. your solution will depend on your server environment.
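if you go the download route, the headers you send with the blob are what tell the browser to hand it to ms word. a minimal sketch in python, purely as an illustration since i don't know your server environment: the function name download_headers and its parameters are made up for this example, and application/msword is the usual content type for old-style .doc files.

```python
def download_headers(filename, blob, content_type="application/octet-stream"):
    """Build HTTP headers that ask the browser to download the blob as a file.

    Hypothetical helper for illustration; adapt it to whatever your
    server-side language or framework uses to send response headers.
    """
    # keep only plain ASCII characters that are safe inside the quoted
    # filename parameter; fall back to a generic name if nothing is left
    safe_name = "".join(
        c for c in filename
        if c.isascii() and (c.isalnum() or c in "._- ")
    ) or "document"
    return [
        ("Content-Type", content_type),
        ("Content-Disposition", 'attachment; filename="%s"' % safe_name),
        ("Content-Length", str(len(blob))),
    ]
```

send those headers first and then write the blob bytes out untouched. the file has to go out byte for byte as it was uploaded, otherwise word will not open it.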
Msg#: 3487077 posted 12:48 pm on Oct 26, 2007 (gmt 0)
when you send a word document through the web, it isn't a file any more - it's just a stream of data. it doesn't matter if the source of that data stream is a file on your server or a record in your db - it's the same data to the browser. you need to figure out how to extract useful text from the word doc and store that instead. as i mentioned before, your solution will depend on your server environment - and i have no clue there...
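to give an idea of what "extract useful text" could look like, here is a very crude sketch in python. it is not a real word parser: it assumes an old binary .doc whose text is plain latin characters stored either one byte per character or as utf-16 (two bytes per character), and it simply pulls out runs of printable characters while skipping the binary junk around them. the name extract_text_runs and the min_run cutoff are made up for this example. all formatting is lost, and a proper converter for your platform will do a much better job.

```python
import re


def extract_text_runs(blob, min_run=4):
    """Return runs of printable text found in a binary blob, in file order.

    Crude heuristic, assuming the text is stored either as UTF-16LE
    (printable byte followed by a zero byte) or as plain 8-bit ASCII.
    """
    pattern = re.compile(
        # group 1: UTF-16LE run, tried first so two-byte text is not
        # mistaken for a series of one-character ASCII fragments
        rb"((?:[\x20-\x7e]\x00){%d,})"
        # group 2: plain single-byte ASCII run
        rb"|([\x20-\x7e]{%d,})" % (min_run, min_run)
    )
    runs = []
    for match in pattern.finditer(blob):
        if match.group(1) is not None:
            runs.append(match.group(1).decode("utf-16-le"))
        else:
            runs.append(match.group(2).decode("ascii"))
    return runs
```

you would run something like this on the uploaded bytes and store the joined result in a text column, keeping the original blob as well if people still need to download the real document.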