I am hoping someone can help me.
This is the scenario:
I use a form with enctype multipart/form-data to upload text documents (probably Word, but I do not want to restrict it to just .doc) and add the data to the database.
The Word file is saved to the database in a BLOB field, but when I view the content it has a load of extra characters at the start and end and does not keep its formatting.
I presume this has something to do with charsets, though I am not sure where to start if so, and I do not know whether there is anything I can do at upload time to strip the extraneous characters out.
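One way to check the charset theory is to look at the first few bytes of the uploaded data before it goes into the BLOB. A legacy .doc is a binary OLE2 container and a .docx is a ZIP archive, so if either signature shows up, the "extra characters" are most likely the file format itself rather than an encoding problem. This is only a rough sketch in Python; the function name `sniff_upload` and its return labels are made up for illustration.

```python
# Well-known file signatures ("magic bytes").
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .doc/.xls/.ppt container
ZIP_MAGIC = b"PK\x03\x04"                        # .docx (and any other ZIP file)


def sniff_upload(data: bytes) -> str:
    """Guess what kind of upload this is from its leading bytes.

    Hypothetical helper: returns "ole2", "zip", "text" or "unknown-binary".
    """
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    try:
        # Plain text should decode cleanly; binary formats usually will not.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown-binary"
    return "text"
```

If this reports "ole2" or "zip", changing charsets on the page will not help; the text has to be extracted from the container first.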
I have read many things regarding converting Word to html and pretty much all of it involves going into Word and saving as html or importing a Word doc into Dreamweaver. These are not options due to the nature of the website. The file/character manipulation MUST be done on the fly.
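For what it is worth, if the uploads are the newer .docx format, the text can be pulled out on the fly with nothing but the Python standard library, because a .docx is a ZIP archive whose main body lives in `word/document.xml`. The sketch below assumes exactly that and nothing more: the name `docx_to_text` is hypothetical, it ignores headers, footers, footnotes and all formatting, and it does not handle the old binary .doc format at all.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the <w:p> and <w:t> elements.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data: bytes) -> str:
    """Return the plain text of a .docx body, one line per paragraph."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each paragraph is split into runs; the visible text sits in <w:t>.
        pieces = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

The returned string could then be HTML-escaped and stored in a separate text column next to the original BLOB, so the page shows the text while the untouched file stays available for download.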
Any ideas would be gratefully received!
I have tried sending the header Content-Type: application/msword, but it just serves the link as a Word download rather than showing the text.
I have played about with charsets, but to be honest I'm not too sure what I'm doing. I am sure someone out there must have tried to do this themselves... I hope!
However, Word being a Microsoft product means that there are lots of extra characters added to the content, and they are what I need to get rid of!
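A very crude way to get rid of those characters in an old binary .doc is to keep only the runs of printable characters and throw the rest away, much like the Unix `strings` tool. The sketch below does just that in Python; `printable_runs` and its `min_len` parameter are made-up names. It is not a real Word parser: it loses all formatting, may pick up stray metadata such as font names, and, as far as I know, will miss any text that Word stored as 16-bit characters.

```python
import re
from typing import List


def printable_runs(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of at least min_len printable ASCII bytes found in data."""
    # \x20-\x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Raising `min_len` cuts down on noise from short accidental matches, at the cost of dropping very short lines of real text.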