Converting Word Document to Web

         

kevinj

2:29 pm on Oct 18, 2003 (gmt 0)

10+ Year Member



I have a client who has an 800-page Microsoft Word document and wants to convert it to an online version. They want the final product to be fully text-searchable and to include a table of contents/index with text links to the pages. The output should be readable in a web browser. I found a product by Virtual Media Technology called ReWorx 2003 that looks like a possibility. Has anyone had any experience with this product, or is there another product you can recommend? We want to avoid Acrobat.

Thanks.
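
To make the two requirements above concrete (a table of contents with text links, plus text search), here is a minimal Python sketch. It assumes the document has already been split into one HTML file per chapter; the file names, titles, and the functions `build_toc` and `build_index` are all made up for illustration and have nothing to do with ReWorx or any other product.

```python
import re
from collections import defaultdict
from html import escape


def build_toc(pages):
    """Return an HTML list with one text link per page.

    `pages` is a list of (filename, title) pairs (hypothetical inputs).
    """
    items = [
        f'<li><a href="{escape(filename, quote=True)}">{escape(title)}</a></li>'
        for filename, title in pages
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def build_index(texts):
    """Map each lowercased word to the set of filenames that contain it."""
    index = defaultdict(set)
    for filename, text in texts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(filename)
    return index


# Hypothetical chapter files and their plain-text content.
pages = [("ch01.html", "Introduction"), ("ch02.html", "Setup & Install")]
texts = {
    "ch01.html": "Welcome to the manual",
    "ch02.html": "Install the manual tools",
}

toc_html = build_toc(pages)
index = build_index(texts)
```

Looking up `index["install"]` then gives the pages that mention the word. A real 800-page job would probably want stemming, phrase search, and so on, so treat this only as the basic idea.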

limbo

9:33 pm on Oct 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ouch!

Are you ready for this....

You might find the only way to get what you want is to cut and paste the text via a text editor (or save as plain text)!

You will then need to mark it up yourself, but with over 800 pages I do not envy your task. In fact, why try to present 800 pages of text over the web? That is a serious book! Is it really necessary?
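
If the plain-text export keeps a blank line between paragraphs (an assumption; check what Word actually writes out), part of the markup step can be scripted. A rough Python sketch, where `text_to_html` is a made-up helper name:

```python
import re
from html import escape


def text_to_html(text):
    """Wrap each blank-line-separated block of plain text in <p> tags."""
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Join hard-wrapped lines inside one paragraph with single spaces.
        body = " ".join(line.strip() for line in block.splitlines() if line.strip())
        if body:
            # Escape <, > and & so the text cannot break the markup.
            paragraphs.append(f"<p>{escape(body)}</p>")
    return "\n".join(paragraphs)


sample = (
    "First paragraph,\nwrapped over two lines.\n\n"
    "Second paragraph with <angle> brackets."
)
html_out = text_to_html(sample)
```

Headings, lists, and tables would still need marking up by hand, so this only takes care of the paragraphs.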

No one (IMO) wants to read that much info on-screen unless it is really something quite special (a novel?).

Perhaps a PDF would be a better option, split into 5-10 reasonable downloads? (I know you want to avoid it.)

You could save the document as Word HTML (ha!), but this will result in source code so messy you may as well sack that off! MS is very, very bad at this kind of conversion!
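
For a sense of what "messy" means here, the Python sketch below strips a few kinds of residue that Word-saved HTML typically carries: conditional comments, Office namespace tags such as `<o:p>`, `Mso...` class names, and inline `style` attributes. The patterns are a crude, non-exhaustive guess, and `strip_word_markup` is a made-up name, not part of any tool mentioned in this thread.

```python
import re


def strip_word_markup(html):
    """Remove some common Word-specific residue from saved HTML (not exhaustive)."""
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.S | re.I)
    # Office namespace tags such as <o:p> and </o:p>
    html = re.sub(r"</?[ovw]:[^>]*>", "", html, flags=re.I)
    # class=MsoNormal style attributes, quoted or unquoted
    html = re.sub(r'\s+class="?Mso\w*"?', "", html, flags=re.I)
    # Inline style attributes in double or single quotes
    html = re.sub(r'\s+style="[^"]*"', "", html, flags=re.I)
    html = re.sub(r"\s+style='[^']*'", "", html, flags=re.I)
    return html


sample = (
    "<!--[if gte mso 9]><xml><o:shapedefaults/></xml><![endif]-->"
    "<p class=MsoNormal style='margin:0in'>"
    '<span style="font-size:10pt">Hello<o:p></o:p></span></p>'
)
cleaned = strip_word_markup(sample)
```

Regular expressions are a blunt tool for HTML, so anything like this needs checking against the real file before trusting it on 800 pages.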

800 pages of Word: it's my worst nightmare!

The_Hat

10:27 pm on Oct 18, 2003 (gmt 0)

10+ Year Member



I have heard that Dreamweaver MX has the ability to import Word HTML and clean it up at the same time.

Marcia

10:37 pm on Oct 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's very slow and painful. MX might, but DW2 doesn't, and there hasn't been any other reason to upgrade so far. It's a long, drawn-out operation that involves copying the text only, twice. Then it comes out with no paragraphs or breaks, all lumped together in one long paragraph, so it has to be marked up and formatted all over again from scratch.

It's taking about two weeks of wrist-busting work to do some pages before a site can even be started, and that's a mere fraction of the number of pages.

>>any experience with this product or another product they can recommend?

No. There are some good programs that will strip HTML tags, but what Word puts in isn't normal. If special software is needed, it should be client-provided or figured into the bid; otherwise it's a losing proposition.

I won't accept Word docs any more; it'll have to be plain text.
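
As an illustration of tag stripping that does not lose the paragraph breaks, here is a small Python sketch built on the standard library's `html.parser`. The class `TextExtractor` and the function `html_to_text` are made-up names. It skips `<style>` and `<script>` content and assumes that every closing `</p>` marks a paragraph boundary.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, keeping a blank line after each paragraph."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


sample = (
    "<style>p.MsoNormal{margin:0}</style>"
    "<p class=MsoNormal>Tom &amp; Jerry</p>"
    "<p>Second <b>para</b></p>"
)
plain = html_to_text(sample)
```

The result is plain text with one blank line between paragraphs, which is at least easier to mark up again than a single lump.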

limbo

10:58 pm on Oct 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>I have heard that Dreamweaver MX has the ability to import Word HTML and clean it up at the same time

<chuckle> there is only so much you can do! - even DW gets tired of the lame markup that Word exports! ;)

Mohamed_E

3:01 am on Oct 19, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From the HTML-Tidy docs:

Support for Word2000

Tidy can now perform wonders on HTML saved from Microsoft Word 2000! Word bulks out HTML files with stuff for round-tripping presentation between HTML and Word. If you are more concerned about using HTML on the Web, check out Tidy's "Word-2000" config option! Of course Tidy does a good job on Word'97 files as well!

Since I have never had to convert a Word document to HTML I have not checked this feature out.
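
For anyone who wants to try it, the Python sketch below only builds the command line. It assumes the `tidy` executable is installed and on the PATH, and that the config option quoted above can be passed on the command line as `--word-2000 yes`. The file names and the helper `tidy_command` are made up for illustration.

```python
def tidy_command(source, dest):
    """Build an argument list for the `tidy` command-line tool."""
    return [
        "tidy",
        "--word-2000", "yes",  # the Word 2000 option mentioned in the Tidy docs
        "-o", dest,            # write the cleaned markup to this file
        source,
    ]


cmd = tidy_command("manual.htm", "manual-clean.html")

# To actually run it (requires tidy to be installed):
#   import subprocess
#   subprocess.run(cmd, check=False)  # tidy exits non-zero on mere warnings
```

Like the poster above, treat this as untested against a real Word export; compare the output with the original before relying on it.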