converting word doc into html

jerrynyc

3:31 pm on Jan 7, 2004 (gmt 0)

10+ Year Member



I need to post some articles on a web site and am wondering about the best way to do this. Is there an easy way to convert a Word doc directly into HTML? There are thirty or so articles, each about 2-3 pages long. I have been warned that Word's "convert into html" feature is pretty bad. I plan to use CSS to do the main formatting of these articles.

Thanks

Jerry

korkus2000

3:49 pm on Jan 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can either do the conversion and clean up the code, or copy and paste into an editor. It really depends on how complicated the Word formatting is. I have heard of programs that will clean up a Word HTML doc, but can't remember offhand what they were called.

jimbeetle

3:54 pm on Jan 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hey jerrynyc,

Kind of chilly in nyc today.

Not sure which version of Word it started in, but at least as of XP there is a save as "filtered web page" option which strips out most of the Word-specific code. I haven't used it yet, but if you do, I would still run the result through a clean-up utility such as HTML Tidy.
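For anyone curious what such a clean-up pass involves, here is a rough Python sketch. It is not how HTML Tidy works, and the function name `strip_word_markup` is made up for illustration. It assumes the Word export is littered with comments, Office namespace tags like <o:p>, <span> wrappers, and class/style/lang attributes, and simply strips those with regular expressions. Regexes are a blunt tool for HTML, so treat this as a starting point and check the output by eye.

```python
import re

def strip_word_markup(html_text: str) -> str:
    """Remove some common Word-specific clutter from exported HTML (not exhaustive)."""
    # Drop HTML comments, including Word's conditional comments.
    html_text = re.sub(r"<!--.*?-->", "", html_text, flags=re.DOTALL)
    # Drop Office namespace tags such as <o:p> and </o:p>.
    html_text = re.sub(r"</?\w+:\w+[^>]*>", "", html_text)
    # Drop <span> wrappers but keep the text inside them.
    html_text = re.sub(r"</?span\b[^>]*>", "", html_text, flags=re.IGNORECASE)
    # Drop class, style and lang attributes, quoted or unquoted.
    html_text = re.sub(
        r"""\s+(?:class|style|lang)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
        "",
        html_text,
        flags=re.IGNORECASE,
    )
    return html_text
```

Once the clutter is gone, the remaining tags can be styled from your own stylesheet.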

If you're going to spec the pages with CSS anyway, I'd probably just save as text and do it by hand: much more control, and you know exactly what's what.

There are some plug-ins and filters that handle earlier Word versions, but given the modest amount of converting you have to do, they might not be worth the hassle.

Jim

jerrynyc

4:25 pm on Jan 7, 2004 (gmt 0)

10+ Year Member



Cold here in NYC? Hell no! My ancestors come from the plains of mother Russia, where this is considered a warm spring day.
I don't think that these articles were made with later versions of Word, as many are rather old. As I use a Mac, is there an inexpensive tidy program for the Mac?
Jerry

txbakers

5:12 pm on Jan 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



IMO, I would copy the Word doc into a bare text editor like BBEdit for the Mac and add the CSS or HTML tags manually.

It will save you and your visitors so much time later.

Or, if it needs to be kept in a Word format, just upload it to the server as Word and make a link to it rather than try to display it in the browser.

tedster

5:42 pm on Jan 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I do a lot of this - and there is no easy way I've found that results in clean HTML.

However, before I copy/paste the article into my HTML editor, I use Word's "search and replace" to find all the paragraph breaks and replace them with <p>, and do the same to replace line breaks with <br>.

In Word you can search on ^p for paragraph breaks and ^l for line breaks. It's a help.
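If the article has already been saved as plain text, the same idea can be sketched outside Word. The Python below is only an illustration, and the function name `text_to_html` is made up. It assumes paragraphs are separated by a blank line and that a single newline inside a paragraph is a manual line break; text saved from Word may not follow that layout, so adjust the splitting to match your files.

```python
import html
import re

def text_to_html(text: str) -> str:
    """Wrap blank-line-separated paragraphs in <p> and turn single newlines into <br>."""
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split on blank lines (assumption: that is how paragraphs are separated).
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    out = []
    for p in paragraphs:
        # Escape &, < and > so the article text cannot break the markup.
        escaped = html.escape(p, quote=False)
        out.append("<p>" + escaped.replace("\n", "<br>\n") + "</p>")
    return "\n".join(out)
```

Unlike a bare replace with <p>, this closes every paragraph, which makes the result easier to style with CSS.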

choster

5:53 pm on Jan 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How's this grab you: I recorded an extremely simple Word macro that converts carriage returns to </p><p>, boldface to <strong>, and so on. This takes care of 90% of the markup needed to properly format about 75% of the content I receive, and then I can cut and paste the output into Homesite with a head start. It's faster than cutting and pasting from the original (articles are typically about 1500 words long) and much faster than exporting from Word 2002 and cleaning up the output code.

I wish I knew macros well enough to automate other common tasks, like making e-mail addresses linkable and especially outputting linked footnotes. I may have an article from a scholarly journal to mark up that has thirty or forty notes, which are quite a pain to track down and crosslink when cut and paste turns them into ordinary numbers. Yet even here, my productivity is probably similar, because longer articles mean more bizarre Office interpretations of bullets, indents, and capitalization, spread across dozens of classes to strip out and make sense of. DW's "clean Word HTML" can only take you so far.
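The e-mail part, at least, can be approximated outside Word. Here is a rough Python sketch, not a Word macro; the names `EMAIL_RE` and `link_emails` are made up for illustration, and the pattern is a deliberately simple one that will miss unusual addresses. It assumes it is run on text that does not already contain mailto links, otherwise addresses would be wrapped twice.

```python
import re

# Simplified pattern: local part, "@", then a dotted domain name.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def link_emails(text: str) -> str:
    """Wrap each e-mail address found in the text in a mailto link."""
    return EMAIL_RE.sub(
        lambda m: f'<a href="mailto:{m.group(0)}">{m.group(0)}</a>',
        text,
    )
```

Footnotes are harder, because once they are pasted as ordinary numbers there is nothing reliable left to match on.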

jerrynyc

6:31 pm on Jan 7, 2004 (gmt 0)

10+ Year Member



There are really a lot of useful ideas here for me. I like the idea of the search and replace, as well as recording a macro to do this. Many thanks to everyone who replied.

Jerry