Clean up Word, using Html Tidy


Debb

6:20 pm on Sep 21, 2005 (gmt 0)

10+ Year Member



Hi -

I am using Word 2002 to convert a Word doc given to me to HTML. I want to take out the crud Word leaves behind. I have tried to use HTML Tidy with the --clean y option, which is supposed to do this.

I am getting errors, and the error text says these errors have to be cleaned up before running HTML Tidy.

The HTML Tidy site at sourceforge.net has no user help or support forums, which is why I came here. I need to find someone here who knows HTML Tidy error output or is knowledgeable about Word HTML output, or who can point me to a user forum somewhere else where people know about these topics.

Or another good product, preferably free, that can de-crud Word output.

Thanks, any help would be greatly appreciated.

debb

bill

2:38 am on Sep 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I use Tidy within a number of products. It works inside editors like TopStyle and FrontPage to help clean things up. If you don't have either of those, take a look at the HTML Validator plug-in for Firefox. It runs Tidy in a GUI, which should be easier than the command line of the original program.

Debb

11:40 am on Sep 22, 2005 (gmt 0)

10+ Year Member



Hi -

I think I may not have been clear in my question.

I successfully ran Tidy against the Word output, both in TopStyle and from the command line. I need help deciphering the error messages and understanding why the --clean option (to clean out Word clutter) didn't work. I think the clean option didn't take effect because Tidy wants me to clean up other errors first. I cannot find a Tidy forum at SourceForge, and I am running out of options for decluttering Word crud.

Thanks

MamaDawg

4:25 pm on Sep 22, 2005 (gmt 0)

10+ Year Member



Can you post some of these error messages you're getting?

willybfriendly

4:38 pm on Sep 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have never had success cleaning up Word documents (or Excel either, for that matter). In my experience it actually saves time to just start from scratch.

And yes, I have tried Tidy for this job.

WBF

moltar

4:52 pm on Sep 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There is some sort of "force" flag that can force Tidy to do the work. Otherwise it's being lazy.
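
The flag meant here is probably Tidy's force-output option (that is an assumption; the post does not name it), which tells Tidy to write its output even when it reports errors. A minimal sketch in Python that only builds and prints the command line; the file names are hypothetical:

```python
import shlex


def force_output_command(src, dst):
    """Build a Tidy command line that writes output even if errors are reported."""
    # --force-output yes: emit the result despite errors
    # -o dst: write the result to dst instead of standard output
    return ["tidy", "--force-output", "yes", "-o", dst, src]


# Hypothetical file names, shown as one shell-ready string.
print(shlex.join(force_output_command("word.html", "out.html")))
```

Forced output can still be broken markup, so the error report is worth reading before trusting the result.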

photon

8:29 pm on Sep 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Did you try the word-2000 option?

From the Tidy page [tidy.sourceforge.net] at Sourceforge:

word-2000
Type: Boolean
Default: no
Example: y/n, yes/no, t/f, true/false, 1/0
This option specifies if Tidy should go to great pains to strip out all the surplus stuff Microsoft Word 2000 inserts when you save Word documents as "Web pages". Doesn't handle embedded images or VML.
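
In case it helps, here is a rough sketch of combining word-2000 with the clean option from the first post in one run, wrapped in Python so it can be repeated over several files. The file names and the two helper functions are made up for illustration, and the sketch assumes a `tidy` executable is on the PATH; it is not an official recipe.

```python
import shutil
import subprocess


def build_tidy_command(src, dst, error_log):
    """Return the argument list for one Tidy run over a Word-exported HTML file."""
    return [
        "tidy",
        "--word-2000", "yes",  # strip the surplus markup Word adds
        "--clean", "yes",      # replace presentational tags with style rules
        "-f", error_log,       # write warnings and errors to this file
        "-o", dst,             # write the cleaned markup to this file
        src,
    ]


def run_tidy(src, dst, error_log):
    """Run Tidy if it is installed; return its exit status, or None if it is missing."""
    if shutil.which("tidy") is None:
        return None
    # Tidy exits with 0 (no problems), 1 (warnings) or 2 (errors),
    # so a non-zero status is not raised as an exception here.
    return subprocess.run(build_tidy_command(src, dst, error_log)).returncode
```

Calling `run_tidy("word.html", "clean.html", "errors.txt")` leaves the messages in errors.txt; an exit status of 2 means Tidy reported errors and may not have written any output.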