Clean html code

many files at once

sionvalais

6:55 pm on Sep 28, 2003 (gmt 0)

10+ Year Member



Over the years, the 350 pages of my website have been updated by hand, and the source code is now almost unreadable.
Is there a tool that can strip the many blank lines from many pages at once and restructure the layout of the code?

cheers,

Mark
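
For the blank-line part of the question, a small script can be enough. The following is only a minimal sketch in Python, not a full HTML formatter: the function names `collapse_blank_lines` and `clean_tree` are made up for illustration, the `*.html` pattern and the `latin-1` encoding are assumptions about the site, and it does not re-indent anything. It also removes blank lines inside `<pre>` blocks, which would change how those render, so it is best tried on a copy of the site first.

```python
from pathlib import Path


def collapse_blank_lines(text: str) -> str:
    """Remove whitespace-only lines and trailing whitespace from each line."""
    # Split on "\n" only: str.splitlines() also breaks on characters such as
    # "\x85", which can occur inside old Windows-encoded pages.
    stripped = [line.rstrip() for line in text.split("\n")]
    kept = [line for line in stripped if line]
    return "\n".join(kept) + ("\n" if kept else "")


def clean_tree(root: str, pattern: str = "*.html") -> int:
    """Clean every matching file below root in place; return how many changed."""
    changed = 0
    for path in Path(root).rglob(pattern):
        # latin-1 maps every byte to a character, so the content round-trips
        # unchanged even if the real encoding is something else (assumption).
        original = path.read_text(encoding="latin-1")
        cleaned = collapse_blank_lines(original)
        if cleaned != original:
            path.write_text(cleaned, encoding="latin-1")
            changed += 1
    return changed
```

Calling `clean_tree("site_copy")` on a copy of the site (the folder name is hypothetical) would rewrite only the files that actually contain blank lines or trailing whitespace.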

grandpa

7:53 pm on Sep 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hiya Mark,

I hope there isn't such a tool since I recently finished cleaning up the pages on my site. Fortunately, most of my pages have a standard layout, so after the first one was done I had an easier job cleaning up the rest.

I am curious: is FrontPage the culprit for inserting all those blank lines in my code? My contact page must have had 30 blank lines between each valid line of HTML. Even the JS was "spread out".

Jeff

skipfactor

8:05 pm on Sep 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Dreamweaver has a command, 'Clean up HTML' with many options that might help. Dreamweaver's advanced 'Find & Replace' is also quite handy as you can find & replace code by the page, folder, or sitewide.

sionvalais

8:09 pm on Sep 28, 2003 (gmt 0)

10+ Year Member



Hi Jeff,

I ditched FrontPage a long time ago. It has some annoying habits. It inserts lines like this:
<meta name="GENERATOR" content="Microsoft FrontPage 6.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
It even changes some non-HTML code when you save the page.
I prefer editors like EmEditor (custom editing) and UltraEdit (mass editing).
Have you ever seen the source of a Word document that was converted to HTML?
That probably answers your question.

cheers,

Mark
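
Lines like the two FrontPage meta tags quoted above can also be stripped in bulk with a regular expression. This is a rough sketch in Python, assuming the tags always sit on one line and use double quotes as in the example; the names `FRONTPAGE_META` and `strip_frontpage_meta` are invented for illustration.

```python
import re

# Matches a GENERATOR or ProgId meta tag whose content mentions FrontPage,
# together with surrounding spaces and the line break that follows it.
FRONTPAGE_META = re.compile(
    r'[ \t]*<meta\s+name\s*=\s*"(?:GENERATOR|ProgId)"'
    r'\s+content\s*=\s*"[^"]*FrontPage[^"]*"\s*/?>[ \t]*\r?\n?',
    re.IGNORECASE,
)


def strip_frontpage_meta(html: str) -> str:
    """Return html with FrontPage's generator/ProgId meta tags removed."""
    return FRONTPAGE_META.sub("", html)
```

Other meta tags, including generator tags written by other editors, are left alone because the pattern requires "FrontPage" in the content attribute.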

sionvalais

8:11 pm on Sep 28, 2003 (gmt 0)

10+ Year Member



Hi skipfactor,

Is that also the case in the new version of Dreamweaver (2004)?
And can it search and replace based on conditional statements?

bye

mark

willybfriendly

8:28 pm on Sep 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Have you tried HTML Tidy?

WBF

BlueSky

8:38 pm on Sep 28, 2003 (gmt 0)

10+ Year Member



HTML Tidy is pretty good. Even that tool, though, can only do so much with a webpage butchered by Word or FrontPage.

willybfriendly

8:47 pm on Sep 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don't know how well it works, but Tidy now has a word-2000 option that can be set in the config file.

WBF
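
A batch run of Tidy with that option might be set up roughly as follows. This is a sketch under assumptions: it presumes a `tidy` executable on the PATH, and the helper name `build_tidy_command` and the extra `indent` and `wrap` settings are illustrative choices, not something recommended in this thread. The `-m` flag makes Tidy modify the files in place, so run it on a copy.

```python
from pathlib import Path

# Tidy config file contents: one "option: value" pair per line.
TIDY_CONFIG = """\
word-2000: yes
indent: auto
wrap: 0
"""


def build_tidy_command(config_path, html_files):
    """Write the config file and return the argv list for an in-place Tidy run."""
    Path(config_path).write_text(TIDY_CONFIG, encoding="ascii")
    # -config <file> loads the options, -m rewrites the input files in place.
    return ["tidy", "-config", str(config_path), "-m", *map(str, html_files)]
```

The returned list can be handed to `subprocess.run(cmd)`. Tidy exits with status 1 when it only has warnings and 2 on errors, so treating every non-zero status as a failure would be too strict.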

andy_boyd

9:15 pm on Sep 29, 2003 (gmt 0)

10+ Year Member



I can testify to the disaster that is FrontPage code. I inherited a 450+ page site built completely with FrontPage. Things were so bad that I am now redesigning the entire site. I did an initial cleanup by hand and removed several kilobytes of unnecessary code per page.

Nightmare - be warned, stay away from FrontPage.

WebJoe

9:49 pm on Sep 29, 2003 (gmt 0)

10+ Year Member



@sionvalais:
It inserts lines like this:
<meta name="GENERATOR" content="Microsoft FrontPage 6.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
It even changes some non-html code when you save the page.

=> You can change that in the options; none of my pages have any of the lines mentioned. The only changes I've seen FP make to my code were to the wizard-inserted ASP stuff that I had changed manually (where I made mistakes that FP corrected)...
But I agree, word2html is a mess!

@andy_boyd: I manage some 550+ pages, all in FP, with no messy html...as mentioned, it's a matter of configuration!

For more, see FrontPage vs. DreamWeaver [webmasterworld.com]

pendanticist

10:20 pm on Sep 29, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In response to what BlueSky mentioned, I'd like to throw this out there.

http*//office.microsoft.com/officeupdate/default.aspx

See the W/2000 [office.microsoft.com] "add-ons", specifically the sixth one down, for cleaning up HTML in Office. I swear MS used to have an add-on for FP too. Can I find it now? Nope.

"Office 2000 HTML Filter 2.0

The Office HTML Filter is a tool you can use to remove Office-specific markup tags embedded in Office 2000 documents saved as HTML."

Would those help anyone?

I'd never use FP either...just for the reasons mentioned by others.

Pendanticist.

WebJoe

9:45 pm on Sep 30, 2003 (gmt 0)

10+ Year Member



pendanticist, thanks for the link. I sure can use it for the copy from my "content providers", which usually comes in M$ Word, so I have to choose between exporting it to text and manually reformatting it, or cleaning out all the M$ garbage.

g1smd

8:08 pm on Oct 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There is a tool for tidying up Microsoft-generated HTML somewhere in the Webmaster Toolkit; I stumbled over it a few months ago. A Google search should lead to it.