Forum Moderators: incrediBILL & lawman

Word html documents

oh my god...

     

edit_g

4:36 pm on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just for a laugh I decided to see what would happen if you tried to create an HTML document using MS Word.

What follows is the unedited source of a blank HTML document:

<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 9">
<meta name=Originator content="Microsoft Word 9">
<link rel=File-List href="./Document2_files/filelist.xml">
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Author>My name was here</o:Author>
<o:Template>Normal</o:Template>
<o:Revision>1</o:Revision>
<o:TotalTime>0</o:TotalTime>
<o:Created>2003-02-21T16:34:00Z</o:Created>
<o:Pages>1</o:Pages>
<o:Company></o:Company>
<o:Lines>1</o:Lines>
<o:Paragraphs>1</o:Paragraphs>
<o:Version>9.2720</o:Version>
</o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
</w:WordDocument>
</xml><![endif]-->
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-parent:"";
margin:0in;
margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:"Times New Roman";
mso-fareast-font-family:"Times New Roman";
mso-ansi-language:EN-US;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;
mso-header-margin:.5in;
mso-footer-margin:.5in;
mso-paper-source:0;}
div.Section1
{page:Section1;}
-->
</style>
</head>

<body lang=EN-GB style='tab-interval:.5in'>

<div class=Section1>

<p class=MsoNormal><span lang=EN-US><![if!supportEmptyParas]>&nbsp;<![endif]><o:p></o:p></span></p>

</div>

</body>

</html>

Would it even work? Or am I being silly, and this is all valid stuff?

korkus2000

4:39 pm on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member korkus2000 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



It works in IE at least. Scary, isn't it? Word loves XML data islands. Want some more scary stuff? Export PowerPoint to HTML. Yikes!

sem4u

4:56 pm on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member sem4u is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yes, that looks pretty bad. Try putting it in Dreamweaver and using the 'clear up Word HTML' command!

Chris_R

4:58 pm on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I know someone who almost went to a mental institution trying to clean up Word HTML for a client's site that the client had built themselves.

It is amazing how much extra crap Word puts in there.

edit_g

5:02 pm on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Dreamweaver and use the 'clear up Word HTML' command!"

This was an experiment. I usually use Editplus2.

Just so any of you don't completely misjudge me here... ;)

weisinator

5:04 pm on Feb 21, 2003 (gmt 0)

10+ Year Member



I know of a non-profit organisation with a web presence fueled by MS Word.

Instead of creating a new page for a lengthy article, they just stack it on top of others, blog-style. After 4 years, the index.html page now prints to about 20 pages and is about 600k.

vibgyor79

6:29 pm on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Please don't make fun of MS Word's HTML capabilities. I must have created and uploaded at least two websites using MS Word. Brings back lots of wonderful memories.

Before that, I was using Netscape Composer. :)

Anyway, now I have upgraded to.. ahem.. MS FrontPage 2002. Works like a charm. ;)

rcjordan

6:34 pm on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member rcjordan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I love it when my ad clients use these "create html" quasi-utilities. It means they'll be buying traffic from me for a looooooong time.

Trisha

7:26 pm on Feb 21, 2003 (gmt 0)

10+ Year Member



Just yesterday I had to clean up a file for a new client that had been made in Word - it was horrible! Just like the code you posted, only longer! Thankfully it was only one page! It doesn't surprise me that someone could end up in a mental institution from this.

What I see as a challenge, though, is how you get through to a client that what they have is garbage. It all looks the same to them when they look at it in IE. So how do you explain that you will either have to start their site all over or edit the garbage that Word wrote, and that either way it will cost them some money? They just don't seem to get it. It's like: 'The site is basically there already, so it should be real easy for you to just change this or that. The friend who made the site the first time could easily add whatever, just by opening it in Word...'

(I need a better way to make money!)

WindSun

4:31 am on Feb 22, 2003 (gmt 0)

10+ Year Member



"how do you get through to a client that what they have is garbage?..."

LOL! That is SO true.

A while back a small non-profit org asked me to "help" with their website.
It had been done in a combination of FP98, FP2000, FTP, Word, and a REALLY old copy of Fusion, and probably some other stuff and text editors that I did not look for. All mixed together. All done by a succession of people who obviously had no concept of what they were doing.

When I told them what needed to be done (such as removing or scaling down the 560kb graphics background) I started getting the "well, we want to keep that, and this, and don't change that.."
I finally told them it was hopeless given the limitations they gave me (and the obvious internal bickering that was going on).

All this for a site that got maybe 500 hits a month ;P

skibum

4:53 am on Feb 22, 2003 (gmt 0)

WebmasterWorld Administrator skibum is a WebmasterWorld Top Contributor of All Time 10+ Year Member



oohhhh, so that's what that stuff is.......

A buddy of mine put a little site online and it was full of code like that, 'cept the Word identifiers were removed. He was a PhD Engineering grad, so I thought it was some obscure coding/formatting thing used in Engineering......and it was Word all along........

electro

6:12 pm on Feb 22, 2003 (gmt 0)

10+ Year Member



Just as a matter of comparison:

OpenOffice.org blank page

==============================
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1252">
<TITLE></TITLE>
<META NAME="GENERATOR" CONTENT="OpenOffice.org 1.0.1 (Win32)">
<META NAME="CREATED" CONTENT="20030222;18070329">
<META NAME="CHANGED" CONTENT="16010101;0">
</HEAD>
<BODY LANG="en-US">
<P><BR><BR>
</P>
</BODY>
</HTML>
=================================

Mozilla Composer -

=================================
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type"
content="text/html; charset=ISO-8859-1">
<title>Composer</title>
</head>
<body>
<br>
</body>
</html>
=================================

Dreamweaver MX 'Basic webpage' -

=================================
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>

<body>

</body>
</html>
=================================

They all appear to do quite a good job of it. Personally, I can't stand using M$ Office at all anymore, long live Open Office!

brotherhood of LAN

6:15 pm on Feb 22, 2003 (gmt 0)

WebmasterWorld Administrator brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



hmmm, i think this is why frontpage creates a little "extra" code here and there (not as much as word, for sure!).

All MS Office programs seem to be interoperable, i.e. save htm as xls, xls as csv etc etc.

I guess any form of "HTML tidy" would have a hard time ironing out all the code.

hakre

6:23 pm on Feb 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



just don't use word any longer for this. i'm just using staroffice (not open office, but much the same), and the html is also packed with stylesheet stuff etc. i don't like it either. maybe dreamweaver really is a solution for converting longer documents out of a word processor.

electro

6:31 pm on Feb 22, 2003 (gmt 0)

10+ Year Member



The solution I use - copy all the text out of the page to a .txt file and start again.

It's so much easier.
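
For anyone who wants to script the "copy the text out and start again" approach, here is a minimal sketch in Python using only the standard library's html.parser. The class and function names (TextOnly, word_html_to_text) are made up for illustration, and it makes no attempt to preserve formatting: it just drops the <style>/<script> bodies and comments and keeps one line of text per block element.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text; ignore <style>/<script> bodies and comments."""

    SKIP = {"style", "script"}
    BLOCK = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skipping = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skipping:
            self.skipping -= 1
        elif tag in self.BLOCK:
            # End of a block element: start a new line of output.
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skipping:
            self.parts.append(data)


def word_html_to_text(html):
    """Return the visible text of an HTML string, one line per block."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

Feed it the contents of the saved .htm file and write the result to a .txt file; the markup still has to be rebuilt by hand afterwards, which is the point of starting again.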

vodkabird

1:28 pm on Feb 26, 2003 (gmt 0)

10+ Year Member



Alternatively you could try Dean Allen's Word HTML cleaner which (when it works) is excellent!

http://www.textism.com/resources/cleanwordhtml/

[edited by: lawman at 12:45 am (utc) on Feb. 27, 2003]
[edit reason] delinked [/edit]
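
To give an idea of what a cleaner of this kind has to strip, here is a rough regex-based sketch in Python. It is not Dean Allen's tool or Dreamweaver's command, just an illustration based on the Word 9 output posted at the top of the thread; the function name clean_word_html is hypothetical. It leaves the mso-* declarations inside the <style> block alone, and regexes are a blunt instrument for HTML in general, so treat it as a starting point only.

```python
import re


def clean_word_html(html):
    """Strip some of the Word-specific markup seen in the sample above."""
    # Conditional comment blocks: <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html,
                  flags=re.S | re.I)
    # Stand-alone markers: <![if !supportEmptyParas]> and <![endif]>
    html = re.sub(r"<!\[(?:if[^\]]*|endif)\]>", "", html, flags=re.I)
    # Office-namespace tags such as <o:p> and </o:p>
    html = re.sub(r"</?[ow]:[^>]*>", "", html, flags=re.I)
    # xmlns declarations on the <html> element
    html = re.sub(r"\s+xmlns(?::\w+)?=\"[^\"]*\"", "", html)
    # class=MsoNormal and friends, quoted or unquoted
    html = re.sub(r"\s+class=[\"']?Mso\w+[\"']?", "", html, flags=re.I)
    return html
```

Run on the single paragraph from the blank document, this leaves only the <p>, the <span lang=EN-US> and the &nbsp;.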

creative craig

1:40 pm on Feb 26, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Before I started working for my current employer, the company intranet was a 600-page Word hell :(

Layout was good but the code sucked.. it's all good now though :)

Craig

snowman

3:07 pm on Mar 1, 2003 (gmt 0)

10+ Year Member



A non-Windows reply?

I copied and pasted the first-mentioned HTML into a blank *"Simpletext" document and saved it as an HTML file on my Mac. I've done this before, including when I was creating my own webpage, and it worked quite well.

Then, once saved as an HTML file, I tried opening it with either Netscape 4.79 or iCab 2.82.

All I get is a blank white screen in both browsers. Sorry, the code doesn't work for those of us on iconoclastic, non-Windows platforms.

(*"Simpletext" is a modestly powered word processing program that comes with all Macs. It's pretty versatile, and by default it only uses 512 k of RAM (you can tweak it to whatever you want, of course). It handles different text styles, different fonts, and search and replace, and (through linking with Quicktime) it can play movies and sounds and even speak typed text. But as far as handling plain text is concerned, it is akin to generating plain .TXT files on a Windows system with Notepad or some other program. The only thing it won't do is read Word .DOCs. For that I use another program called "Fileview" to rip the text out of ANY file, sans formatting. Simpletext is pretty well the only word processor I think of using nowadays.)

 
