Accepting HTML in submitted content

A real pain ...

         

fischermx

7:50 am on Feb 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know if this goes here.
I have a website where I accept user-generated content. I haven't set up rules about how much or how little HTML is acceptable, but I'm glad to accept an article formatted with <b>, <i>, some h tags, and a couple of URL drops.

But sometimes I get a sort of vomited HTML that I'm pretty sure comes from some MS tool, like FrontPage or even Word.
When I get this, I just take a deep breath and re-edit the whole thing. Due to the nature of my site, I feel some gratitude for the submitted content and don't feel like I should send the article/PR back to the author.

What do you do in that case?

shri

7:37 am on Feb 2, 2007 (gmt 0)



You can use PHP (I think you're on ASP, right?) to automatically strip tags from submitted content.

If you can find a function like this one ( [hk2.php.net...] ) for your platform, you can build it into the submission process so the unwanted tags are removed as soon as the author submits.
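
To make the idea concrete, here is a minimal sketch in Python rather than PHP or ASP, using only the standard library's html.parser. It is not the function linked above. The allowlist (ALLOWED_TAGS), the DROP_CONTENT set, and the names AllowlistCleaner, clean_html and _safe_url are all made up for illustration; adjust the tag list to whatever the site is willing to accept.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: tag name -> attributes kept on that tag.
ALLOWED_TAGS = {
    "b": set(), "i": set(), "p": set(),
    "h2": set(), "h3": set(), "h4": set(),
    "a": {"href"},
}
# Elements whose text content is dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}


def _safe_url(value):
    # Assumption: only plain http(s) links are acceptable in submissions.
    return value.lower().startswith(("http://", "https://"))


class AllowlistCleaner(HTMLParser):
    """Re-emits allowed tags and escaped text; silently drops the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            kept = [
                (name, value) for name, value in attrs
                if name in ALLOWED_TAGS[tag] and value and _safe_url(value)
            ]
            attr_text = "".join(
                f' {name}="{escape(value, quote=True)}"' for name, value in kept
            )
            self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    # Comments (including Word's conditional comments) are dropped because
    # the default handle_comment does nothing.


def clean_html(raw):
    parser = AllowlistCleaner()
    parser.feed(raw)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

Called on a typical Word-style fragment, clean_html('<p class="MsoNormal"><b>Hi</b> <span style="x">there</span><o:p></o:p></p>') keeps the paragraph and bold tags and discards the class, the span and the Office tag. The sketch does not balance unclosed tags, so treat it as a starting point and not a complete sanitizer.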

choster

6:38 pm on Feb 2, 2007 (gmt 0)



This is a growing problem for us now that Outlook uses Word's HTML editor. FCKeditor and TinyMCE, two browser-based, all-JavaScript WYSIWYG editors, generate cleaner code and have some "Word HTML auto-cleanup" functionality, which has helped. In the past, I used Dreamweaver's "Clean up Word HTML" feature as well.
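
For anyone who wants a server-side pass in the same spirit, the following is a rough Python sketch of a Word-markup pre-clean. It is not the code those editors or Dreamweaver actually use, and the pattern list (WORD_PATTERNS) and the function name strip_word_markup are assumptions chosen for illustration. Regular expressions are a blunt tool for HTML, so this is best treated as a first pass before a real tag filter.

```python
import re

# Patterns for markup commonly seen in HTML saved or pasted from Word.
# Illustrative assumption only; this is not an exhaustive catalogue.
WORD_PATTERNS = [
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    re.compile(r"<!--\[if.*?<!\[endif\]-->", re.S | re.I),
    # Office namespace tags such as <o:p>, <v:shape>, <w:WordDocument>
    re.compile(r"</?[ovwm]:[^>]*>", re.I),
    # class=MsoNormal and similar, quoted or unquoted
    re.compile(r'\s+class="?Mso[^\s">]*"?', re.I),
    # Double-quoted inline style attributes
    re.compile(r'\s+style="[^"]*"', re.I),
    # <span> and <font> wrappers (the text inside them is kept)
    re.compile(r"</?(?:span|font)\b[^>]*>", re.I),
]


def strip_word_markup(raw):
    """Apply each pattern in order and return the reduced markup."""
    for pattern in WORD_PATTERNS:
        raw = pattern.sub("", raw)
    return raw
```

For example, strip_word_markup('<p class=MsoNormal style="margin:0in"><span lang=EN-US>Hello<o:p></o:p></span></p>') reduces the fragment to a plain paragraph containing Hello.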