
Forum Moderators: httpwebwitch


Illegal Characters in XML feed

     

iamvela

9:16 am on Jul 15, 2008 (gmt 0)

5+ Year Member



I have an XML feed that is based upon text submitted by users. However, every so often users submit characters that are illegal in XML, causing the entire feed to choke :(

I need some help filtering out these illegal characters (a brute-force replace is OK).

TIA,

eelixduppy

9:22 am on Jul 15, 2008 (gmt 0)

WebmasterWorld Senior Member eelixduppy is a WebmasterWorld Top Contributor of All Time 5+ Year Member



The script that handles the feed is where you need to replace the characters. With PHP, for example, you can use str_replace(). Personally, I would also run the XML through W3C's validator service to check that it is valid before using it, and show an alert of some kind if it is not.
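
For what it's worth, here is a minimal sketch of the brute-force replace, written in Python rather than PHP. It assumes the feed is XML 1.0 and removes anything outside the character ranges that version of the spec allows. The function name strip_illegal_xml_chars and the replacement parameter are made up for illustration, not part of any library.

```python
import re

# XML 1.0 allows only these code points (the "Char" production):
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated character class below matches everything else.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str, replacement: str = "") -> str:
    """Remove (or replace) characters that may not appear in XML 1.0."""
    return _ILLEGAL_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    dirty = "Hello\x00 wor\x0bld\x1f!\tOK"
    # The NUL, vertical tab and unit separator are dropped; the tab is kept.
    print(repr(strip_illegal_xml_chars(dirty)))  # 'Hello world!\tOK'
```

This only deals with characters that XML forbids outright. Markup characters such as < and & are legal but still have to be escaped or wrapped before they go into the feed.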

httpwebwitch

2:36 pm on Jul 15, 2008 (gmt 0)

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member



You can either pasteurize the text to remove/replace those characters, or you can wrap them in a special CDATA placenta.

So, for instance:

<book>
<title>The Big Book of &lt;XML&gt; &amp; &#224;cc&#233;nted char&#224;ct&#232;rs</title>
</book>

OR:

<book>
<title><![CDATA[The Big Book of <XML> & àccénted charàctèrs]]></title>
</book>

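The CDATA approach is a far better solution, provided the text can never contain the sequence ]]>, which would end the CDATA section early.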
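
The two options above could be sketched in Python roughly like this. It relies on escape() from the standard library's xml.sax.saxutils for the first option. The helper names as_escaped_text and as_cdata are hypothetical, and the CDATA helper assumes you want an embedded ]]> split across two sections rather than rejected.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def as_escaped_text(text: str) -> str:
    """Option 1: escape &, < and > so the text is safe as element content."""
    return escape(text)


def as_cdata(text: str) -> str:
    """Option 2: wrap the text in a CDATA section."""
    # A CDATA section ends at the first ']]>', so close the section after
    # ']]' and reopen a new one before '>' whenever that sequence appears.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


if __name__ == "__main__":
    title = "The Big Book of <XML> & àccénted charàctèrs"
    for wrap in (as_escaped_text, as_cdata):
        xml = "<book><title>" + wrap(title) + "</title></book>"
        # Both forms should parse back to the original string.
        assert ET.fromstring(xml).find("title").text == title
        print(xml)
```

Neither helper removes characters that XML forbids outright (most control characters, for example), so that kind of filtering still has to happen before escaping or wrapping.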

httpwebwitch

2:39 pm on Jul 15, 2008 (gmt 0)

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member



FYI regarding CDATA:
[w3schools.com...]

Unless you know that the user-entered data is safe, like it's only EVER going to be an integer or alphanumeric string, then treat it as CDATA and encapsulate it accordingly.

 
