Illegal Characters in XML feed
iamvela msg:3698692 9:16 am on Jul 15, 2008 (gmt 0)
I have an XML feed that is based upon text submitted by users; however, every so often users submit characters that are illegal for XML, causing the entire feed to choke :(
I need some help in filtering out (brute force replace is ok) these illegal characters.
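A brute-force filter of the kind asked for here could look roughly like the Python sketch below. It assumes the feed is XML 1.0 and that the user text is already a decoded Unicode string; the function name strip_illegal_xml_chars is made up for illustration. It removes anything outside the character ranges XML 1.0 allows (tab, line feed, carriage return, and most printable Unicode), but it does not escape < or &, which still has to happen separately.

```python
import re

# Everything NOT matched by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str, replacement: str = "") -> str:
    """Replace characters that XML 1.0 forbids (hypothetical helper)."""
    return _ILLEGAL_XML_CHARS.sub(replacement, text)
```

Passing a replacement such as "?" instead of the empty string keeps a visible marker where a character was dropped.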
eelixduppy msg:3698695 9:22 am on Jul 15, 2008 (gmt 0)
Whatever script you are using to parse the feed is where you need to replace the characters. For example, with PHP you can use str_replace(). I, on the other hand, would run the XML through W3C's validator service to see if it is valid XML before using it--if not, then show an alert of some kind.
httpwebwitch msg:3698880 2:36 pm on Jul 15, 2008 (gmt 0)
you can either pasteurize the text to remove/replace those characters, or you can wrap them in a special CDATA placenta.
So, for instance:
<book> <title>The Big Book of &lt;XML&gt; &amp; &#224;cc&#233;nted char&#224;ct&#232;rs</title> </book>
<book> <title><![CDATA[The Big Book of <XML> & àccénted charàctèrs]]></title> </book>
the CDATA is a far better solution
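One caveat when wrapping arbitrary user text: the sequence ]]> inside the text would close the CDATA section early. A minimal Python sketch of a wrapper that guards against this is shown below; the name to_cdata is hypothetical, and the trick of splitting ]]> across two adjacent sections is a common workaround rather than anything proposed in this thread. Note also that CDATA does not make forbidden control characters legal, so they still need to be filtered out first.

```python
def to_cdata(text: str) -> str:
    """Wrap text in a CDATA section (hypothetical helper).

    A literal "]]>" would end the section early, so it is split
    across two adjacent CDATA sections: "]]" closes the first one
    and ">" opens the next.
    """
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"
```

A parser reading the result joins the adjacent sections back into the original string, so consumers of the feed see the text unchanged.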
httpwebwitch msg:3698887 2:39 pm on Jul 15, 2008 (gmt 0)
FYI regarding CDATA: [ ...] w3schools.com
Unless you know that the user-entered data is safe, like it's only EVER going to be an integer or alphanumeric string, then treat it as CDATA and encapsulate it accordingly.
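The check-before-use idea mentioned earlier in the thread can also be approximated locally instead of calling an external validator. The Python sketch below only tests well-formedness (not validity against a schema) using the standard library parser; the function name is_well_formed is an assumption made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Raised for unescaped "&" or "<", forbidden control
        # characters, mismatched tags, and similar problems.
        return False
    return True
```

Running this on the generated feed before publishing it gives a chance to raise an alert instead of serving a feed that chokes its consumers.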