Forum Library, Charter, Moderators: httpwebwitch

XML Development Forum

Splitting Description field
How can I separate these items

 12:26 pm on Dec 11, 2006 (gmt 0)

I have an XML feed which includes a description element like the one below:

<![CDATA[ <img src='http://www.example.com/images/1_thumb.jpg' border='0' /><br/>
Text description of item ]]>

At the moment I'm parsing it so that the entire element is output "as is", with the image, then the line break, then the text description.

Is there any way of parsing this so that I can get the image and the text description as separate elements?

P.S. My parsing knowledge is minimal; at the moment I have just adapted an existing script to do this for me.




 4:28 pm on Dec 11, 2006 (gmt 0)

When you wrap content in a <![CDATA[]]> section, you're asking the XML parser to treat the contents as text. By definition, the contents of a CDATA section are not parsed as markup, and the escaped pseudo-elements within it are not part of the document tree.

That said, there are parser-specific extensions that will read text into nodes, such as saxon:parse(). Check the documentation on your parser to see if such a function is supported.
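To illustrate the point about CDATA being plain text, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. Python is an assumption here (the thread does not name a language), and the `<item>` wrapper and `ITEM_XML` sample are hypothetical, modelled on the description element in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed item, modelled on the description element in the question.
ITEM_XML = """<item>
  <description><![CDATA[ <img src='http://www.example.com/images/1_thumb.jpg' border='0' /><br/>
Text description of item ]]></description>
</item>"""

item = ET.fromstring(ITEM_XML)
description = item.find("description")

# The CDATA content is delivered as a single text string...
cdata_text = description.text

# ...and the <img> and <br/> inside it are NOT child elements of <description>.
child_count = len(list(description))

print(repr(cdata_text))
print("child elements:", child_count)
```

The image tag shows up only as characters inside the string, which is why the existing script outputs the whole element "as is".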


 8:12 pm on Dec 11, 2006 (gmt 0)

From my limited knowledge, that's what I thought. But then the owner of the feed said:

"RSS parser could do it - so could CSS"

Is it really not possible then?



 8:21 pm on Dec 11, 2006 (gmt 0)

CSS or XSL? They do very different things.


 8:37 pm on Dec 11, 2006 (gmt 0)

The quote is verbatim - CSS.

But could it be done with XSL? If so, can you give me an indication how?

[edited by: WebWalla at 8:44 pm (utc) on Dec. 11, 2006]


 4:45 pm on Dec 12, 2006 (gmt 0)

An RSS parser or a CSS parser (i.e. a browser) is consumer-level, and is probably going to be more "forgiving." But an XML parser is supposed to treat the contents of a CDATA section as plain text, not as markup. That is an XML-wide rule, not something unique to XSL.

As I noted, there are extensions to the common parsers which will force them to interpret the contents of a CDATA section as XML nodes, but these are specific to the parser being used.
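Where no such extension is available, one possible workaround is a second pass: take the text that the XML parser hands back and feed it to a separate HTML parser. The sketch below uses Python's standard html.parser module. It is not what rss2html does; `DescriptionSplitter` and `split_description` are hypothetical names, and the fragment is the sample from the question.

```python
from html.parser import HTMLParser


class DescriptionSplitter(HTMLParser):
    """Collect the first image URL and the plain text from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.image_url = None
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; self-closing tags such as
        # <img ... /> are routed here by the default handle_startendtag.
        if tag == "img" and self.image_url is None:
            self.image_url = dict(attrs).get("src")

    def handle_data(self, data):
        # Keep only non-blank runs of text between tags.
        if data.strip():
            self.text_parts.append(data.strip())


def split_description(fragment):
    """Return (image_url, text) for a description fragment (hypothetical helper)."""
    splitter = DescriptionSplitter()
    splitter.feed(fragment)
    splitter.close()  # flush any text still buffered after the last tag
    return splitter.image_url, " ".join(splitter.text_parts)


# The text that an XML parser would return for the CDATA section in the question.
fragment = (
    " <img src='http://www.example.com/images/1_thumb.jpg' border='0' /><br/>\n"
    "Text description of item "
)
image_url, text = split_description(fragment)
print(image_url)
print(text)
```

This keeps the XML parse standards-compliant and treats the embedded HTML as a separate document, so it does not depend on any parser-specific extension.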


 7:58 pm on Dec 12, 2006 (gmt 0)

OK, I get what you're saying.

I'm using the rss2html script to parse this feed. I think I'll just have to wait until the feed owner changes the format.

Thanks for the info.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved