Parsing CDATA sections - possible?

Using xml_parser_create

         

WebWalla

10:27 am on Dec 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I know very little about XML or PHP, but I'm trying to adapt an existing script to parse an XML search feed.

I'm using this code to parse ...


$parser = xml_parser_create();
xml_parse_into_struct($parser,$xmlFeed,$values,$tags);
xml_parser_free($parser);

... but $xmlFeed doesn't include any of the CDATA sections in the feed. Is there any way round this?

Thanks.

Added: the CDATA sections look like this ...


<![CDATA[ this is the info I want ]]>

choster

5:01 pm on Dec 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What is the structure of the incoming XML? All a CDATA block does is instruct the parser to treat the contents as if they were escaped--to ignore tags and entities. It is not an element itself. You should still be able to access the container elements, or to manipulate the text node.
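To illustrate the point about container elements, here is a minimal sketch of reading CDATA text through xml_parse_into_struct. The sample feed and its element names (response, result, title) are made up for the example; the real feed's structure isn't shown in this thread.

```php
<?php
// Hypothetical sample feed; the element names are assumptions, not the real feed.
$xmlFeed = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<response><result><title><![CDATA[News <b>article</b>]]></title></result></response>';

$parser = xml_parser_create();
// Keep tag names as written instead of upper-casing them (the default behaviour).
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parse_into_struct($parser, $xmlFeed, $values, $tags);
// On PHP versions before 8.0 you would also call xml_parser_free($parser) here.

// $tags maps each tag name to the positions of its entries in $values.
// The CDATA text arrives as the ordinary 'value' of the enclosing element.
$titles = [];
foreach ($tags['title'] as $i) {
    if (isset($values[$i]['value'])) {
        $titles[] = $values[$i]['value'];
    }
}
print_r($titles);
```

If the CDATA text shows up in $titles, the parser is handling the sections and the problem lies somewhere else.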

WebWalla

6:51 pm on Dec 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's the Gigablast XML search feed (if you Google for it, you can see the exact structure). I'm using the example given on their page, but with the &raw=9&nrt=0 parameters.

When I do
print_r($values);
all the data is there.

When I do
echo($xmlFeed);
it's all there EXCEPT the sections labeled CDATA.

choster

10:33 pm on Dec 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are you saying that for

<title><![CDATA[News article]]></title>

you get

<title></title>

or that you get nothing?

If the former, I'd check the encoding (UTF-8 vs ISO-8859-1). If the latter, well, maybe someone on the PHP board can help you.
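One more possibility, which nobody in this thread confirmed: if the echo($xmlFeed) output was being viewed in a browser, the browser's HTML parser would treat <![CDATA[ ... ]]> as a comment-like construct and not display it, even though the text is still in the page source. A rough sketch of how to check, using a made-up fragment in place of the real feed:

```php
<?php
// Hypothetical fragment standing in for the real feed.
$xmlFeed = '<title><![CDATA[News article]]></title>';

// Escape the markup so the browser shows it as literal text
// instead of trying to interpret the tags and the CDATA marker.
$visible = htmlspecialchars($xmlFeed, ENT_QUOTES, 'UTF-8');
echo '<pre>' . $visible . '</pre>';
```

If the CDATA sections appear once the output is escaped (or in the browser's View Source), the feed itself was complete all along.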

WebWalla

9:18 am on Dec 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I still don't know what was happening, but I adapted a different script and now it's working.

Thanks.