CaRP-unencoded ampersands validation problem

         

lokjah

5:24 pm on Nov 24, 2004 (gmt 0)




Hi there, I've been working with the CaRP RSS parser and am very pleased with it, except that its output won't validate as XHTML because of unencoded ampersands in the URLs it generates.

I've found this bit of code on line 455 of carpinc.php:

if ($fp=OpenRSSFeed($url)) {
$xml_parser=xml_parser_create(strtoupper($carpconf['encodingin']));

if (strlen($carpconf['encodingout'])) xml_parser_set_option($xml_parser,XML_OPTION_TARGET_ENCODING,$carpconf['encodingout']);
xml_set_object($xml_parser,$rss_parser);
xml_set_element_handler($xml_parser,"startElement","endElement");
xml_set_character_data_handler($xml_parser,"characterData");
$CarpRedirs=array();

$rss_parser->PrepTagPairs($carpconf['desctags']);
while ($data=preg_replace("/&(?!lt|gt|amp|apos|quot|#[0-9]+)(.*\b)/is","&amp;\\1\\2",preg_replace("/\\x00/",'',fread($fp,4096)))) {
if (!xml_parse($xml_parser,$data,feof($fp))) {
CarpError("XML error: ".xml_error_string(xml_get_error_code($xml_parser))." at line ".xml_get_current_line_number($xml_parser));
fclose($fp);
xml_parser_free($xml_parser);
return;
}
$data='';
}

I'm wondering if this bit here: ($data=preg_replace("/&(?!lt|gt|amp|apos|quot|#[0-9]+)(.*\b)/is","&amp;\\1\\2"

is supposed to take care of that? This is the only place in all of the CaRP files where I can find ampersands being encoded as &amp;.

Is anyone here familiar with CaRP, or can anyone tell me whether that is what the code is doing here? If it is, then for some reason the encoding isn't making it into the generated URL.

I realize it can be really tedious to debug, but I'm an extreme novice in PHP and could use the help tremendously (CSS and XHTML are more my bag).

Thanks a lot,

lok

ergophobe

11:09 pm on Nov 25, 2004 (gmt 0)




It looks like it. I'm not sure why it doesn't use one of the built-in PHP functions for the purpose, though.
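For reference, one built-in that covers this is htmlspecialchars(). The snippet below is only a sketch of how a link could be escaped right before it is written into an href attribute; the URL and variable names are made up for illustration and are not taken from CaRP.

```php
<?php
// Hypothetical feed link with a raw ampersand in its query string.
$link = 'http://example.com/item.php?id=1&cat=2';

// htmlspecialchars() escapes &, <, > and, with ENT_QUOTES, both quote types.
$escaped = htmlspecialchars($link, ENT_QUOTES, 'UTF-8');

// Prints: <a href="http://example.com/item.php?id=1&amp;cat=2">item</a>
echo '<a href="' . $escaped . '">item</a>', "\n";
```

Escaping at output time, after the XML parser has already decoded the feed's entities, is what keeps the generated XHTML valid.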

Does it fail for all URLs in $data or just sometimes? If the URL had &#1 in it, I don't think it would get encoded (does # have a meaning in regex?).
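A quick way to check both questions is to run the pattern on a few strings. This is a sketch under two assumptions: that the broken-bar characters in the posted pattern stand for |, and that the replacement was "&amp;" plus the captured text. The helper name is hypothetical. In PCRE, # is an ordinary character unless the x modifier is set, so #[0-9]+ simply matches the start of a numeric reference such as &#1.

```php
<?php
// Assumed reconstruction of the posted pattern (broken bars read as "|").
$pattern = '/&(?!lt|gt|amp|apos|quot|#[0-9]+)(.*\b)/is';

// Hypothetical helper: apply the pattern with "&amp;" as the replacement.
function escapeBareAmpersands($pattern, $text)
{
    return preg_replace($pattern, '&amp;$1', $text);
}

// "&#1" is skipped by the lookahead, so this string is returned unchanged.
$numericRef = escapeBareAmpersands($pattern, 'a &#1; b');

// A single bare ampersand is escaped: x=1&amp;y=2
$oneBare = escapeBareAmpersands($pattern, 'x=1&y=2');

// The greedy (.*\b) swallows the rest of the string after the first match,
// so the second ampersand is left alone: x=1&amp;y=2&z=3
$twoBare = escapeBareAmpersands($pattern, 'x=1&y=2&z=3');

echo $numericRef, "\n", $oneBare, "\n", $twoBare, "\n";
```

So under these assumptions the pattern leaves &#1 untouched, and it only escapes the first bare ampersand in each chunk it is given.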

lokjah

12:27 am on Nov 29, 2004 (gmt 0)




Thanks for the feedback, ergophobe. I actually dumped it and found another parser that I was able to get to validate.

lokjah

4:00 pm on Dec 1, 2004 (gmt 0)




In case anyone else has this problem, I got a response from the developer, and here's the fix:

"What I'd do (and will do in the next update--thanks for pointing out the problem, by the way) is to change this line in that function:

(strlen($link=trim(str_replace('"','&quot;',$link)))?(

to this:

(strlen($link=trim(str_replace('"','&quot;',str_replace('&','&amp;',$link))))?(

Antone"
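The quoted fix nests two str_replace() calls, and the nesting order matters. Below is a small sketch of why, assuming the entity strings in the developer's message were "&quot;" and "&amp;" before the forum software decoded them. The function names and the sample link are hypothetical.

```php
<?php
// Hypothetical helper mirroring the quoted fix: escape & first, then ".
function escapeLink($link)
{
    return trim(str_replace('"', '&quot;', str_replace('&', '&amp;', $link)));
}

// Reversed order, for comparison only: the & inside each new &quot;
// is escaped a second time.
function escapeLinkWrongOrder($link)
{
    return trim(str_replace('&', '&amp;', str_replace('"', '&quot;', $link)));
}

$raw = ' http://example.com/?a=1&b="x" ';

// http://example.com/?a=1&amp;b=&quot;x&quot;
$good = escapeLink($raw);

// http://example.com/?a=1&amp;b=&amp;quot;x&amp;quot;
$bad = escapeLinkWrongOrder($raw);

echo $good, "\n", $bad, "\n";
```

Replacing the ampersands in the innermost call, as the fix does, avoids double-encoding the entities produced by the quote replacement.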

ergophobe

6:42 pm on Dec 1, 2004 (gmt 0)




Thanks for following up!