cURL returning altered output

PHP version output different to standalone cURL

         

brotherhood of LAN

11:15 am on Oct 25, 2007 (gmt 0)


Hello all,

I have a small problem with the output of cURL in PHP when fetching and parsing RSS feeds that contain unusual characters, such as the Euro sign.

Example feed
<snip>

When I run cURL standalone and write the output to a file, the Euro character is retained. When I use PHP/cURL, the Euro sign is converted into two or more garbled characters.
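To illustrate what "two or more garbled characters" usually means, here is a minimal sketch. It assumes the feed is UTF-8 encoded, which the thread does not confirm, and that the mbstring extension is available for the last step. In UTF-8 the Euro sign is three bytes, and if something reads those bytes as a single-byte charset such as Windows-1252, each byte is shown as its own character.

```php
<?php
// The Euro sign (U+20AC) encoded as UTF-8 is the three bytes E2 82 AC.
$euroUtf8 = "\xE2\x82\xAC";

echo strlen($euroUtf8), "\n";   // byte count, not character count: 3
echo bin2hex($euroUtf8), "\n";  // raw bytes in hex: e282ac

// Assumption: mbstring is loaded. Treat the same bytes as Windows-1252
// and re-encode them as UTF-8 for display. Each original byte becomes a
// separate character, which is the typical "garble" (â‚¬).
if (function_exists('mb_convert_encoding')) {
    $garbled = mb_convert_encoding($euroUtf8, 'UTF-8', 'Windows-1252');
    echo $garbled, "\n";
}
```

If the garble looks like this, the bytes themselves are probably intact and only the declared or assumed charset is wrong.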

Is there a particular curl_setopt option I should be using? cURL on its own and file_get_contents() seem to retain the character as is, but my current PHP/cURL function doesn't.

None of the curl_setopt options I currently use relate to encoding; they are just CURLOPT_TIMEOUT and the like.

Any thoughts? Feedback appreciated...

[edited by: brotherhood_of_LAN at 11:17 am (utc) on Oct. 25, 2007]

[edited by: dreamcatcher at 11:21 am (utc) on Oct. 25, 2007]
[edit reason] no urls as per T.O.S [webmasterworld.com].Thanks [/edit]

hughie

11:41 am on Oct 25, 2007 (gmt 0)


I am absolutely no expert on cURL, but this little script works nicely to emulate a browser with a specific charset (I found it somewhere on the php.net/curl site, not sure exactly where).

// Request headers that mimic what a browser would send.
$header = array();
$header[] = "Accept: text/xml,application/xml,application/xhtml+xml,"
          . "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.

Then just use:
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);

No idea whether that will work for your problem but it's worth a go...

cheers,
hughie

brotherhood of LAN

12:02 pm on Oct 25, 2007 (gmt 0)


Thanks for that, though it doesn't seem to cure the problem.

Hmm, I've found the issue. Apparently setting CURLOPT_HEADER to TRUE causes this effect, at least on my version(s).
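For anyone who needs the response headers as well as the feed, here is a minimal sketch of keeping CURLOPT_HEADER on while still passing a clean body to the parser. With CURLOPT_HEADER set to TRUE, the headers are prepended to the returned string, so the data no longer starts at the XML declaration. That this is what upsets the character handling here is an assumption, not something confirmed above. The function names split_response() and fetch_with_headers() are hypothetical, and the 10 second timeout is an arbitrary value.

```php
<?php
// Split a raw response (headers followed by body, as returned when
// CURLOPT_HEADER is true) at the given header size in bytes.
function split_response($raw, $headerSize)
{
    return array(
        'headers' => substr($raw, 0, $headerSize),
        'body'    => (string) substr($raw, $headerSize),
    );
}

// Hypothetical helper: fetch a URL with headers included, then separate
// them from the body so the parser only ever sees the feed itself.
function fetch_with_headers($url)
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the data instead of printing it
    curl_setopt($curl, CURLOPT_HEADER, true);         // prepend response headers to the output
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);          // arbitrary timeout for this sketch

    $raw = curl_exec($curl);
    if ($raw === false) {
        throw new RuntimeException('cURL error: ' . curl_error($curl));
    }

    // Total size in bytes of the header block(s) cURL received.
    $headerSize = curl_getinfo($curl, CURLINFO_HEADER_SIZE);

    return split_response($raw, $headerSize);
}
```

Only the 'body' element would then be handed to the RSS parser. If the headers are not needed at all, leaving CURLOPT_HEADER at its default of FALSE is the simpler fix.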

Cheers.

[edited by: brotherhood_of_LAN at 12:12 pm (utc) on Oct. 25, 2007]