Text encoding conversion

         

iceman22

4:09 pm on Jan 30, 2005 (gmt 0)




I've written an RSS feed parser whose source uses UTF-8 text encoding. When I include that file in a page, it displays properly if the page uses UTF-8. If the page uses ISO-8859-1, accented, umlauted, and similar characters do not display properly.

I'd like to just convert the string in the script from UTF-8 to whatever charset is defined in the script, but I haven't figured out how to do that. Ideally the script would detect the encoding used by the current page, but I don't think that is possible when the PHP script is included into another one, which is its main application.

I've tried using the iconv_get_encoding, iconv_set_encoding and mb_convert_encoding functions, but each call fails with an error saying the function is undefined. This is strange because I use PHP version 4.3.10, which should support these functions.

On an unrelated problem I've spent some time on: I need to download a single line of an offsite HTML file, perhaps using curl or wget. I looked through the options a couple of times but may have missed something. Starting at an offset won't work because the lines above it vary in length.

mincklerstraat

4:48 pm on Jan 30, 2005 (gmt 0)




Neither the mb_ nor the iconv_ functions are enabled by default, even though they're available for your PHP version. For the less-standard sorts of functions, it helps to look at the main manual page about them: [be.php.net...], [be.php.net...]. This is most likely why PHP is telling you those functions aren't defined. I can't help you much further than that; maybe someone else here can.
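A minimal sketch of how a script could cope with this, assuming the goal is UTF-8 to ISO-8859-1 and that the script should keep working whether or not the optional extensions are compiled in. The function name feed_convert_encoding is made up for illustration; mb_convert_encoding(), iconv() and utf8_decode() are real PHP functions, but which of them exist depends on how PHP was built, so each is checked with function_exists() first.

```php
<?php
// Hypothetical helper (not from the original post): convert a UTF-8 string
// to $target using whichever converter this PHP build provides.
function feed_convert_encoding($text, $target = 'ISO-8859-1')
{
    if (strtoupper($target) === 'UTF-8') {
        return $text; // already in the requested encoding
    }
    if (function_exists('mb_convert_encoding')) {
        // mbstring extension: arguments are (string, to, from)
        return mb_convert_encoding($text, $target, 'UTF-8');
    }
    if (function_exists('iconv')) {
        // iconv extension: arguments are (from, to, string)
        $converted = iconv('UTF-8', $target, $text);
        if ($converted !== false) {
            return $converted;
        }
    }
    if (strtoupper($target) === 'ISO-8859-1' && function_exists('utf8_decode')) {
        // Only handles UTF-8 to ISO-8859-1; deprecated in recent PHP releases
        return utf8_decode($text);
    }
    return $text; // no converter available: return the input unchanged
}
```

Usage would be something like echo feed_convert_encoding($title, 'ISO-8859-1'); where $title is a string taken from the feed. Passing the target charset as a parameter sidesteps the problem of detecting the including page's encoding.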

Unrelated problem: you could simply use file() if you have URL wrappers on; it will put the file into an array, line by line. file_get_contents() will just put it all into a single variable.
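A rough sketch of the file() approach, assuming the wanted line is identified by its 1-based line number. The helper name get_line_with_file is hypothetical. Note that file() still loads the whole source into memory before the line is picked out, and reading a URL this way requires the allow_url_fopen setting to be on.

```php
<?php
// Hypothetical helper: return line $lineNumber (1-based) of a file or URL,
// or false if the source cannot be read or has fewer lines than that.
function get_line_with_file($source, $lineNumber)
{
    $lines = file($source); // whole source, one array element per line
    $index = (int) $lineNumber - 1;
    if ($lines === false || !isset($lines[$index])) {
        return false;
    }
    // file() keeps the line ending on each element, so strip it
    return rtrim($lines[$index], "\r\n");
}
```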

iceman22

3:06 am on Jan 31, 2005 (gmt 0)




Thanks for the reply. It doesn't seem like there is anything I can do from my end with those functions. I'd like this parser to have a good level of compatibility so I can make the code for it freely available. Any ideas on the encoding problem would be much appreciated.

On the unrelated problem: at the moment I use file_get_contents() to get the HTML file, then use awk to get the line I need (it's faster than PHP). The problem is that the file is quite large and I just want one line from it. With curl it's possible to start at a specific offset, but because the lines above the one I want vary in size, that wouldn't work.
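One possible way around loading the whole file, sketched under assumptions: open the source with fopen(), read it line by line with fgets(), and stop as soon as the wanted line has been read. The helper name get_line_streaming is hypothetical, and the wanted line is assumed to be known by its 1-based line number. The lines above it still have to be transferred, but the rest of the file need not be read. For a URL this again relies on allow_url_fopen.

```php
<?php
// Hypothetical helper: read a file or URL line by line and stop once
// line $lineNumber (1-based) has been read. Returns false if the source
// cannot be opened or ends before that line.
function get_line_streaming($source, $lineNumber)
{
    $lineNumber = (int) $lineNumber;
    $handle = fopen($source, 'r');
    if ($handle === false) {
        return false;
    }
    $found = false;
    $current = 0;
    while (($line = fgets($handle)) !== false) {
        $current++;
        if ($current === $lineNumber) {
            $found = rtrim($line, "\r\n"); // drop the line ending
            break; // stop reading; later lines are skipped
        }
    }
    fclose($handle);
    return $found;
}
```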