rss not displaying with acute accents

   
1:02 pm on Sep 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi,

accents are causing our rss feed not to be displayed. we encode them correctly as e.g. &uacute; but it gives the error:

Reference to undefined entity 'uacute'. Error processing resource '/rss/news.xml'. Line 55, Position 50

i am puzzled as to why this happens if they are correctly encoded?

7:02 pm on Sep 30, 2005 (gmt 0)

encyclo



What RSS version are you using, jamie? Named HTML character entities such as &uacute; aren't valid in RSS versions other than Netscape RSS 0.91 (not Userland's RSS 0.91). Also, what character encoding are you declaring?

10:12 am on Oct 1, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi encyclo,

we use:

$encoding =(string) 'ISO-8859-1';
$version = '1.0';

thanks

7:10 pm on Oct 1, 2005 (gmt 0)

encyclo



Are you getting your text out of a database? If so, can you try using the actual accented characters rather than the HTML entities? If you have your encoding correct as ISO-8859-1, then letters such as ú, é, etc. should be OK.
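
If the text comes out of the database with the entities already in it, one option is to decode them just before the feed is written. Here is a minimal sketch of that idea in Python (the function name `entities_to_text` is made up for illustration; in PHP, `html_entity_decode()` plays a similar role). It decodes the named entities to real characters, then re-escapes the few characters that XML itself reserves, since the decoding step also turns &amp; back into a bare ampersand:

```python
from html import unescape
from xml.sax.saxutils import escape


def entities_to_text(fragment: str) -> str:
    """Turn HTML entities such as &uacute; into literal characters,
    keeping the result safe to place inside an XML element."""
    decoded = unescape(fragment)  # '&uacute;' becomes 'ú', '&amp;' becomes '&'
    return escape(decoded)        # re-escape &, < and > for XML


print(entities_to_text("M&uacute;sica &amp; noticias"))
# prints: Música &amp; noticias
```

The output still has to be written in the encoding the feed declares, so this only helps if the declaration and the actual bytes agree.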

The problem is that the named character entities were made for HTML, not XML - the early Netscape RSS specification had a doctype which declared a bunch of character entities (and so they are valid in such a context), but other RSS versions have no doctype and so no entities beyond the five that XML itself predefines (&amp;, &lt;, &gt;, &quot; and &apos;).
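
You can see the difference with any XML parser. As a rough illustration, here is a small Python check using the standard library's ElementTree (the helper name `parses` is just for this example). It feeds the parser the same title three ways: with the HTML entity, with a numeric character reference, and with the literal character.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if a non-validating XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses("<title>M&uacute;sica</title>"))  # HTML entity, no doctype: False
print(parses("<title>M&#250;sica</title>"))    # numeric reference: True
print(parses("<title>Música</title>"))         # literal character: True
```

Only the first form fails, for the same reason as the "undefined entity" error in your feed. Numeric references work in any XML document because they need no declaration.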

You say also that you are using RSS 1.0? That's a pretty rare beast these days with its RDF syntax. You might be better off using RSS 2.0, which is plain XML without the RDF layer, or if you can't change the character entities and have only basic needs, try a Netscape 0.91 format:

<?php header('Content-type: text/xml; charset=ISO-8859-1');
echo '<?xml version="1.0" encoding="ISO-8859-1"?>';?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
<channel>
<title>Feed title</title>
<link>http://www.example.com/</link>
<description>Feed description</description>
<lastBuildDate><?php echo gmdate( "D, d M Y H:i:s", getlastmod());?> GMT</lastBuildDate>
<language>fr</language>

<item>
<title>Item title</title>
<link>http://www.example.com/page/</link>
</item>
</channel>
</rss>

Don't forget the Feed Validator [feedvalidator.org] to see how things come out.

I have found that the best approach is to encode the feed as UTF-8 without character entities and as an RSS 2.0 feed. As always, YMMV ;)
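
For what it's worth, here is roughly what that approach looks like when the feed is built with an XML library instead of string concatenation. This is only a sketch in Python, assuming Python 3.8 or later for the `xml_declaration` argument; the function name `build_feed` and the sample titles are invented for the example. The library writes accented characters as UTF-8 bytes and escapes &, < and > itself, so no entities are needed.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, items):
    """Build a minimal RSS 2.0 feed as UTF-8 bytes.

    items is a list of (item_title, item_link) pairs.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    # encoding="utf-8" returns bytes with literal accented characters
    return ET.tostring(rss, encoding="utf-8", xml_declaration=True)


feed = build_feed(
    "Título del feed",
    "http://www.example.com/",
    "Descripción del feed",
    [("Canción nueva", "http://www.example.com/page/")],
)
print(feed.decode("utf-8"))
```

The bytes would then need to be served with a matching charset in the Content-type header, just as the header() call in the 0.91 example does for ISO-8859-1.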

6:48 am on Oct 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



encyclo,

thanks for the help, i've had to tone it down to 0.91 to include the accents.

the news feed is the same one that is published on our site, and it seems a shame to take the accents off. the 0.91 works fine though.

i couldn't quite get the hang of UTF - i started reading and realised i need to delve into it a bit more deeply. for the moment though it's fine.

cheers