Forum Moderators: bill & werty

rss not displaying with acute accents

     
1:02 pm on Sep 30, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:July 24, 2002
posts:1124
votes: 0


hi,

accents are causing our rss feed not to be displayed. we encode them correctly as e.g. &uacute; but it gives the error:

Reference to undefined entity 'uacute'. Error processing resource '/rss/news.xml'. Line 55, Position 50

i am puzzled as to why this happens if they are correctly encoded?

7:02 pm on Sept 30, 2005 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9068
votes: 4


What RSS version are you using, jamie? Named character entities such as &uacute; aren't valid in RSS versions other than Netscape RSS 0.91 (not Userland's RSS 0.91). Also, what character encoding are you declaring?

10:12 am on Oct 1, 2005 (gmt 0)

Senior Member


hi encyclo,

we use:

$encoding = 'ISO-8859-1';
$version = '1.0';

thanks

7:10 pm on Oct 1, 2005 (gmt 0)

Senior Member from CA 


Are you getting your text out of a database? If so, can you try using the actual accented characters rather than the HTML entities? If you have your encoding correct as ISO-8859-1, then accented letters such as ú should be OK.

The problem is that the character entities were made for HTML, not XML - the early Netscape RSS specification had a doctype which declared a bunch of character entities (and so they are valid in such a context), but other RSS versions have no doctype and so no entities.
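
For anyone who wants to try the "actual characters" route, here is a minimal sketch, assuming the text arrives from the database with HTML entities already in it. The helper name feed_text() and the $row array in the usage comment are made up for illustration and are not from jamie's code. It decodes the HTML entities into real characters and then escapes only what XML itself reserves. Entities with no ISO-8859-1 equivalent are not handled here.

```php
<?php
// Hypothetical helper, not from the thread: make text that contains HTML
// entities safe for an XML feed that declares no DTD.
function feed_text(string $html, string $charset = 'ISO-8859-1'): string
{
    // Decode HTML entities into real characters: "&uacute;" becomes "ú"
    // (the single byte 0xFA when $charset is ISO-8859-1).
    $text = html_entity_decode($html, ENT_QUOTES | ENT_HTML401, $charset);

    // Escape again, but only the characters XML itself reserves
    // (&, <, > and both quote marks), using XML's predefined entities.
    return htmlspecialchars($text, ENT_QUOTES | ENT_XML1, $charset);
}

// Usage, assuming $row holds one news record:
// echo '<title>' . feed_text($row['title']) . '</title>';
```

The second step matters because decoding also turns &amp; back into a bare ampersand, which would break the XML again if it were written out as is.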

You say also that you are using RSS 1.0? That's a pretty rare beast these days with RDF syntax. You might be better off using RSS 2.0, which is plain XML without the RDF layer, or if you can't change the character entities and have only basic needs, try a Netscape 0.91 format:

<?php
// Declare the same encoding in the HTTP header and in the XML declaration.
header('Content-type: text/xml; charset=ISO-8859-1');
echo '<?xml version="1.0" encoding="ISO-8859-1"?>', "\n";
?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
<channel>
<title>Feed title</title>
<link>http://www.example.com/</link>
<description>Feed description</description>
<lastBuildDate><?php echo gmdate( "D, d M Y H:i:s", getlastmod());?> GMT</lastBuildDate>
<language>fr</language>

<item>
<title>Item title</title>
<link>http://www.example.com/page/</link>
</item>
</channel>
</rss>

Don't forget the Feed Validator [feedvalidator.org] to see how things come out.

I have found that the best approach is to encode the feed as UTF-8 without character entities and as an RSS 2.0 feed. As always, YMMV ;)
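
A rough sketch of that UTF-8 / RSS 2.0 approach, assuming the source text is stored as ISO-8859-1 without HTML entities and that PHP's iconv extension is available. The function build_rss2() and the shape of the $items array are assumptions made for this example, not anything from the thread, and only the basic channel and item elements are shown.

```php
<?php
// Sketch only: build a minimal RSS 2.0 document in UTF-8 from strings stored
// as ISO-8859-1. build_rss2() and the $items array shape are assumptions.
function build_rss2(string $title, string $link, string $description, array $items): string
{
    // Convert a Latin-1 string to UTF-8, then escape XML's reserved characters.
    $x = function (string $latin1): string {
        $utf8 = (string) iconv('ISO-8859-1', 'UTF-8', $latin1);
        return htmlspecialchars($utf8, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };

    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<rss version="2.0">' . "\n<channel>\n";
    $xml .= '<title>' . $x($title) . "</title>\n";
    $xml .= '<link>' . $x($link) . "</link>\n";
    $xml .= '<description>' . $x($description) . "</description>\n";
    foreach ($items as $item) {
        $xml .= "<item>\n";
        $xml .= '<title>' . $x($item['title']) . "</title>\n";
        $xml .= '<link>' . $x($item['link']) . "</link>\n";
        $xml .= "</item>\n";
    }
    return $xml . "</channel>\n</rss>\n";
}

// Usage, with the HTTP header declaring the same encoding as the document:
// header('Content-type: text/xml; charset=UTF-8');
// echo build_rss2('Feed title', 'http://www.example.com/', 'Feed description', $items);
```

Because the accented letters are written as real UTF-8 bytes, no doctype and no named entities are needed, so the same output should work in any RSS 2.0 reader.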

6:48 am on Oct 7, 2005 (gmt 0)

Senior Member


encyclo,

thanks for the help, i've had to tone it down to 0.91 to include the accents.

the news feed is the same one which is published in our site and it seems a shame to take the accents off. the 0.91 works fine though.

i couldn't quite get the hang of UTF - i started reading and realise i need to delve into it a bit more deeply. for the moment though it's fine.

cheers