Google Directory search fails with diacritics

encoding issue



2:06 am on Aug 11, 2003 (gmt 0)


Go to the Google Directory World category [directory.google.com...]. Open the German category [directory.google.com...] and search within that category. Then try the same search in the French category [directory.google.com...]. The French one doesn't work at all.


10:40 pm on Aug 11, 2003 (gmt 0)

g1smd (WebmasterWorld Senior Member)

The ODP switched the RDF over to UTF-8 some time ago, but there have been other minor encoding issues from time to time in the directory database itself.

The current plan is to convert everything to UTF-8. That work has been ongoing alongside the server upgrades, and it started quite some time ago for some categories. There were a few mismatches in the accepted character sets for some forms, and some bad data may have found its way in somewhere. Additionally, it isn't known how Google actually uses the data they get from the ODP. Maybe they don't start afresh with each new RDF but simply spider the data file for changes and apply them to their existing data. That could also account for things getting out of step. I don't know any of the technical details, but you might find some hints somewhere in the notes linked from [rodan.ncc.com...] etc.
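
To illustrate the kind of mismatch described above, here is a minimal Python sketch. It is only an assumption about what might be going wrong, not a description of how Google or the ODP actually process the data. The word "Société" and the choice of ISO-8859-1 as the legacy encoding are hypothetical examples; the point is just that the same text stored under one encoding and searched for under another will not match at the byte level.

```python
# Hypothetical illustration: the same word held under two different encodings.
word = "Société"

stored_latin1 = word.encode("iso-8859-1")  # assumed legacy single-byte storage
query_utf8 = word.encode("utf-8")          # assumed UTF-8 query from a form

# A naive byte-level comparison fails even though the text is identical,
# because "é" is one byte in ISO-8859-1 and two bytes in UTF-8.
naive_match = query_utf8 == stored_latin1

# Decoding UTF-8 bytes with the wrong charset produces mojibake.
mojibake = query_utf8.decode("iso-8859-1")

# Decoding each side with its real encoding restores the match.
correct_match = query_utf8.decode("utf-8") == stored_latin1.decode("iso-8859-1")

# Plain ASCII words are byte-identical in both encodings, so they still match.
ascii_match = "Sport".encode("utf-8") == "Sport".encode("iso-8859-1")

print(naive_match, correct_match, ascii_match)
print(mojibake)
```

If something like this is happening, searches for plain ASCII terms would keep working while terms with accented characters would fail, which would be consistent with a search that breaks only on diacritics.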

