encyclo - 4:15 pm on Jul 19, 2006 (gmt 0)
A perfect example is the very first baby-step of semantic metadata in documents: the meta keywords tag. Google (or any other search engine or classification mechanism) simply cannot rely on this metadata being useful or descriptive, as it is abused far more often than it is used correctly. Google's ranking mechanisms were the first to be the antithesis of the semantic web ideal: they heavily discount the document's metadata, and even the document's contents, and assign relevance based on third-party data such as inbound links (this is simplifying Google's algo to the extreme, but is basically true). As it is, it is more often the search engine which provides the semantics via its algo, rather than the utopian RDF/metadata approach. This isn't Google being arrogant; Mr. Norvig is simply describing the state of affairs on the web today. Useful reading: Metacrap: Putting the torch to seven straw-men of the meta-utopia [well.com] (an old classic from 2001)
Webmaster incompetence (in a technical sense, as mentioned by Mr. Norvig) is only one aspect of the problem facing a semantic web. A bigger problem is webmaster deception: the metadata contained within a document cannot be relied upon to describe the document's contents, because the publisher of that document may be exaggerating, falsifying or manipulating it.
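To make the trust problem concrete, here is a rough Python sketch of the kind of sanity check a classifier might run before believing a meta keywords tag: what fraction of the declared keywords actually appear in the page's own text? This is only an illustration under my own assumptions. The names KeywordAudit and keyword_support are made up for this post, the matching is naive word lookup, and it is not a description of how Google or any real engine works.

```python
import re
from html.parser import HTMLParser


class KeywordAudit(HTMLParser):
    """Collects declared meta keywords and the words found in the page text."""

    def __init__(self):
        super().__init__()
        self.keywords = []   # phrases declared in <meta name="keywords">
        self.words = set()   # lowercase words seen in text nodes
        self._skip = 0       # depth inside <script>/<style>, whose text is ignored

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords += [k.strip().lower() for k in content.split(",") if k.strip()]
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.update(re.findall(r"[a-z0-9]+", data.lower()))


def keyword_support(html):
    """Return the fraction of declared keywords whose words all occur in the
    page text, or None if the page declares no keywords (hypothetical helper)."""
    audit = KeywordAudit()
    audit.feed(html)
    audit.close()
    if not audit.keywords:
        return None
    supported = 0
    for phrase in audit.keywords:
        tokens = re.findall(r"[a-z0-9]+", phrase)
        if tokens and all(t in audit.words for t in tokens):
            supported += 1
    return supported / len(audit.keywords)


page = (
    '<html><head><meta name="keywords" content="cheap flights, casino, gardening">'
    "</head><body><p>Notes on gardening in clay soil.</p></body></html>"
)
print(keyword_support(page))  # only "gardening" is backed by the page text
```

Of course a deceptive publisher can stuff the body text as easily as the tag, which is exactly why a check like this only catches the laziest abuse and why third-party signals such as inbound links end up carrying the weight.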