
Forum Moderators: ergophobe


Semantic Data Extractor

Semantically rich HTML documents.

     
2:48 pm on Nov 25, 2006 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member pageoneresults is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 27, 2001
posts:12169
votes: 55


The aim is to show that semantically rich HTML adds value to your code: it allows better use of CSS and makes your HTML intelligible to a wider range of user agents (especially search engine bots).

Have you been using this tool to your advantage? ;)

Semantic Data Extractor
[w3.org...]

5:54 pm on Nov 25, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member jtara is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 26, 2005
posts:3041
votes: 0


I don't see how anybody could be using this tool to their advantage, because it is broken. That is, it just plain does not work; it returns an error message.

I assume it worked until some recent update...

(I am referring to the demo on the website.)

7:13 pm on Nov 25, 2006 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9068
votes: 4


Works just fine for me, jtara. What error message are you getting? Bear in mind that the tool expects a full URI to a resource, not just the domain name, i.e. http://www.example.com/ rather than www.example.com.
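
To illustrate the difference between a full URI and a bare domain name, here is a minimal Python sketch. The helper name is_full_uri is made up for this example, and this is not how the W3C tool itself checks its input; it only shows what "full URI" means in practice.

```python
from urllib.parse import urlparse


def is_full_uri(value: str) -> bool:
    """Return True when value has an http(s) scheme and a host part.

    Hypothetical helper for illustration only; the W3C tool's own
    input handling is not documented in this thread.
    """
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_full_uri("http://www.example.com/"))  # full URI: scheme + host
print(is_full_uri("www.example.com"))          # bare domain: no scheme
```

Without a scheme, urlparse treats www.example.com as a path rather than a host, which is why the second call fails the check.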

The tool itself is a simple and interesting little utility for checking whether the correct meaning can be extracted from the way a page uses markup. It can be useful in pointing out potentially confusing associations. In one example I tried, the contents of a sidebar (using h4 elements for each header) were seen as being appended to the final node created by the article links in the main content area (which were marked up with h3 elements). The fix would be to have an h2 or h3 element introducing the sidebar sub-headings.
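
The sidebar problem described above can be reproduced with a rough Python sketch. It assumes a simple rule (a heading belongs to the nearest preceding heading of a higher rank), which may not match the extractor's actual algorithm. The sample page, the class HeadingOutline, and the helpers outline and parent_of are all invented for illustration.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def outline(html):
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.headings


def parent_of(headings, index):
    """Nearest preceding heading with a smaller level number, or None."""
    level = headings[index][0]
    for lvl, text in reversed(headings[:index]):
        if lvl < level:
            return text
    return None


# Hypothetical page: article links use h3, the sidebar jumps straight to h4.
PAGE = """
<h1>Site</h1>
<h2>Articles</h2>
<h3>First article</h3>
<h3>Last article</h3>
<div id="sidebar">
<h4>Links</h4>
</div>
"""

heads = outline(PAGE)
links = [text for _, text in heads].index("Links")
print(parent_of(heads, links))  # the sidebar heading hangs off "Last article"

# Suggested fix: introduce the sidebar with its own h3.
FIXED = PAGE.replace("<h4>Links</h4>", "<h3>Sidebar</h3>\n<h4>Links</h4>")
fixed_heads = outline(FIXED)
fixed_links = [text for _, text in fixed_heads].index("Links")
print(parent_of(fixed_heads, fixed_links))  # now "Sidebar"
```

Under that assumed rule, the h4 has nowhere to attach except the last h3 in the main content, which is the confusing association the tool pointed out.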
8:21 pm on Nov 25, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member jtara is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 26, 2005
posts:3041
votes: 0


I get this error message:

Using org.apache.xerces.parsers.SAXParser
Exception net.sf.saxon.trans.DynamicError: org.xml.sax.SAXParseException: Content is not allowed in prolog.
org.xml.sax.SAXParseException: Content is not allowed in prolog.
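
For context, Xerces typically reports "Content is not allowed in prolog" when it finds non-whitespace text before the XML declaration or root element, for instance when the fetched resource is not XML at all. The cause in this particular case is not known from the thread. The Python sketch below only reproduces the same class of failure with the standard library parser, whose error wording differs from Xerces; the sample documents and the helper parses are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical inputs: a well-formed document, and the same document
# with stray text in front of the XML declaration.
GOOD = "<?xml version='1.0'?><html><body/></html>"
BAD = "stray text" + GOOD


def parses(doc: str) -> bool:
    """Return True if doc is well-formed XML, False if the parser rejects it."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(parses(GOOD))
print(parses(BAD))
```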
8:36 pm on Nov 25, 2006 (gmt 0)

Moderator from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14172
votes: 204


It shows the metadata and the organizational outline for the page. This is useful, and it's also available as an option when you validate your HTML.