Semantic Data Extractor


pageoneresults - 10:49 pm on Feb 17, 2009 (gmt 0)


One thing I don't understand is the error messages. I don't know if they are reporting problems within the document or the network.

The frigging error codes don't help much when all you want is semantic information.

Would that be this particular error?

Using org.apache.xerces.parsers.SAXParser
Exception net.sf.saxon.trans.DynamicError: org.xml.sax.SAXParseException: Content is not allowed in prolog.
org.xml.sax.SAXParseException: Content is not allowed in prolog.
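For what it's worth, "Content is not allowed in prolog" is the parser saying it found something other than markup before the root element, so it is a complaint about the bytes it was handed, not about the network as such. Here is a rough sketch of the same kind of failure using Python's built-in SAX parser. It is only an analogy: the function name `first_parse_error` is made up for illustration, and Python's expat backend words the error differently from Xerces.

```python
import xml.sax


def first_parse_error(payload: bytes):
    """Return the parser's complaint, or None if the payload is well-formed XML."""
    try:
        # ContentHandler() is a do-nothing handler; we only care whether parsing succeeds.
        xml.sax.parseString(payload, xml.sax.ContentHandler())
    except xml.sax.SAXParseException as exc:
        return f"line {exc.getLineNumber()}: {exc.getMessage()}"
    return None


# Well-formed markup parses cleanly.
print(first_parse_error(b"<html><body/></html>"))

# Plain text where markup was expected (an error page, say) fails at the very start.
print(first_parse_error(b"Not Found: the requested document could not be fetched"))
```

If the tool ends up fetching the wrong thing, plain text like that second payload is the sort of input that would trip a strict XML parser on the first character.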

That is because you are invoking the tool a second time and sending an encoded URI on that second trip. You may have to enter the URI into the field again and clear up the encoding issues. Even then, there appears to be a caching mechanism at play. I've had to restart my session to get that thing to extract the latest document changes. ;)
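To show what an encoded URI on a second trip can look like, here is a minimal sketch in Python. It assumes the problem is ordinary percent-encoding applied twice, which is a guess about the tool's behavior, not something confirmed. The `example.com` address and the helper name `fully_decode` are hypothetical.

```python
from urllib.parse import quote, unquote


def fully_decode(uri: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the string stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(uri)
        if decoded == uri:
            break
        uri = decoded
    return uri


original = "http://www.example.com/page.htm?id=1"  # hypothetical address
once = quote(original, safe="")   # what a form would normally send
twice = quote(once, safe="")      # what a second trip might send

print(once)                 # http%3A%2F%2Fwww.example.com%2Fpage.htm%3Fid%3D1
print(twice)                # the % signs themselves become %25
print(fully_decode(twice))  # back to the original address
```

Decoding until nothing changes is a blunt fix: it would also undo any escapes the address legitimately needs, so retyping the plain URI into the field is still the safer route.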

Pssst, we already built a better one. It just needs to be converted to jQuery and is in the pipeline for production. That tool gave me all sorts of ideas back in the day. :)


Thread source: http://www.webmasterworld.com/accessibility_usability/3837274.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com