httpwebwitch

msg:3766458 | 9:00 pm on Oct 15, 2008 (gmt 0) |
The self-closed format is the "proper" format for a node with no children. But I know what you mean - some HTML tags (like <script> and <textarea>) mustn't be closed like that, or Bad Things May Happen. I have that problem with a certain .NET XSLT parser. It has problems with all empty nodes - and it totally barfs on "<textarea></textarea>". There is no config option available to fix it, so we've had to put a space in all our nodes to keep them from self-closing:
<node> </node>
It's a nasty solution that has caused several other problems further up the stack.
You might also try:
<node>&NULLENTITY;</node>
where NULLENTITY is declared to be NULL (or rather, an empty string) in the DTD, like this:
<!ENTITY NULLENTITY "">
Not being a Javaist, I don't know if any of the above will work.
|
Fotiman

msg:3766466 | 9:14 pm on Oct 15, 2008 (gmt 0) |
Thanks. In our case, the output is an RSS feed. But the company that ingests it must not be using a standard parser. Thanks for the suggestions. If I can't find a way to force the end tag to be generated, then perhaps an empty string will work.
|
httpwebwitch

msg:3767095 | 2:32 pm on Oct 16, 2008 (gmt 0) |
If you have the XML as a string, maybe there's a reliable way to do it with a regex replace(). Something like: match <([^\s]*)([^>]*)/> and replace with <$1$2></$1>
|
Fotiman

msg:3773037 | 6:53 pm on Oct 24, 2008 (gmt 0) |
Thanks. I'm using Java's String.replaceAll method like this: s = s.replaceAll("<([^\\s]*)([^>]*)/>", "<$1$2></$1>");
|
| That regex turned this: <media:thumbnail height="100" url="http://example.com/a.jpg" width="133"/> into this: <media:thumbnail height="100" url="http://example.com/a.jpg" width="133"><//media:keywords>> Close, but not quite. Any suggestions?
|
httpwebwitch

msg:3773076 | 7:52 pm on Oct 24, 2008 (gmt 0) |
| <media:thumbnail height="100" url="http://example.com/a.jpg" width="133"><//media:keywords>> |
| OK, that's just weird. Where did "/media:keywords>" come from? It's not in the matched string; in that spot should be "media:thumbnail". Is something amiss with the Java replaceAll() method?
|
Fotiman

msg:3773098 | 8:18 pm on Oct 24, 2008 (gmt 0) |
The element before this one. Here's a more complete XML snippet: <media:keywords>Example</media:keywords> <media:thumbnail height="100" url="http://example.com/a.jpg" width="133"><//media:keywords>>
|
Fotiman

msg:3773104 | 8:23 pm on Oct 24, 2008 (gmt 0) |
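<([^\\s]*) matches "<" followed by any non-whitespace characters (including "/"), so the match can begin at the previous element's closing tag and group 1 captures its slash and name. Perhaps I need this: <([^\\s/]*)([^>]*)/> ?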
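To make the difference concrete, here is a minimal sketch that runs both patterns from this thread over the XML snippet posted above. The class name SelfClosingRegexDemo and the helper expand() are made up for illustration; only String.replaceAll and the two patterns come from the thread.

```java
class SelfClosingRegexDemo {
    // Pattern first suggested in the thread: group 1 accepts any
    // non-whitespace character, so it can also swallow "/" and ">".
    static final String ORIGINAL = "<([^\\s]*)([^>]*)/>";

    // Revised pattern: "/" is excluded from group 1, so a match can no
    // longer begin at a closing tag such as </media:keywords>.
    static final String REVISED = "<([^\\s/]*)([^>]*)/>";

    // Hypothetical helper: rewrites <name attrs/> as <name attrs></name>.
    static String expand(String xml, String pattern) {
        return xml.replaceAll(pattern, "<$1$2></$1>");
    }

    public static void main(String[] args) {
        String xml = "<media:keywords>Example</media:keywords> "
                + "<media:thumbnail height=\"100\" "
                + "url=\"http://example.com/a.jpg\" width=\"133\"/>";

        // With ORIGINAL, the match starts before the thumbnail element,
        // so the generated end tag is built from the wrong text.
        System.out.println(expand(xml, ORIGINAL));

        // With REVISED, only the thumbnail element is rewritten.
        System.out.println(expand(xml, REVISED));
    }
}
```

With the revised pattern the thumbnail element should come out with a matching </media:thumbnail> end tag, while the keywords element is left untouched.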
|
Fotiman

msg:3773118 | 8:45 pm on Oct 24, 2008 (gmt 0) |
Just gave it a try and that seemed to be it. :)
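One caveat for anyone reusing this: the pattern above still lets group 1 run across ">" and "<", so it may misfire when a self-closing tag directly follows text with no whitespace in between (for example <a>text<b/></a>). The sketch below is a stricter variant, not something tested against the feed in this thread. The class name, method name, and tightened pattern are assumptions for illustration. It is still a regex, so it will not cope with comments, CDATA sections, or a ">" inside an attribute value.

```java
class SelfClosingExpander {
    // Assumed stricter pattern (not from the thread):
    //   group 1: tag name, at least one char, no whitespace, "/", "<" or ">"
    //   group 2: attributes, lazily matched, never crossing "<" or ">"
    static final String STRICT = "<([^\\s/<>]+)([^<>]*?)/>";

    // Rewrites <name attrs/> as <name attrs></name>.
    static String expandSelfClosing(String xml) {
        return xml.replaceAll(STRICT, "<$1$2></$1>");
    }

    public static void main(String[] args) {
        // No whitespace between the text and the self-closing tag.
        System.out.println(expandSelfClosing("<a>text<b/></a>"));

        // The element from this thread, with "/" characters inside the URL.
        System.out.println(expandSelfClosing(
                "<media:thumbnail height=\"100\" "
                + "url=\"http://example.com/a.jpg\" width=\"133\"/>"));
    }
}
```

If the XML is available as a parsed document rather than a string, configuring the serializer (where it allows that) is probably safer than any regex.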
|
httpwebwitch

msg:3773226 | 2:07 am on Oct 25, 2008 (gmt 0) |
excellent!
|
|