Forum Library, Charter, Moderators: httpwebwitch

XML Development Forum

Are XML and JSON homeomorphic?
thus is one or the other redundant?

 8:39 pm on Mar 17, 2008 (gmt 0)

I have functions that convert XML into JSON. I also have functions that convert JSON into XML. I have reasons for using each, which generally align with "XML on the server, JSON in the client". The choice of one format or another has more to do with the process that will be interpreting it than the capabilities of the format itself. I use JSON when the recipient of the data is JavaScript. I use XML when the recipient is PHP and/or XSLT. One might suggest that being able to translate one format into another implies that the two are equivalent carriers for data, i.e. anything JSON can do, XML can do equally, and vice versa.

But being able to translate one thing to another does not make them equivalent. I can easily translate most phrases from English into French, but English doesn't provide the nuances of gender-specific nouns, nor does French allow the fluently haphazard assembly of adjective and noun phrases that English does (for some excellent examples, read the back of any Canadian cereal box). In language, things are translatable, but there can be points in A which do not map to B, just as we in English may not comprehend the nuances of the Inuit Yup'ik language and its famous lexicon describing snow.

But I don't think linguistics is the right framework in which to compare XML and JSON. It's just a handy metaphor.

As an aside, I could use XML in the client, and I could use JSON on the server. There are libraries for each that make it easier. The reason I use a mixture of XML and JSON in a project is the same reason I dust off my français when visiting Montréal - sure I could get by speaking English (BTW, it helps to yell things slowly as though the listener is an idiot), but life is easier when you use the prevalent local dialect.

How can we measure if two complex structures are equivalent? There is a branch of mathematics devoted to it. When I translate XML into JSON, I am changing the "shape" of the data, putting the same stuff into a form in which each data point in one is mapped to a data point in the other. In topology, this is called morphing.

A homeomorphism is a topological isomorphism. In the field of topology, a homeomorphism is a special isomorphism between topological spaces which respects topological properties. Homeomorphic shapes are those that can be mapped to each other through morphing changes that stretch space without tearing it apart or sticking pieces together. In other words, in two homeomorphic shapes A and B, everything that is in A can fit in B, and everything in B can fit in A. They may need to stretch, compress, twist or fold, but it can be done.

A simple example of a homeomorphism is a piece of paper, and an origami crane. The paper can be folded to create the crane, and the crane can be unfolded to create the paper. In this simple homeomorphism, the morphing operation is a mere series of folds.

Another example of a homeomorphism is a coffee mug (with a handle) and a donut or torus. By stretching the space defining the surface of each, one can map every point on a cup to a corresponding point on the torus, and a path drawn between any two points on one can be mapped to a path on the other, and these paths will intersect identically on each. An animation of this morph is provided here [en.wikipedia.org]

Now the question I ask, again, is: are XML and JSON homeomorphic? Are they topological isomorphisms? Can anything in XML be represented in JSON, and vice versa, and can the two be morphed back and forth into each other's "shape", losslessly?

I do not have the necessary mathematical chops to tackle this. I do have some pretty hand-drawn diagrams of coffee cups and tori which prove nothing, but are nice to look at.

Now having thought of the question, I pose it (post it) to the world's most intelligent online forum [webmasterworld.com].


If the answer is false, then these translation functions I've been using are inherently lossy, one way or another. It would be nice to identify how XML is being torn or stuck onto itself in order to create JSON, or vice versa, so I (we) know how not to send data betwixt the two.

If the answer is true, then we know that a perfect transfer of data between one and the other is always possible. But it also proves that the world no longer needs both formats, doesn't it?



 9:10 pm on Mar 17, 2008 (gmt 0)

I suspect that comparison is like the topic of metamerism in color management (which is something I know about, and is very similar to what you are talking about).

<digression mode="pedantic">Metamerism was first described to me as the ability of a material to retain its perceived color in different lighting conditions. It is usually [possibly incorrectly] defined as not matching in differing light sources. One of the things about English is that it is a constantly-morphing language, and the word will probably soon mean exactly the opposite of its original meaning.</digression>

In any case, when you have a blue backdrop for a theater stage that will only be used in a particular show, lit by a particular floodlight, then metamerism isn't important. It will always be blue. However, if the production will be held in outdoor arenas, or the production company will not be able to control the lighting in their venues, metamerism becomes more important.

It's all in how it will be used. For example, you can't XSLT JSON, and the available pool of parsers and languages with handling libraries/constructs is different, so, you could say the answer is false.

If, however, you mean only the simple transfer and representation of textual data entities, then the answer is most likely true.

How's that for wishy-washy?


 1:42 am on Mar 18, 2008 (gmt 0)

I think XML is probably more capable as a format than JSON, but you should use JSON for web app related scripting anyway.
The thing is, you can do really complicated things with XML, to the point where it's pretty much a programming language as far as I understand it.
JSON is really nice because it's a quick, easy way to pass data objects around (easy to read with PHP [us3.php.net] and of course with JavaScript), which is usually what you'll be doing with web apps.

(Now if you're talking about a web-based API, then that's different because people expect to get XML data from those usually.)


 2:01 am on Mar 18, 2008 (gmt 0)

The thing is, you can do really complicated things with XML, to the point where it's pretty much a programming language as far as I understand it.

i'm not sure this is really the issue wrt homeomorphism.
it is the subset of xml required to represent the conversion from json that is relevant here.


 3:05 am on Mar 18, 2008 (gmt 0)

If the answer is true, then we know that a perfect transfer of data between one and the other is always possible. But it also proves that the world no longer needs both formats, doesn't it?

I could say: If I can prove that one can drive framing nails with a sledgehammer, this proves that the world no longer needs framing hammers, doesn't it?

edited by inactivist to remove pointless verbiage


 10:00 am on Mar 18, 2008 (gmt 0)

Same as Inactivist... I'll say "True" but it doesn't mean that both (or one of them) are useless.

The animation you linked to was made with POV-Ray; it reminds me of the old days, when I was not doing web design :) This app is very nice and very good for learning C++-like coding.


 10:50 am on Mar 18, 2008 (gmt 0)

> but English doesn't provide the nuances <

That might even apply to artificial programming languages, which are live in the same way, and subject to 'dialects' of their own. The same function, by the same name or another, might be different from one language or library to another.

In live spoken language, expectations and connotations are included in many words. Translation is not science so much as art. It will always be so. Greek says this. The modern Greek says that. And translators still disagree how it ought to read in Italian. And so on. A lot of that even goes to lost senses, lost definitions, of words even more ancient than those in use.

But what connotations or expectations, what lost senses, would be different between a mechanical translation of XML to JSON and vice versa? I don't think it's comparable.

I would think it obvious that XML is equivalent to JSON. But that assumes the XML and JSON are simply representing data hierarchies. What goes as attribute? What goes as element? That's what another question, here, amounted to.

If it goes beyond that for a particular dataset, if there are things JSON can do and XML can't, and vice versa, then you definitely have the possibility that something is going to be mishandled or misrepresented upon translation.


 1:52 pm on Mar 18, 2008 (gmt 0)

This is probably very low-brow of me, but... am I the only person who totally misread the title of this thread? :)


 2:06 pm on Mar 18, 2008 (gmt 0)

This is probably very low-brow of me, but... am I the only person who totally misread the title of this thread? :)

No, and I noticed that they left the title exactly the same when they posted it.


 2:25 pm on Mar 18, 2008 (gmt 0)

Well for one thing, JSON has typed objects. A fragment of JSON can be a string, number, object, array, true, false, or null. In XML, everything is a string.

As I was posing the question, I already knew one part of the answer - if I morph JSON into XML and back into JSON, to find a known object in an array I need to apply a "toString()" to the array identifier since after bouncing through XML the JSON array is all keyed by strings, not integers.

To illustrate, run this little script:

var json = {'a':1,'b':'1'}

a is a number. b is a string.

After morphing it into XML, it would look like this:

<a>1</a>
<b>1</b>

We lose the type of those two variables - now they're both strings. Morphing this XML back into JSON would not produce identical JSON to that with which we started, since the type information is lost. The resulting JSON would resemble this:

var json = {'a':'1','b':'1'}
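The type loss described above can be reproduced with a minimal sketch. The two converters below are hypothetical stand-ins, not the functions from the original post: they handle only a flat object of simple values and do no escaping of special characters.

```javascript
// Hypothetical naive converters for a flat object; the names are illustrative only.
function naiveToXML(obj) {
  // Every value is written as element text, so its JavaScript type is dropped.
  return Object.keys(obj)
    .map(function (key) {
      return '<' + key + '>' + String(obj[key]) + '</' + key + '>';
    })
    .join('');
}

function naiveToJSON(xml) {
  // Reads back <name>text</name> pairs; text content always comes back as a string.
  var result = {};
  var pattern = /<(\w+)>([^<]*)<\/\1>/g;
  var match;
  while ((match = pattern.exec(xml)) !== null) {
    result[match[1]] = match[2];
  }
  return result;
}

var original = { a: 1, b: '1' };
var xml = naiveToXML(original);       // '<a>1</a><b>1</b>'
var roundTripped = naiveToJSON(xml);  // { a: '1', b: '1' }

// The number became a string on the way back.
console.log(typeof original.a, typeof roundTripped.a);
```

Nothing in the XML text distinguishes the number 1 from the string '1', so no reader, however clever, could recover the difference from this output alone.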

Yet couldn't fidelity be maintained by forcing your XML to store the typeof() whatever data point it's holding?
<a typeof="number">1</a>
<b typeof="string">1</b>

By doing this I'm admitting that while XML natively doesn't support typed objects, we can eXtend the language to retain a data type. It only requires that the recipient of the data (parser) understands the semantic meaning of a "@typeof" attribute, interprets objects correctly, and forgives the fact that there is this reserved attribute which should not really be interpreted as an attribute... it's a metadatum.
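Here is one way the "typeof" idea might look in code. This is a sketch under stated assumptions: the function names are hypothetical, only flat objects holding numbers, strings, booleans and null are handled, and string values containing "<" or "&" would need escaping that is left out here.

```javascript
// Hypothetical converters that carry the JSON type in a reserved "typeof" attribute.
function typedToXML(obj) {
  return Object.keys(obj)
    .map(function (key) {
      var value = obj[key];
      // typeof null is 'object' in JavaScript, so null gets its own label.
      var type = value === null ? 'null' : typeof value;
      return '<' + key + ' typeof="' + type + '">' + String(value) + '</' + key + '>';
    })
    .join('');
}

function typedToJSON(xml) {
  var result = {};
  var pattern = /<(\w+) typeof="(\w+)">([^<]*)<\/\1>/g;
  var match;
  while ((match = pattern.exec(xml)) !== null) {
    var name = match[1];
    var text = match[3];
    // Use the metadatum to decide how the text should be read back.
    switch (match[2]) {
      case 'number': result[name] = Number(text); break;
      case 'boolean': result[name] = text === 'true'; break;
      case 'null': result[name] = null; break;
      default: result[name] = text;
    }
  }
  return result;
}

var typedXml = typedToXML({ a: 1, b: '1' });
var restored = typedToJSON(typedXml);
console.log(typedXml);
console.log(typeof restored.a, typeof restored.b);
```

The cost is exactly the one named above: the reader has to treat "typeof" as reserved, so a document that wants a real attribute with that name collides with the convention.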

What about using namespaces?

I'm not quite ready to give up on homeomorphism yet, but the typeof problem is bothersome.

The ultimate goal of this exercise is to determine whether two functions, "toJSON()" and "toXML()" can exist, where the following two conditions are always met for any valid value of xml or json:

1) toJSON(toXML(json)) == json
2) toXML(toJSON(xml)) == xml
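Condition 1 can be turned into a small test harness. One caveat when writing it in JavaScript: == on two distinct objects compares identity, so the check needs a structural comparison instead. The sketch below is illustrative only; the "lossy" and "typed" converter pairs are hypothetical and handle a single scalar value.

```javascript
// Structural equality for JSON-style values; == on two distinct objects is always false.
function sameJSON(x, y) {
  // Assumes keys come back in the same order; a stricter check would sort keys first.
  return JSON.stringify(x) === JSON.stringify(y);
}

// Returns the samples for which toJSON(toXML(sample)) differs from the sample.
function failingSamples(samples, toXML, toJSON) {
  return samples.filter(function (sample) {
    return !sameJSON(toJSON(toXML(sample)), sample);
  });
}

// Two hypothetical converter pairs for a single scalar value.
var lossy = {
  toXML: function (v) { return '<v>' + String(v) + '</v>'; },
  toJSON: function (xml) { return /^<v>([^<]*)<\/v>$/.exec(xml)[1]; }
};
var typed = {
  toXML: function (v) { return '<v typeof="' + typeof v + '">' + String(v) + '</v>'; },
  toJSON: function (xml) {
    var m = /^<v typeof="(\w+)">([^<]*)<\/v>$/.exec(xml);
    return m[1] === 'number' ? Number(m[2]) : m[2];
  }
};

var samples = [1, '1', 'fido', 42];
var lossyFailures = failingSamples(samples, lossy.toXML, lossy.toJSON); // the numbers fail
var typedFailures = failingSamples(samples, typed.toXML, typed.toJSON); // nothing fails
console.log(lossyFailures, typedFailures);
```

A harness like this can only show that a mapping fails on some sample; passing on a finite list of samples is evidence, not proof, that the condition holds for every valid value.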

Next I'll take a closer look at arrays in JSON vs. node collections in XML


 2:44 pm on Mar 18, 2008 (gmt 0)

That's the thing about XML. As you pointed out in your birthday posting [webmasterworld.com], one of the reasons XML hasn't needed to be changed is because it is so "vanilla." It can be adapted to many different goals and contexts. JSON is also fairly flexible, but it was designed with a native context in mind: JavaScript Object Notation. That automatically puts restrictions on it. It is also why I like using it in JavaScript. It's "native." (almost. I get sick of using eval() to exercise JSON).

In your example, you have to do a bit more work to enforce types into XML, but is that bad? It seems to me that this is exactly how I would expect XML to work, and exactly why I would choose it as a data interchange language.


 4:48 pm on Mar 18, 2008 (gmt 0)

Reading the ENTIRE heading (including the second line) I'd say this is a post-hoc argument.

Even if XML and JSON are homeomorphic, it does NOT follow that one or the other is redundant.

They have different purposes. Each is optimized for a specific use. And, for 90% of cases, I'd submit that a third format is more appropriate - the simple CSV (comma-separated value) format.

JSON parses significantly faster than XML, owing to its simple structure. Beyond that, on JavaScript platforms, it parses even faster, as JSON from a trusted source can be simply EVALed (executed as JavaScript code).

But both are overkill for the 90% of cases where all we need is a simple table structure. Rows and columns, with an identical, fixed data structure and data types for each row and column. Anybody here really use either JSON or XML for anything other than rows and columns?

I'd submit that CSV is a better choice for most applications. It's easy to create, easy to parse, and requires fewer bytes in transmission than either XML or JSON - where XML = bloated beyond sensibility, JSON = your standard fatty, CSV = slightly overweight. (Though gzip compression largely plasters over the differences.)


 6:06 pm on Mar 18, 2008 (gmt 0)

Anybody here really use either JSON or XML for anything other than rows and columns?

very often. So do you. (hint: "XHTML")

I'd only use CSV (and in practice I rarely, rarely do) if my data is tabular, and limited to 2 dimensions, like myvar[row][col]. JSON and XML easily hold multidimensional data like "myvar[x][y][z][d][c]" (JSON) or "/myvar/x/y/z/d/c" (XPath). CSV can describe multidimensional data, but you'd need to JOIN CSV tables together using keys to produce it (no matter if it's joining multiple tables to each other, or joining a table to itself). If I were going to all that trouble, I'd rather use SQL.


 6:43 pm on Mar 18, 2008 (gmt 0)

Anybody here really use either JSON or XML for anything other than rows and columns?

I use nested hierarchies a lot. Both of these are good for that, and CSV/TSV (I prefer TSV) is useless for that.

I like XML because of Schema. I know that schema is a messy dog, but it's better than DTD, which is better than nothing.

I need a way to be explicit when describing data for SDK-type purposes. This needs a nested hierarchy, with the ability to repeat, etc.

JSON will also allow this, but XML has a semantic framework.

I also use XSLT, and that is only available for XML.


 6:52 pm on Mar 18, 2008 (gmt 0)

very often. So do you. (hint: "XHTML")

I don't use XHTML, for reasons quite well explained elsewhere here.


 7:36 pm on Mar 18, 2008 (gmt 0)


You're talking about the difference between fixed relational tables and flattening that out into a hierarchy. Since you can go from one, and back to the other, those as well are equivalent.

If datatype were important, it could be done in a number of ways, in either XML or JSON. But the key notion is whether all the information is being retained in the translation. And if one asserts that datatype is part of the information, then there you go. It must not be disturbed.

There are system level apps to manipulate XML. But JSON is already script. So there's an advantage. Once either is in the infoset structure, the overhead of tag names and verbosity is gone from XML. And as someone said, if one gets the server to gzip assorted text files, as apparently is easy to do with Apache, perhaps some of the XML overhead goes away, too. There would be much repetition in XML that compression would cure.


 9:02 pm on Mar 18, 2008 (gmt 0)

if one gets the server to gzip assorted text files, as apparently is easy to do with Apache, perhaps some of the XML overhead goes away, too. There would be much repetition in XML that compression would cure

As an example, I just did a job for a client - an Intranet app.

The home page displays the status of 600 devices. (Yes, it's inappropriate to display the status of 600 devices on a home page...)

Originally, with HTML tables, it was an 800K byte download.

With gzip, 40K.

With XML and gzip, 24K (+ 2K for the page)

With JSON and gzip, 14K (+2K for the page)

gzip clearly is the big win here.


 12:26 am on Mar 19, 2008 (gmt 0)



var json = {'a':'1','b':'1'}

This has been suggested as a possible bidirectional mapping, with the limitation noted above.

Another limitation would be repeated tags eg


again extended mapping rules could handle this.

homeomorphism is more a property of the proposed mapping rules than the languages involved. I assume this term means bidirectional lossless mapping?


 11:48 pm on Mar 19, 2008 (gmt 0)

Another limitation would be repeated tags eg

how about:


 2:55 am on Mar 20, 2008 (gmt 0)

or other permutations

do you use an array
- always (to save JavaScript checking each case)
- if schema allows multiples
- if there are multiples

I think the bottom line is that you could produce a set of homeomorphism rules, but the devil is in the detail.


 12:16 pm on Mar 20, 2008 (gmt 0)

daveVk, I think you're right. Here's a start at those rules:

in XML:
* use namespaces or attributes to represent the datatype from JSON
* represent Arrays as multiple elements with the same name

in JSON:
* represent multiple elements of the same name as an array
* look for a reserved attribute to indicate the datatype (number, string, etc)
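The array rules can be sketched as follows. The function names are hypothetical, and the code assumes flat objects whose values are strings or arrays of strings. The dog name "rex" is made up for the example.

```javascript
// Hypothetical converters: a JSON array becomes repeated elements with the same name.
function arraysToXML(obj) {
  return Object.keys(obj)
    .map(function (key) {
      var values = Array.isArray(obj[key]) ? obj[key] : [obj[key]];
      return values
        .map(function (v) { return '<' + key + '>' + v + '</' + key + '>'; })
        .join('');
    })
    .join('');
}

// Repeated elements with the same name are collected back into an array.
function xmlToArrays(xml) {
  var result = {};
  var pattern = /<(\w+)>([^<]*)<\/\1>/g;
  var match;
  while ((match = pattern.exec(xml)) !== null) {
    var name = match[1];
    if (!Object.prototype.hasOwnProperty.call(result, name)) {
      result[name] = match[2];                 // first occurrence: plain value
    } else if (Array.isArray(result[name])) {
      result[name].push(match[2]);             // third and later occurrences
    } else {
      result[name] = [result[name], match[2]]; // second occurrence: promote to array
    }
  }
  return result;
}

var twoDogs = xmlToArrays(arraysToXML({ dog: ['fido', 'rex'] })); // array survives
var oneDog = xmlToArrays(arraysToXML({ dog: ['fido'] }));         // comes back as a plain string
var noDogs = xmlToArrays(arraysToXML({ dog: [] }));               // the key disappears entirely
console.log(twoDogs, oneDog, noDogs);
```

With these two rules alone, arrays of length 0 and 1 do not survive the round trip, which is why the mapping needs either an "always use an array" rule or some outside knowledge of which elements may repeat.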

next example:
how do we change this into JSON, and back again?

<element attribute="attr">innertext</element>

DO we need JSON to know the difference between an attribute and a child node?
somehow this seems kinda kludgy:


 2:07 am on Mar 21, 2008 (gmt 0)


This would require considerable manipulation in js to be useful, defeating the main reason for using json in the first place.

The "natural" ? way to pass a list of dogs in json of size 0, 1, 2 is


A Simplistic conversion to json would be


To achieve natural conversion the schema rule that dog can occur 0 or more times needs to be known.
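As a rough illustration of using that schema rule, the reader below is told which element names may repeat. The `repeatable` parameter is a hypothetical stand-in for what a real schema would say, and "rex" is a made-up second dog.

```javascript
// Hypothetical schema-aware reader; `repeatable` lists the elements allowed 0..n times.
function xmlToJSONWithSchema(xml, repeatable) {
  var result = {};
  // Repeatable elements always become arrays, even when none are present.
  repeatable.forEach(function (name) { result[name] = []; });
  var pattern = /<(\w+)>([^<]*)<\/\1>/g;
  var match;
  while ((match = pattern.exec(xml)) !== null) {
    if (Array.isArray(result[match[1]])) {
      result[match[1]].push(match[2]);
    } else {
      result[match[1]] = match[2];
    }
  }
  return result;
}

var none = xmlToJSONWithSchema('', ['dog']);
var one = xmlToJSONWithSchema('<dog>fido</dog>', ['dog']);
var two = xmlToJSONWithSchema('<dog>fido</dog><dog>rex</dog>', ['dog']);
console.log(none, one, two);
```

The receiving script can then loop over `dog` without first checking whether it got nothing, a single value, or a list.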

<element attribute="attr">innertext</element>

sticking with dogs that becomes

<dog gender="male">fido</dog>

natural mapping => {'name':'fido', 'gender':'male'}

DO we need JSON to know the difference between an attribute and a child node?

In my opinion an attribute is a short notation for a single-valued child. But that won't help if you are working to some xml standard. If the schema is known and used by the mapping rules then NO, otherwise JSON becomes unnatural.
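For comparison, here is a sketch of the schema-free alternative, where the JSON itself records what was an attribute and what was text. The "@" prefix and the "#text" key are an assumed convention chosen for this example, and the function names are hypothetical. Only a single element with text content is handled.

```javascript
// Hypothetical mapping: attributes get an "@" prefix, element text goes under "#text".
function elementToJSON(xml) {
  var m = /^<(\w+)((?:\s+\w+="[^"]*")*)\s*>([^<]*)<\/\1>$/.exec(xml);
  if (!m) throw new Error('unsupported XML: ' + xml);
  var node = { '#text': m[3] };
  var attrPattern = /(\w+)="([^"]*)"/g;
  var attr;
  while ((attr = attrPattern.exec(m[2])) !== null) {
    node['@' + attr[1]] = attr[2];
  }
  var out = {};
  out[m[1]] = node;
  return out;
}

function jsonToElement(obj) {
  var name = Object.keys(obj)[0];
  var node = obj[name];
  // Keys starting with "@" go back to being attributes.
  var attrs = Object.keys(node)
    .filter(function (key) { return key.charAt(0) === '@'; })
    .map(function (key) { return ' ' + key.slice(1) + '="' + node[key] + '"'; })
    .join('');
  return '<' + name + attrs + '>' + node['#text'] + '</' + name + '>';
}

var dogXml = '<dog gender="male">fido</dog>';
var dogJson = elementToJSON(dogXml); // { dog: { '#text': 'fido', '@gender': 'male' } }
var dogAgain = jsonToElement(dogJson);
console.log(JSON.stringify(dogJson), dogAgain === dogXml);
```

This round-trips the example, but the script now has to read `dog['@gender']` and `dog['#text']` instead of `gender` and `name`, which is the "unnatural" JSON referred to above.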


 2:46 pm on Apr 4, 2008 (gmt 0)

This thread is a few weeks old, so it's time to assert some conclusions.

JSON and XML are homeomorphic. In much the same way as a donut and a coffee cup are homeomorphic, you can change any XML into JSON, and you can change any JSON into XML. It would be possible to translate any XML into JSON and back, provided the translation obeys a strict mapping. Vice-versa for JSON->XML.

But just as you oughtn't try to drink out of a donut, you oughtn't think that homeomorphism implies identicality or congruence. There is a role for XML and a role for JSON, tasks at which each excels, and each has its own suite of diverse supporting technologies.

The rules for converting any XML into JSON losslessly result in some fugly but valid JSON, in which the JSON requires some extra structural elements that are implicit but not textually present in the XML.

The rules for converting any JSON into XML losslessly result in some fugly but valid XML, in which the XML requires some extra structural elements that are implicit but not textually present in the JSON.

Some of the intricacies of that mapping are discussed above, but the challenges are surmountable. I for one consider the matter closed.


 1:11 am on Apr 5, 2008 (gmt 0)

Agree with your conclusions; the fugly you mention can largely be removed if the mapping (XML + XML Schema) <--> JSON is used.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved