JSON is strictly By-Value

JSON is an externalization format, not a programming language data type implementation. Why is this relevant?

JSON

JSON (http://www.json.org) is an ASCII representation of data. It provides base data type representations and a grammar about how to structure JSON structures properly.

As for terminology, here are some synonyms often used:

  • JSON “object”: document
  • JSON “members”: properties
  • JSON “pair”: property
  • JSON “string” in “pair”: property name
  • JSON “value” in “pair”: property value

JSON is Syntax

It is syntax only. There is not data type semantics attached to it; all programming languages that process JSON define their own interpretation of the meaning of the syntax JSON defines.

JSON Semantics as such is Undefined

The semantics of JSON is not defined by JSON itself. It is undefined by the standard. Programming languages and databases have to define their (!) semantics of JSON.

For example, the JSON standard does not make a statement about unique property names in a document. In JSON terminology, an object (aka document) can have several pairs (aka properties), each having a string and a value (aka property name and property value). JSON does not constrain the property names to be unique. According to JSON it would be valid for a document to have several properties with the same name.

JSON Semantics Implementation

How is this implemented? Do systems actually allow several properties with the same name in a document? Let’s look at two of those.

MongoDB (version 2.0.2): it is possible to save a document that has the same property twice:

MongoDB shell version: 2.0.2
connecting to: test
> use json
switched to db json
> db.test.save({"a":1, "a":2})
> db.test.findOne()
{ "_id" : ObjectId("5006d58e92cca1a32772df6a"), "a" : 2 }
>

So, clearly MongoDB does not complain about the fact that a property name is stated twice. However, it opts to make a selection itself and uses the second of the two and actually stores only one. Of course, this bears the question if MongoDB on input applies other modifications as well.

JsonLint (http://jsonlint.org/) exposes the same behavior. When validating {“a”:1, “a”:2} then two things are happening: it deems the input as ‘Valid JSON’, but then it modifies the input to only {“a”:2}.  This is bad in two ways. First, it is not clear which version is ‘valid’, and second, it is not a read-only lint! This means that if you paste in a JSON document, always check if JsonLint modified it; your document might not be valid, but JsonLint’s modification of it.

So it seems like that some systems actually do not support having two properties with the same name. But then I would suggest they flag this as error and not modify the document.

JSON and References

How does a JSON document refer to a second one? For example, an object representing a user referring to an object representing an address?

JSON does not have the notion of reference or pointer nor the notion of address or unique identifier (both necessary to make referencing work). In order to uniquely identify a document, the author of the document has to add a property that by convention is deemed to be a unique identifier (see e.g. MongoDB above, the database adds a ‘_id’ property). Secondly, a reference to such a unique identifier is a regular property that by convention has to be understood as being a reference. For example, the value 5006d58e92cca1a32772df6a above could be a value. By convention the interpretation of this property would know that the number is actually a unique identifier.

Side note: this is the same situation relational databases are in, they don’t have references either. But, in their case there is the concept of key and foreign key supervised by the database, both relying on data values. Except for the supervision functionality, JSON could be interpreted this way, too.

Coming back to the semantics for a moment. Assume a property “ref” is used to indicate that its value is a reference to another document. How is the absence of link established, meaning, the notion that a given object does not refer to another one? On possible way is to set “ref” to the value null. Alternatively “ref” could be left out. Either (or both ways) are possible, but that needs to be established.

JSON and Linked Structures

In summary, in order to establish linked structures using JSON, special properties have to be called out in order to establish addressing and referencing. JSON itself does not provide concepts for this. In addition, the JSON semantics and interpretation has to be established (either explicit by rules or implicit by the programming code).

Engineers and architectus be aware: JSON is an ASCII syntax for externalization, not a programming language data type system.

Advertisements

One thought on “JSON is strictly By-Value

  1. Pingback: JSON Type Extensions | r e a l – p r o g r a m m e r

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s