SQL for JSON and Schema Support (Part 2): Where does the “Interesting” Code go?

The previous blog found that the “generic” indirect representation of JSON data is one way of supporting “schema-free” JSON objects or documents. Where does the “interesting” functional code live?

Indirect Representation

To recap, the indirect representation is a set of classes, functions, etc. (depending on programming language) that can manage JSON objects or JSON documents. All or most languages have libraries supporting JSON manipulation. For example, Jackson is such a library for Java.

These JSON libraries can manage any valid JSON structure, and they do not require a schema or the JSON objects being homogeneous. Two JSON objects representing the same concept like an order with different attributes (as shown in the previous blog) can be managed by such JSON libraries.

Structural Manipulation

Structural manipulation of JSON objects supports the addition, update or deletion of properties (members) as well as JSON array elements. Property values can be replaced, for example, a JSON string with a JSON object.

Through structural manipulation it is possible to change a JSON object as needed, when e.g. new details appear in form of additional properties.

Structural manipulation was demonstrated in a database context in the last blog: properties were added through the update statement. The same is possible in the indirect representation libraries in the various programming languages.

Computation

Structural manipulation is not the only code that is required as structural manipulation does not allow to compute any specific application semantics. For example, in context of orders, the total value of not yet shipped orders might be a value that needs to be computed.

In a database context this would be an aggregation query that sums up the amount of all orders that do not have the status of shipped.

In context of a programming language it would require a function that iterates through all orders and, like in the database aggregation approach, adds up the sum of those orders that have not shipped yet.

It probably would be implemented as a set of cooperating functions, like

DollarAmount getValueOfOrdersNotShipped(JSONArray orders)
boolean hasOrderShipped(JSONObject order)
DollarAmount getValueOfOrder(JSONObject order)

JSONArray as well as JSONObject are an example of an indirect representation holding order data as a JSON structure.

Note: of course, in the absence of a schema (which is assumed here), there is no assurance that the JSONArray or the JSONObject contain only orders or that the orders are homogeneous in structure. There has to be “trust” that this is indeed the case.

If validation is desired, and if no schema is available, then the only alternative is validating values in one or more JSON object properties. For example, order identifiers might be of a specific structure that uniquely identifies an identifier being an order identifier. This would require trust that the algorithms creating identifiers are correct.

Separation of Manipulation and Computation

The JSON libraries supporting the indirect representation are separate from the functional code (like the summing up of order values). The software architecture and design has to structure this separation and ideally ensures that all functions concerned with orders are “close” from a code structure or software architecture perspective.

There might be functions that can be reused across different concepts (like orders, returns, shipments, etc.), and they can be refactored out, of course, as in “normal” functional code.

Given the above rationalization, how does the absence of a schema come into the picture?

Implication of Schema Free JSON Objects

Since there is no schema, JSON objects can have a different structure even though they represent the same concepts. In context of orders,  let’s look at two use cases:

  • An order does not have a shipping status
  • An order does have a value but in a variety of data types

In a world without schema these are possible use cases and the functional code needs to check for those.

Addressing the first use case can be accomplished by checking for existence. Code can check if a property is present and react accordingly. In the above example, the code designer can choose to have hasOrderShipped() return false or throw an error in case there is no shipping status.

The second use case can be addressed by checking for the type of the value of the order. If possible, value transformations can be implemented in getValueOfOrder(), e.g., string to number; if it is not possible to transform, an error can be thrown.

Summary

In a schema free JSON context there are several aspects from a code perspective: functional code implementing application semantics is separate from the code that manages the structure of JSON objects. That separation must be carefully managed from an architectural perspective.

The functional code must anticipate non-homogeneous JSON objects and check for variation in order to be able to implement the functionality accurately.

But wait, there is more:-) The next blog will venture into more nuances.

Go [ JSON | Relational ] SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s