SQL for JSON Rationalization Part 2: Is JSON SQL Superset of Relational SQL?

This is not really a precise question as it does not distinguish data model from query language – a distinction that it probably intended to make; so let’s rationalize both.

Table – Row

A table is defined in its basic form by columns and their types (domains), with columns having unique names within a table definition. The contents of a table are zero or more rows, whereby each row has to have for each column either a value (matching the column’s domain) or SQL NULL (indicating “unknown”); no deviation from this structure is possible, neither from the type system.

Collection – Document

In the JSON database world, JSON structures are called documents. Documents are in general grouped together in distinct groups and the grouping mechanism is a named collection. A collection can contain zero or more documents. Each document belongs to exactly one collection.

A collection in general does not impose a constraint on the document structure of its documents, so the whole range from all documents having the same structure to each document having a different structure is possible. That documents in a collection can have different structures is called “schema-free”, or more appropriately “implicit schema-per-document” instead of “schema-per-collection”.

Note: the structure of a document is the set of all paths from the root of the document to the leaves of the document. This definition allows a structural comparison: two documents having the same set of paths are structurally equivalent (or, are of the same schema if there was a schema definition).

Relational SQL

The term “Relational SQL” refers to SQL executing against tables. For example, “select * from abc” selects all rows from the table “abc”, which each row having values for each column of the table.

The query “select a, b from abc” would select all rows from the table “abc”, however, only the columns “a” and “b” within each row (projection).

JSON SQL

The term “JSON SQL” refers to SQL executing against collections. For example, in analogy to the Relational SQL query above, “select {*} from def” selects documents (indicated by the “{}”), and within each document all properties (indicated by “*”) from the collection “def”.

The query “select {a, b} from def” would select all objects from the collection “def”, however, only the properties “a” and “b”, in case these are present (more precisely, all paths starting with either “a” or “b” – this is one possible definition of projection in JSON SQL). If none of them is present in a document, then the result is the empty document. If only one property is present, the document only contains the one property.

For example, assume the collection “def” consisting of two documents:

{"a": 1, "b": 2, "c": 3}
{"d": 4, "e": 5, "f": 6}

The query “select {a, b} from def” would return

{"a": 1, "b": 2}
{}

JSON SQL with Table Output

What would the query “select a, b from def” return? This looks like a Relational SQL query, however, querying a collection of documents. The interpretation is to select the value of the property “a” and “b” from each document, however, not return a document, but a structure similar to a row with values only.

For example, “select a, b from def” applied to the above collection it would return

| a | b |
+---+---+
| 1 | 2 |
|   |   |

The first object would contribute the values “1” and “2”, the second object would not contribute values at all, so an empty row is returned (aka, a row where each column is empty and not SQL NULL as SQL NULL is not a valid value in the JSON structure definition).

Of course, the above example has scalar values for “a” and “b”; non-scalar values are supported as well, without restriction. It is therefore possible to have any JSON type in a column as value.

This small example shows that it is possible to not only return documents as a result of a JSON SQL query, but also tables (set of rows).

Collection with Schema Constraint

It is possible, at least conceptually, to enforce constraints on the documents of a collection with the interpretation that each document in the collection must satisfy the constraint. One possible constraint is that all document must have the exact same structure (schema) and only the values can be different.

For example, a constraint in context of collection “xyz” could be that all documents can only have top level properties with names “p” and “q” (paths of length 1), and the type of “p” must be a string, and the type of “q” must be an integer. Basically this constraint enforces a specific document schema on every document in the collection.

The collection “xyz” at a given point in time could have two documents:

{"p": "today", "q": 68}
{"p": "yesterday", "q": 71} 

Basically, this constraint enforces “flat” documents with scalar types, and this can be interpreted as the equivalent to rows of a table with two columns “p” and “q” of type string and integer.

Q.E.D.

With the appropriate constraint in place (aka, forcing “flat” documents of scalar property type), a collection can be interpreted as table and the documents as rows. Therefore it is possible to provide the equivalent of tables and the equivalent of Relational SQL in context of collections of documents supported by JSON SQL.

However, this is possible because of a constraint. Lifting that constraint allows documents to have any structure, and to have a representation beyond the table model. Therefore, JSON SQL on document collections is a superset of Relational SQL on tables.

Summary

In this blog it was shown that a constraint document environment can be interpreted as tables and JSON SQL can be used to accomplish the same functionality as Relational SQL. By lifting the constraint it follows that JSON SQL based on document collections is a superset of Relational SQL based on tables. The answer to the initial question is therefore “YES”.

The intriguing question now is, what will JSON SQL have to provide beyond Relational SQL in order to fully support the JSON document model? Future blogs will make an attempt to address this question in a structured and rationalized approach.

Go [ JSON | Relational ] SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Advertisements