SQL for JSON Rationalization Part 1: Setting the Scene

Is it possible to technically rationalize SQL for JSON, aka, extend the SQL syntax and semantics to support JSON structures without flame wars during discussions? Let’s see.

Goal: Extend SQL to Support JSON Data Structures

The goal of the forthcoming blog series is to extend SQL with support for JSON data structures (aka, “JSON documents” or simply “documents”).

There is no consensus agreement in the community that this is the way to go at all; many dispute that extending SQL is the right direction and instead propose a different approach for querying JSON documents like specialized query APIs, Object-Relational-Mapper technology, representation of queries as JSON documents themselves, to name a few alternative proposals.

This is fine, as everybody is free to discuss their point of view in their favorite forum – no issues from my side:-)

However, this blog series is about extending SQL with support for JSON data structures and there is no justification necessary as this is a purely technical discussion. The attempt is made to impact SQL as little as possible (minimalist approach) while providing full JSON support.

Approaches to Extending SQL

There are different starting points of how to extend SQL to support JSON data structures:

  • start with a non-SQL approach and extend/adjust to SQL along the way
  • start with SQL in the first place

The latter approach is taken here. Given a SQL syntax (not necessarily a specific syntax from a specific standard or a specific system implementation) the discussion will revolve around SQL syntax and SQL semantics extension.

However, the discussion will not revolve around efficiency of SQL execution at this point and what the added cost of JSON data structure support is, if any. This blog series is laser-focused on syntax and semantics (only).

Direction From Here

The direction from here is (a) forward and (b) along the various SQL operations like for example

  • Projection
  • Selection
  • Cartesian Product
  • Join

Organizing the discussion along the SQL syntax and semantics is the framework for the discussion about the extension to support JSON data structures.

In addition, the JSON definition itself will be discussed and some missing pieces will be rationalized that are required in context of SQL. Operations or semantics that are not defined by the JSON standard are introduced as necessary, like for example

  • the starting index of an array (zero or one?)
  • the definition of normalization and equality
  • the difference between JSON null and that absence of a value (or absence of a pair)

Finally, the JSON terminology is incomplete and missing parts will be fit in for the purpose of this blog, for example

  • to name the string and value in a JSON pair
  • to name the elements of an array

This is not meant to be a proposal for refining the current JSON definition (in or outside of a standard). JSON is taken as is even though many wish to change it; however, in order to focus the discussion, no changes to JSON will be directly proposed.

Summary

Extending SQL “properly” to support JSON is going to be fun and I am looking forward to interesting discussions along the ride.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Advertisement