SQL for JSON Rationalization Part 13: Comparison Operators for JSON Object and JSON Array

As promised in an earlier installment, this blog discusses the comparison operators in the context of JSON object and JSON array.

Comparison Operators = and <>

Equality and inequality are very straightforward comparison operators and are discussed first. Both are defined on the paths to properties as well as the JSON types of properties.

JSON array equality is (recursively) defined as follows. Two JSON arrays are equal if

  • They have the same number of array indexes starting at index 0
  • The value of each array element is equal for the same index in each of the two JSON arrays

Implicitly this means that order matters in the sense that array elements are compared according to their index position.

JSON object equality is (recursively) defined as follows. Two JSON objects are equal if

  • They have the exact same set of paths
  • The same path in each document leads to the same value and the same JSON type

Implicitly this means that the order of properties in JSON objects does not matter. It is “only” necessary that both objects have the exact same set of paths in any order.

There is no implicit type conversion supported in JSON SQL. The JSON string “15” is considered different from the JSON number 15 as both are of different JSON type.
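The two recursive definitions above can be sketched in a few lines of Python. This is only a model of the rules as stated in this blog, not the JSON SQL implementation; the function name json_equal is made up for illustration, and documents are assumed to be parsed into Python values (dict, list, str, int/float, bool, None).

```python
def json_equal(left, right):
    """Recursive equality following the array and object rules above (a sketch)."""
    # No implicit type conversion: the JSON types have to match first.
    # bool is handled explicitly because Python itself treats True == 1.
    if isinstance(left, bool) or isinstance(right, bool):
        return isinstance(left, bool) and isinstance(right, bool) and left == right
    if left is None or right is None:
        return left is None and right is None
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left == right
    if isinstance(left, str) and isinstance(right, str):
        return left == right
    if isinstance(left, list) and isinstance(right, list):
        # Same number of indexes, and equal values index by index.
        return len(left) == len(right) and all(
            json_equal(l, r) for l, r in zip(left, right)
        )
    if isinstance(left, dict) and isinstance(right, dict):
        # Same set of property names in any order, equal values per name.
        return left.keys() == right.keys() and all(
            json_equal(left[name], right[name]) for name in left
        )
    return False  # different JSON types are never equal


print(json_equal([15, True, {"p": "q"}], [15, True, {"p": "q"}]))  # True
print(json_equal(15, "15"))                                        # False
print(json_equal({"x": 8, "y": 9}, {"y": 9, "x": 8}))              # True
```

The explicit type checks are the point of the sketch: a plain Python == would accept 1 = true, which the rule about JSON types rules out.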

Sample Data Set

To illustrate equality the following collection compColl is introduced:

{"a":[15,true,{"p":"q"}],"b":[15,true,{"p":"q"}]}
{"a":[15,true,{"p":"q"}],"b":[15,true,{"p":"q"},null]}
{"a":[15,true,{"p":"q"}],"b":["15",true,{"p":"q"}]}
{"x":{"r":15,"s":[true,false]},"z":{"r":15,"s":[true,false]}}
{"x":{"r":15,"s":[true,false]},"z":{"r":15,"s":[[true,false]]}}
{"x":{"r":15,"s":[true,false]},"z":{"r":"15","s":[true,false]}}
{"e":15,"f":[14,15,16]}
{"e":15,"f":[16,15]}
{"e":15}

Sample Queries

An example query for equal JSON arrays is as follows.

select {*} from compColl where a = b

returns

{"a":[15,true,{"p":"q"}],"b":[15,true,{"p":"q"}]}

An example query for equal JSON objects is as follows.

select {*} from compColl where x = z

returns

{"x":{"r":15,"s":[true,false]},"z":{"r":15,"s":[true,false]}}

Inequality is defined as negation of equality. The following queries demonstrate this:

select {*} from compColl where a <> b

returns

{"a":[15,true,{"p":"q"}],"b":[15,true,{"p":"q"},null]}
{"a":[15,true,{"p":"q"}],"b":["15",true,{"p":"q"}]}

select {*} from compColl where x <> z

returns

{"x":{"r":15,"s":[true,false]},"z":{"r":15,"s":[[true,false]]}}
{"x":{"r":15,"s":[true,false]},"z":{"r":"15","s":[true,false]}}

Undefined Comparison Operators <, >, <= and >=

Several comparison operators are undefined for JSON array and JSON object: <, >, <= and >=. If during query processing these comparison operators are used in combination with JSON array and/or JSON object, then the JSON documents will not participate in the comparison and will not add any result document to the result set.

The following query demonstrates that only like JSON types are compared:

select {*} from compColl where a.[0] <= b.[0]

returns

{"a":[15,true,{"p":"q"}],"b":[15,true,{"p":"q"}]}
{"a":[15,true,{"p":"q"}],"b":[15,true,{"p":"q"},null]}

The following query demonstrates that the <= comparison on JSON array is not defined:

select {*} from compColl where a <= b

returns the empty result.

The reason that those four comparison operators are not implemented is that not all JSON types can be compared with each other. For example, a JSON Boolean and a JSON number cannot be compared, and consequently the comparison of a JSON array or JSON object might fail and return an undefined result. In fact, across all JSON types, only JSON string can be compared to JSON string and JSON number to JSON number by >, <, <= and >=; all other JSON types cannot be compared with each other or with other JSON types using these operators.

In the context of query processing a failing comparison operator would not be desirable as the query would fail. As a consequence, JSON SQL does not implement the four comparison operations <, >, <= and >= on JSON array and JSON object (actually, on any JSON type except JSON number and JSON string).
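The "does not participate" behavior can be modeled in Python with a comparison that has three outcomes: true, false, or undefined. The sketch below is an illustration under assumptions, not the engine's code: the names ordered_compare and restrict are hypothetical, and strings are ordered by Python's default code-point comparison because this blog does not state which collation JSON SQL uses.

```python
import operator

OPS = {"<": operator.lt, ">": operator.gt, "<=": operator.le, ">=": operator.ge}


def is_number(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def ordered_compare(op, left, right):
    """Return True or False when the comparison is defined, otherwise None."""
    if is_number(left) and is_number(right):
        return OPS[op](left, right)
    if isinstance(left, str) and isinstance(right, str):
        return OPS[op](left, right)  # assumption: code-point ordering
    return None  # undefined: the document does not participate


def restrict(documents, op, left_of, right_of):
    """Keep only documents whose comparison is both defined and true."""
    return [d for d in documents
            if ordered_compare(op, left_of(d), right_of(d)) is True]


docs = [
    {"a": [15, True, {"p": "q"}], "b": [15, True, {"p": "q"}]},
    {"a": [15, True, {"p": "q"}], "b": [15, True, {"p": "q"}, None]},
    {"a": [15, True, {"p": "q"}], "b": ["15", True, {"p": "q"}]},
]
by_first_element = restrict(docs, "<=", lambda d: d["a"][0], lambda d: d["b"][0])
by_whole_array = restrict(docs, "<=", lambda d: d["a"], lambda d: d["b"])
print(len(by_first_element))  # 2
print(by_whole_array)         # []
```

Returning None instead of raising an error mirrors the design choice described above: an undefined comparison silently drops the document rather than failing the query.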

However, a user can compare JSON arrays and JSON objects by comparing their array elements or properties individually where applicable or necessary. This is called user-defined comparison and is based on individual restrictions.

User-defined Comparison

A user defines comparison by means of predicates. This supports the user in comparing only those JSON array elements or JSON object properties that need to be compared for the use case at hand and that make sense in this context: a user is not forced to compare all JSON array elements or all JSON object properties but can do so selectively.

select {*} from compColl where a.[2].p >= b.[2].p

returns

{"a":[15,true,{"p":"q"}],"b":[15,true,{"p":"q"}]}
{"a":[15,true,{"p":"q"}],"b":[15,true,{"p":"q"},null]}
{"a":[15,true,{"p":"q"}],"b":["15",true,{"p":"q"}]}

As the example shows, there are JSON array elements or JSON object properties that cannot be compared, e.g., a.[2] or b.[2] (except for equal and not-equal).

Since JSON SQL supports JSON documents with varying schema, a user can ensure the presence and JSON type of certain properties that are relevant for comparison with the predicates exists_path and is_of_type. The former ensures the presence, the latter type compatibility.

The following query shows all documents that have a property e but no path f.[1] (either f has no second array element or f is missing altogether). If a query compares e with f.[1], these are the documents that will not participate in it.

select {*} 
from   compColl 
where  exists_path e 
       and not exists_path f.[1]

returns

{"e":15}

The analogous check is possible with the is_of_type predicate, which would show which documents are excluded because of type incompatibility.
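To make the two predicates concrete, here is a small Python model of them. It is a sketch under assumptions: the helper resolve, the MISSING sentinel and the type names ("number", "string", and so on) are invented for illustration, paths are assumed to use the a.[2].p notation from the queries above, and property names that themselves contain a dot are not handled.

```python
import re

MISSING = object()  # sentinel: the path does not exist in the document


def resolve(document, path):
    """Follow a path such as 'a.[2].p'; return MISSING if any step is absent."""
    current = document
    for step in path.split("."):
        index = re.fullmatch(r"\[(\d+)\]", step)
        if index:
            position = int(index.group(1))
            if not isinstance(current, list) or position >= len(current):
                return MISSING
            current = current[position]
        else:
            if not isinstance(current, dict) or step not in current:
                return MISSING
            current = current[step]
    return current


def json_type(value):
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool is an int in Python
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"


def exists_path(document, path):
    return resolve(document, path) is not MISSING


def is_of_type(document, path, type_name):
    value = resolve(document, path)
    return value is not MISSING and json_type(value) == type_name


comp_coll = [
    {"a": [15, True, {"p": "q"}], "b": ["15", True, {"p": "q"}]},
    {"e": 15, "f": [14, 15, 16]},
    {"e": 15, "f": [16, 15]},
    {"e": 15},
]
excluded = [d for d in comp_coll
            if exists_path(d, "e") and not exists_path(d, "f.[1]")]
print(excluded)  # [{'e': 15}]
```

A type check works the same way: is_of_type(comp_coll[0], "b.[0]", "number") is false in this model because b.[0] is the string "15".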

Missing Paths

As the example shows, only those JSON documents that fulfill the constraints with respect to existence and type compatibility participate in the comparison.

A missing path does not falsify the result; the document simply does not participate in the comparison, as the following example shows.

The query

select {*} from compColl where e = f.[1]

returns

{"e":15,"f":[14,15,16]}
{"e":15,"f":[16,15]}

Total Order

A total order across all documents in a collection is only possible if each document can be compared with every other document in the same collection. With the predicates exists_path and is_of_type it is possible to determine whether any documents will be left out of a comparison, in which case the documents of the collection cannot be totally ordered with the given predicates.
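As a rough illustration of that check, the following Python sketch tests whether a collection could be totally ordered on a single property. It is simplified on purpose: the function names are hypothetical, only top-level property names are considered rather than full paths, and "comparable" is taken to mean that all values are JSON numbers or all are JSON strings, as described above.

```python
def orderable_type(value):
    """Return 'number' or 'string' for types that <, >, <=, >= accept, else None."""
    if isinstance(value, bool):  # bool is an int in Python, but not a JSON number
        return None
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return None


def can_totally_order(documents, name):
    """True if every document has the property and all values share one orderable type."""
    kinds = set()
    for document in documents:
        if name not in document:          # the exists_path part of the check
            return False
        kind = orderable_type(document[name])
        if kind is None:                  # the is_of_type part of the check
            return False
        kinds.add(kind)
    return len(kinds) <= 1


docs = [{"e": 15, "f": [14, 15, 16]}, {"e": 15, "f": [16, 15]}, {"e": 15}]
print(can_totally_order(docs, "e"))  # True
print(can_totally_order(docs, "f"))  # False
```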

Summary

Even though the operators >, <, >= and <= cannot be implemented for several JSON types, clients can implement partial comparison of documents with combinations of individual restrictions. The predicates exists_path and is_of_type make it possible to determine the set of documents included in (or excluded from) the query.

Go [ JSON | Relational ] SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

SQL for JSON Rationalization Part 8: Restriction – Objects

This installment reviews restriction in JSON SQL based on JSON object literals (all other JSON types except JSON array have been discussed in previous blogs).

JSON Object Notation

JSON SQL follows the JSON object notation as defined in the JSON standard. An empty JSON object is denoted as {} and a non-empty JSON object has one or more comma-separated pairs (a pair, also referred to as a property, is a string and a value of any JSON type separated by a colon).

A JSON object literal is either an empty JSON object or a non-empty JSON object. A JSON object literal is not enclosed in quotes. The only JSON literal enclosed in quotes is JSON string. If a JSON object is enclosed in quotes then it is not a JSON object, but a JSON string.

Sample Document Set

The following document set is used in this blog and the documents are stored in a collection called “objectColl”.

select {*} from objectColl

results in

{"one": {"a": 1}}
{"one": "{\"a\": 1}"}
{"three": {"b": {"c": null}}}
{"four": {"x": 8, "y": 9}}
{"five": {}}

Restriction based on JSON Object Literal

Starting with the empty JSON object literal, the following two queries produce the same result.

select {*} from objectColl where five = {}

and

select {*} from objectColl where {} = five

result in

{"five": {}}

In the following, queries show the JSON object literal on the right side of the operator; however, it can be on either side.

Operators = And <>

The operators = and <> are defined for a JSON object literal. JSON SQL regards two JSON objects as equal if both have the same pairs (recursively), in any order; and not equal otherwise.

The query (restriction using JSON object literal)

select {*} from objectColl where one = {"a": 1}

returns

{"one": {"a": 1}}

The query (restriction using JSON string literal(!))

select {*} from objectColl where one = '{"a": 1}'

returns

{"one": "{\"a\": 1}"}
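The difference between the two literals is easy to reproduce with Python's standard json module, which is used here only as a stand-in for how a JSON parser sees the two documents; it says nothing about how JSON SQL evaluates the restriction internally. The variable names are made up for illustration.

```python
import json

# The first two documents of objectColl, parsed from their JSON text.
object_coll = [
    json.loads(r'{"one": {"a": 1}}'),
    json.loads(r'{"one": "{\"a\": 1}"}'),
]

object_literal = {"a": 1}      # corresponds to  one = {"a": 1}
string_literal = '{"a": 1}'    # corresponds to  one = '{"a": 1}'

matches_object = [d for d in object_coll if d.get("one") == object_literal]
matches_string = [d for d in object_coll if d.get("one") == string_literal]

print(type(object_coll[0]["one"]).__name__)  # dict
print(type(object_coll[1]["one"]).__name__)  # str
print(len(matches_object), len(matches_string))  # 1 1
```

Each literal matches exactly one document: the quoted form is a JSON string that merely looks like an object.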

A restriction can reach into the JSON object as well using the path notation. The query

select {*} from objectColl where three.b = {"c": null}

returns

{"three": {"b": {"c": null}}}

Operators <, >, <= And >=

The operators <, >, <= and >= could be defined recursively for convenience with some restrictions. For example, a JSON object could be considered less than another JSON object if both have the same property names and if the value of each property in the first object is less than the value of the corresponding property in the second.

However, JSON true, JSON false and JSON null would not be able to participate in the operator <, >, <= or >=, only JSON object, JSON array, JSON number and JSON string.

Those four operators are currently not directly implemented in JSON SQL since it is possible to achieve the same by writing a complex conjunctive restriction (details on this approach will be discussed in a subsequent blog as well as strategies of what to do if any of JSON true, JSON false or JSON null are present).
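To show what such a recursive definition might look like, here is a Python sketch of a hypothetical "less than" for JSON values. None of this is implemented in JSON SQL. The function name is invented, the element-by-element treatment of arrays and the handling of empty objects and arrays as undefined are my own assumptions, and strings use Python's code-point ordering.

```python
def is_number(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def json_less_than(left, right):
    """True or False when defined; None when the operands cannot be ordered."""
    if is_number(left) and is_number(right):
        return left < right
    if isinstance(left, str) and isinstance(right, str):
        return left < right
    if isinstance(left, dict) and isinstance(right, dict) and left.keys() == right.keys():
        pairs = [(left[name], right[name]) for name in left]
    elif isinstance(left, list) and isinstance(right, list) and len(left) == len(right):
        pairs = list(zip(left, right))  # assumption: arrays compare index by index
    else:
        return None  # true, false, null, mixed types, or different shapes
    results = [json_less_than(l, r) for l, r in pairs]
    if not results or None in results:
        return None  # assumption: empty containers are undefined; undefined propagates
    return all(results)


print(json_less_than({"x": 8, "y": 9}, {"y": 10, "x": 9}))  # True
print(json_less_than({"x": 8, "y": 9}, {"x": 8, "y": 10}))  # False
print(json_less_than({"c": None}, {"c": None}))             # None
```

The same result for two flat objects can be written today as a conjunction of individual restrictions on each property, which is the approach referred to above.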

Canonical Interpretation

The order of the pairs inside a JSON object is not significant (according to the JSON standard). The query

select {*} from objectColl where four = {"y": 9, "x": 8}

therefore results in

{"four": {"x": 8, "y": 9}}
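One way to picture this canonical interpretation is to serialize both objects with their property names sorted and compare the results. The Python sketch below does that with the standard json module; the helper name canonical is made up, and this is an illustration of the idea rather than a description of how JSON SQL stores or compares objects.

```python
import json


def canonical(value):
    """Serialize with sorted property names so that pair order cannot matter."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


stored = json.loads('{"x": 8, "y": 9}')
literal = json.loads('{"y": 9, "x": 8}')

print(canonical(stored))                        # {"x":8,"y":9}
print(canonical(stored) == canonical(literal))  # True
print(canonical([8, 9]) == canonical([9, 8]))   # False
```

Sorting applies to property names only; array elements keep their positions, so array order stays significant.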

Summary

Restriction by JSON object is provided by JSON SQL without problem and the syntax extends the Relational SQL syntax naturally.

Go [ JSON | Relational ] SQL!
