JSON structures do not have to follow a global schema in Oracle 12c; each JSON structure in a JSON column can have its own schema, with varying properties. This blog discusses the function JSON_EXISTS() that checks the existence of a property in a JSON structure and its application in SQL queries.
Sample Data Set
A sample data set for this blog is as follows. Note that not all JSON structures follow the same schema.
DROP TABLE people; CREATE TABLE people ( id NUMBER, person CLOB CONSTRAINT person_ensure_json CHECK (person IS JSON (STRICT WITH UNIQUE KEYS))); INSERT INTO people VALUES (1, '{ "name": "Bob", "city": "SF"}'); INSERT INTO people VALUES ( 2, '{ "name": "Jake", "city": "PA"}'); INSERT INTO people VALUES ( 3, '{ "name": "Alice", "city": "NYC"}'); INSERT INTO people VALUES ( 4, '{ "name": "Jenn", "city": "Tokyo"}'); INSERT INTO people VALUES ( 5, '{ "name": "Frank", "city": "Berlin"}'); INSERT INTO people VALUES ( 6, '{ "name": "Jup", "city": "Paris"}'); INSERT INTO people VALUES ( 7, '{ "name": "Jib"}');
DROP TABLE cities; CREATE TABLE cities ( id NUMBER, city CLOB CONSTRAINT city_ensure_json CHECK (city IS JSON (STRICT WITH UNIQUE KEYS))); INSERT INTO cities VALUES (101, '{"city": {"name": "SF", "population": 837442, "area": 121.4}, "state": "CA", "country": "US"}'); INSERT INTO cities VALUES (102, '{"city": {"name": "PA", "population": 64403, "area": 66.787}, "state": "CA", "country": "US"}'); INSERT INTO cities VALUES (103, '{"city": {"name": "NYC", "population": 8405837, "area": 783.8}, "state": "NY", "country": "US"}'); INSERT INTO cities VALUES (104, '{"city": {"name": "Tokyo", "population": 9071577, "area": 622.99}, "state": null, "country": "Japan"}'); INSERT INTO cities VALUES (105, '{"city": {"name": "Paris", "population": 10869000, "area": 2845}, "state": null, "country": "France"}'); INSERT INTO cities VALUES (106, '{"city": {"name": "Berlin", "population": 3517424, "area": 891.75}, "state": null, "country": "Germany"}'); INSERT INTO cities VALUES (107, '{"city": {"name": "Asilomar", "population": 100, "area": 2}}');
Absent JSON Property
The default SQL query behavior for absent JSON properties is discussed briefly first in order to establish the base semantics.
- Projection
- Absent JSON properties are returned as SQL NULL
- SELECT DISTINCT c.city.country FROM cities c;
- Absent properties can therefore be checked for by using SQL NULL in a predicate
- SELECT DISTINCT c.city.city.name, c.city.country FROM cities c WHERE c.city.country IS NULL;
- Selection
- Absent JSON properties do not match in predicates
- SELECT DISTINCT c.city.city.name FROM cities c WHERE c.city.country = ‘US’;
- Cartesian Product (join)
- Absent JSON properties do not contribute to the cartesian product
- SELECT cl.city.city.name, cr.city.city.name FROM cities cl, cities cr WHERE cl.city.country = cr.city.country;
The base semantics of absent JSON properties is consistent with their representation as SQL NULL.
Reasons for Absent JSON Properties
Why would JSON properties be present in some JSON structures of a JSON column, and absent from others? There are many possible legitimate reasons, some are discussed next:
- Life Cycle Implementations (Application Design)
- The life cycle of a concept (e.g., person) is implemented by adding property values over time. Properties are stored corresponding to the life cycle. For example, initially a person is added by name, with city and other information added later as they become known.
- Schema Evolution
- A property that was not required in the past (from a data model perspective) is added due to data model changes. Adding a new property to the already existing JSON structures might or might not be done and if not, different JSON documents have a different structure.
- Errors
- An error in the application system omits writing a property that should have been written. A real bug.
- Intentional Design Decision
- An application system was designed to only write properties that have values and to omit properties that do not have a value (yet). A corresponding design for deletion is that if a value of a property is deleted, the whole property is removed instead of being set to e.g. JSON null.
There is a design philosophy that emphasizes that even through JSON supports varying structures, they ideally should follow the same schema. This would mean that all JSON structures in a JSON column have the exact same set of properties. However, another design philosophy says that the dynamic nature of JSON structures should be taken advantage of avoiding the classical schema migration problem: a JSON structure should only contain those properties that make sense at this state of the structure life cycle.
Checking for Absent JSON Properties: JSON_EXISTS()
Since each JSON structure in a JSON column can have different schema it might be important to explicitly check for properties in SQL queries for query reliability and also documentation reasons.
Oracle 12c implements the function JSON_EXISTS() [https://docs.oracle.com/database/121/ADXDB/json.htm#ADXDB6246] that allows to check for existence of a property. For example:
SELECT DISTINCT c.city.city.name AS name FROM cities c WHERE json_exists(c.city, '$.country');
This query only selects those city names of cities that have their country specified.
More queries using JSON_EXISTS() follow:
CREATE OR REPLACE VIEW all_cities AS SELECT DISTINCT c.city.city.name AS name FROM cities c WHERE json_exists(c.city, '$.country');
SELECT p.person FROM people p, all_cities all_c WHERE p.person.city = all_c.name AND json_exists(p.person, '$.city');
Details on JSON_EXISTS()
JSON_EXISTS() supports error handling clauses. For example, when a query accesses a JSON column with incorrect JSON structures, a choice is to ignore rows with incorrect JSON structures, include them into the result, or throw an error when processing the query. An example follows:
DROP TABLE incorrect_json; CREATE TABLE incorrect_json ( id NUMBER, data CLOB); INSERT INTO incorrect_json VALUES (1, '{ "name": "item1", "value": "value1"}'); INSERT INTO incorrect_json VALUES (2, '{ "name": "item2", "value": "value2_broken_json}');
SELECT * FROM incorrect_json ij WHERE json_exists(ij.data, '$.value' false ON error);
SELECT * FROM incorrect_json ij WHERE json_exists(ij.data, '$.value' true ON error);
SELECT * FROM incorrect_json ij WHERE json_exists(ij.data, '$.value' error ON error);
JSON_EXISTS() expects a column that can store json structures as strings; when applied to a non-string column, an error indicates this:
SELECT * FROM incorrect_json ij WHERE json_exists(ij.id, '$.missing_property');
When applied to a non-JSON column, the error message looks different:
DROP TABLE incorrect_json2; CREATE TABLE incorrect_json2 ( id VARCHAR(10), data CLOB); INSERT INTO incorrect_json2 VALUES ('one', '{ "name": "item1", "value": "value1"}'); INSERT INTO incorrect_json2 VALUES ('two', '{ "name": "item2", "value": "value2_broken_json}');
SELECT * FROM incorrect_json2 ij WHERE json_exists(ij.id, '$' error ON error);
SELECT * FROM incorrect_json2 ij WHERE json_exists(ij.id, '$' false ON error);
SELECT * FROM incorrect_json2 ij WHERE json_exists(ij.id, '$' true ON error);
Summary
Varying schema in JSON structures can be explicitly explored using the JSON_EXISTS() function and therefore is a way to incorporate schema variation into SQL queries. This makes a powerful combination.
Go SQL!
Disclaimer
The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.