Oracle 12c Release 2: A Complete JSON Database

Today’s blog focuses on Oracle 12c Release 2. With this release of the database additional new interesting functionality was introduced: sharding and analytics support for JSON. Very interesting in context of JSON processing – future blogs will continue the JSON SQL discussion.

Oracle as a JSON Database

With Oracle 12c Release 2’s functional additions Oracle 12c is providing all major functional areas of functionality in context of JSON processing. Non-functional areas (backup, restore, replication, HA/DR support,  etc.) are supported as well, and there is no need to discuss those here.

The major areas of functionality in this context are

  • JSON OLTP Processing. Oracle 12c supports general OLTP (online transaction processing) functionality. JSON documents are stored in tables and accessed through SQL (full DML support).
  • JSON Analytics. Oracle 12c supports JSON analytics processing in the Oracle In-Memory option providing columnar representation and columnar processing.
  • JSON Sharding. Oracle 12c supports managed sharding of data in tables of independently running databases (shards); up to 1000 shards currently.

Details of JSON OLTP, Analytics and Sharding

The following presentation contains a first level of details for the three areas above. I gave it during an Oracle Code event (https://go.oracle.com/oraclecode) in New York (https://developer.oracle.com/code/newyork) earlier this month. References to more detailed Oracle 12c documentation are provided in the presentation itself.

The presentation can be downloaded from here: Oracle Code NYC Presentation.

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Advertisement

Oracle 12c – In-Memory Option (Part 3): JSON Support

It is quite natural to view Oracle’s In-Memory Option in context of the relational model; however, the In-Memory Option supports the JSON model at the same time as well.

Oracle JSON Support

As shown earlier in this blog, the Oracle 12c database supports JSON natively [http://docs.oracle.com/database/121/ADXDB/json.htm#ADXDB6246] and incorporates JSON access into SQL so that JSON structures can be accessed through SQL directly. In addition, in a single SQL statement, JSON structures as well as relational tables can be accessed at the same time. Please see the documentation and the other blog entries for more details.

Oracle In-Memory Option

The In-Memory Option [http://www.oracle.com/us/corporate/features/database-in-memory-option/] was introduced in the previous two blog entries and those discussed a relational example as well as queries that allow to introspect the meta data and status of the In-Memory Option processing. It gave an overview as well as the rationale why using the In-Memory Option for analytic queries is advantageous over regular relational processing (columnar representation of data in main memory).

Combination: JSON processing using In-Memory Option

Analytics in context of JSON structures would benefit from In-Memory Option support as well. Many applications based on JSON structures usually have to support some form of analytics. The In-Memory Option supports aggregation functionality on JSON structures as well as on relational tables.

In the following, an example is introduced that creates a table containing a JSON structure, enables it for In-Memory Option support and shows a few simple analytics queries.

Example Table with JSON Structure

The following declares a sample table with a column containing a JSON structure.

DROP TABLE js_players;
CREATE TABLE js_players
 (
   player_id NUMBER NOT NULL,
   player_name VARCHAR(255),
   games VARCHAR(4000) 
    CONSTRAINT games_ensure_json
    CHECK (games IS JSON (STRICT WITH UNIQUE KEYS)));

Table Population

In order to have a large data set the following block creates rows containing JSON structures and inserting them into the above declared table.

DECLARE
  counter NUMBER;
  player_name VARCHAR(128);
  games_value VARCHAR (256);
BEGIN
  counter := 0;
  LOOP
    counter := counter + 1;
    player_name := 'Divvon' || '_' || counter;
    games_value := '{ "games":
      [{"name":"tic-tac-toe", 
        "points":' || counter *10 || '},
       {"name":"one-two-three", 
        "points":' || counter * 100 || '}
      ]}';
    INSERT INTO js_players VALUES
      (counter, player_name, games_value);
  EXIT
    WHEN counter = 1000000;
  END LOOP;
  COMMIT;
  RETURN;
END;

An example JSON document looks like this:

{"games":
  [{"name":"tic-tac-toe", "points":750},
   {"name":"one-two-three", "points":7500}
  ]}

Enabling In-Memory Option

This statement enables the In-Memory Option for the table containing JSON structures:

ALTER TABLE js_players inmemory priority critical;

Aggregation Queries

The following queries show aggregation over elements of the JSON structures. Each query extracts all ‘points’ values from the JSON structure (since every document might have several ‘points’ fields in the array of ‘games’) and makes is accessible as ‘pts.points’. Then it aggregates over this structure. The first query aggregates over all rows, whereas the second query is selecting only a few rows based on ‘player_name’.

SELECT DISTINCT p.player_name,
       SUM(pts.points) sum_points,
       MIN(pts.points) min_points,
       MAX(pts.points) max_points,
       AVG(pts.points) avg_points,
       COUNT(*)
FROM js_players p,
     json_table(p.games, 
                '$.games[*]' COLUMNS (points VARCHAR2(32 CHAR) PATH '$.points')) 
         AS pts
GROUP BY p.player_name
ORDER BY sum_points DESC;

01_blog

SELECT DISTINCT p.player_name,
       SUM(pts.points) sum_points,
       MIN(pts.points) min_points,
       MAX(pts.points) max_points,
       AVG(pts.points) avg_points,
       COUNT(*)
FROM js_players p,
     json_table(p.games, 
                '$.games[*]' COLUMNS (points VARCHAR2(32 CHAR) PATH '$.points')) 
       AS pts
WHERE p.player_name = 'Divvon_204'
      OR p.player_name = 'Divvon_206'
GROUP BY p.player_name
ORDER BY sum_points DESC;

02_blog

Summary

Oracle’s In-Memory Option, because it not only supports the relational model, but also the JSON model, is an interesting alternative to analytics in context of JSON data sets. This blog has shown an example of how to combine JSON structures and the In-Memory Option in order to be able to run analytics queries.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Oracle 12c – In-Memory Option (Part 2)

Part 1 was introducing an Oracle In-Memory Option example; in Part 2 I’ll introduce a few queries that show meta data and statistics in context of the Oracle In-Memory Option.

Table Status

One interesting set of metadata is the table’s in-memory status as well as the compression and loading priority. A table with enabled in-memory option is called “in-memory table” in the following for short. The following query retrieves these metadata for the in-memory table “IM_PLAYERS”:

SELECT table_name,
       inmemory,
       inmemory_compression,
       inmemory_priority
FROM dba_tables
WHERE table_name = 'IM_PLAYERS';

01

Loading of in-memory tables takes some time on initial load and their load status is important. Only when loaded the full capability of the in-memory functionality is available. The following query shows the load status by retrieving the number of bytes not populated into the in-memory area yet:

SELECT v.segment_name name,
       v.populate_status populate_status,
       v.bytes_not_populated bytes_not_populated
FROM v$im_segments v
ORDER BY 1;

02

The size of an in-memory table is important in order to determine if the memory allocation for the in-memory option is sufficient. The following query fetches various sizes:

SELECT v.owner,
       v.segment_name,
       v.populate_status,
       ROUND(v.bytes / 1000 / 1000, 2) 
           AS size_in_mb,
       ROUND(v.inmemory_size / 1000 / 1000, 2) 
           AS inmemory_size_in_mb,
       ROUND(v.inmemory_size / v.bytes, 2) * 100 
           AS im_size_percent
FROM v$im_segments v;

03

Basic Usage

When running queries against in-memory tables the in-memory option might or might not be triggered and used depending on the query optimizer’s decisions. The following two statements executed together show if the in-memory option was used:

EXEC dbms_feature_usage_internal.
         exec_db_usage_sampling(SYSDATE);
SELECT ul.name,
       ul.detected_usages
FROM dba_feature_usage_statistics ul
WHERE ul.version =
          (SELECT MAX(u2.version)
           FROM dba_feature_usage_statistics u2
           WHERE u2.name = ul.name
                 AND ul.name LIKE 'In-%');

04

A query to an in-memory table advances the value of detected_usages if the in-memory representation of that table was in fact accessed.

Statistics

If a query triggers the use of the columnar representation managed by the in-memory option it is important to understand the specific statistics to see how much optimization took place. The following query retrieves some of all available statistics:

SELECT t.display_name,
       s.value,
       s.sid,
       t.statistic#
FROM v$statname t, v$mystat s
WHERE t.display_name IN ('IM scan CUs columns accessed', 
                         'IM scan segments minmax eligible', 
                         'IM scan CUs pruned')
      AND s.statistic# = t.statistic#
ORDER BY s.sid DESC;

05

Being able to retrieve the statistics allows now to see the statistics before and after query execution. In the next section a selective as well as aggregate query is run and the statistics examined before and afterwards.

Example

The first example is a selective query that retrieves all columns of two players (rows).

SELECT * 
FROM cb.im_players 
WHERE player_id = 10 OR player_id = 20;

06

First, the statistics is shown before the query is executed and then the statistics after the query is executed.

05

07

The second example is a non-selective query that aggregates over all rows.

SELECT player_name,
       COUNT(*),
       MIN(player_score),
       MAX(player_score),
       AVG(player_score),
       SUM(player_score)
FROM cb.im_players
GROUP BY player_name;

08

Again, first the statistics before and after the query execution is shown.

07

09

Summary

This blog introduced a series of queries that allow to retrieve metadata as well as statistics on query execution in context of the Oracle In-Memory Option. The nice part is that these are regular SQL queries that can be run manually, but also as part of development or operations tools. It is therefore possible to examine the use of the in-memory tables and their behavior in a running application.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Oracle 12c – In-Memory Option (Part 1)

The name ‘In-Memory Option’ does not mean Oracle turned into a general main memory database or has a new option to do so; it means something else entirely.

Aggregation

Columnar data representation is geared towards aggregation of columns in relational databases. In order to make aggregation fast, the rows of a column should be ‘near’ to each other in terms of memory layout; the best would be if the rows of the column would be next to each other. Access for aggregation is very efficient with this layout. A database implementing this approach is called a columnar database (http://en.wikipedia.org/wiki/Column-oriented_DBMS).

Specialized databases exist that structure the data on disk as well as in memory in a columnar format. Once the data is in such a database, aggregation can be fast. However, the data has to be in the database and chances are that the same data is in a separate OLTP database also (a second copy). This then means that a data integration problem has to be solved in order to ensure that data that is managed in the OLTP database is also available (with sufficiently minor delay) in the columnar database (or columnar store). And, since it changes in the OLTP database, the integration effort is an ongoing process.

In-Memory Option

Oracle 12c’s In-Memory Option [http://www.oracle.com/us/corporate/features/database-in-memory-option/] combines the columnar storage with OLTP storage in the same database. The columnar storage layout is only implemented in main memory, leaving the disk layout as it is. This means that OLTP and columnar access is possible in the same database, concurrently, and the data is always 100% transactionally consistent, as no delayed data integration has to take place. The ‘artificial’ separation of database functionality into separate databases to satisfy aggregation processing requirements is not necessary anymore in such a database management system.

Example

Since Oracle 12c combines the relational and the columnar representation, using the columnar functionality is fully declarative without any operational management overhead.

In the following a small example shows how to make use of the Oracle 12c In-Memory Option concurrently to the regular OLTP load.

Assume a regular table that has been in place for a while and is accessed by a regular application (OLTP use case). This table manages data and is changed according to the application logic.

DROP TABLE im_players;
CREATE TABLE im_players
  (
   player_id NUMBER,
   player_name VARCHAR2(128),
   player_score NUMBER);

Here is a way to populate the table with random content for testing purposes:

DECLARE
  player_id NUMBER;
  player_score NUMBER;
  player_name VARCHAR(128);
BEGIN
  player_id := 0;
  LOOP
    player_id := player_id + 1;
    player_name := 'player_' || player_id;
    player_score := player_id * 10;
    INSERT INTO im_players VALUES
      (player_id, player_name, player_score);
    player_score := player_id * 100;
    INSERT INTO im_players VALUES
      (player_id, player_name, player_score);
    player_score := player_id * 1000;
    INSERT INTO im_players VALUES
      (player_id, player_name, player_score);
  EXIT
    WHEN player_id = 1000000;
  END LOOP;
  COMMIT;
  RETURN;
END;

A new requirement has to be fulfilled based on business needs. The requirement is to analyze the existing table regularly while it is being used by the OLTP application. In the following this requirement is implemented with the Oracle 12c In-Memory Option.

First, memory has to be set aside for the columnar representation of the data in main memory:

ALTER system SET inmemory_size = 300M scope=spfile;

Restarting the database enables this change and this query confirms the setting:

SELECT name, value FROM v$sga;

01

Now the database has a main memory area (In-Memory Area) set aside for the columnar representation of relational tables.

Next, the existing table definition is modified to additionally be represented in columnar representation:

ALTER TABLE im_players inmemory priority critical;

There are different options that indicate how critical it is to have the rows in columnar format, how to compress depending on the access model, and actually which columns should be in columnar format as not all columns are necessarily required.

The alter statement above indicates that it is critical for the data to be in columnar format. The database will diligently load the data into the columnar memory section. Compression is enabled for low query access. All columns are enabled, based on the use case.

The following query shows how much data is not yet populated. Loading data is a process that takes time and the query supports checking the load status. Once the loading is completed, zero bytes are not yet populated.

SELECT 
  v.segment_name name,
  v.populate_status populate_status,
  v.bytes_not_populated bytes_not_populated
FROM v$im_segments v
ORDER BY 1;

02

The following query performs an aggregation.

SELECT
  player_name,
  COUNT(*),
  MIN(player_score),
  MAX(player_score),
  AVG(player_score),
  SUM(player_score)
FROM im_players
GROUP BY player_name;

??03

03

The query plan of this query shows that the columnar representation is being accessed.

----------------------------------------------------------------------------------------
| Id | Operation                  | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT           |            | 3025K | 227M  | 216 (38)   | 00:00:01 |
|  1 | HASH GROUP BY              |            | 3025K | 227M  | 216 (38)   | 00:00:01 |
|  2 | TABLE ACCESS INMEMORY FULL | IM_PLAYERS | 3025K | 227M  | 141 (5)    | 00:00:01 |
----------------------------------------------------------------------------------------

And this query shows access metrics indicating that the query actually executed against the columnar representation. The values after executing the aggregation query are different from the values before the aggregation.

Note on SQL

An important aspect of the Oracle 12c In-Memory Option is that the SQL specifying the required aggregation is regular SQL and no new or different language constructs had to be introduced. The underlying reason is that the query optimizer has sufficient knowledge to make the appropriate access choices based on the SQL presented and does not require hints from users. This is a huge engineering benefit since no new or different language has to be learned to make use of the columnar representation.

Summary

This was a super-brief introduction into the Oracle 12c In-Memory Option. The documentation has of course a lot more background [http://docs.oracle.com/database/121/ADMIN/memory.htm#ADMIN14257]. In addition, a nice blog series exists that introduces the various aspects step-by-step [https://blogs.oracle.com/In-Memory/entry/getting_started_with_oracle_database].

From a data management aspect an optimized representation targeted towards aggregation is enabled declaratively in context of the same database that holds the OLTP data. This is a huge benefit as one database supports the mixed workload and any data integration setup between different databases becomes unnecessary. With significantly increasing main memory sizes this option allows freeing up engineering cycles as well as operational resources while increasing data consistency and processing performance.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Oracle 12c – SQL for JSON (Part 7): Creating JSON from Relational Data

Up to now the examples of previous blogs inserted JSON data into JSON enabled columns of tables. This blog shows how JSON data can be created or derived automatically, and, more importantly, how relational data can be transformed and represented automatically as hierarchical JSON data.

Virtual Columns

A virtual column of a table in Oracle is defined by referring to a function invocation. The value of the column is the function invocation result. Parameters (column values of the same table) can be passed to the function and so the functionality of the function can be dependent on the data content of the row.

A simple example is shown in the following. A virtual column is created that contains a name – value pair based on a non-virtual column of the same table (the function is string concatenation).

DROP TABLE vc;
CREATE TABLE vc
  (
   id NUMBER NOT NULL,
   name VARCHAR(255),
   value generated always AS (
     '"name"' || ':' || '"' || name || '"' ));
INSERT INTO vc (id, name) VALUES (1, 'abc');
INSERT INTO vc (id, name) VALUES (2, 'def');
SELECT * FROM vc;

1

Value Column with JSON Structure

The following example shows a similar example, however, in this case the virtual column is a JSON column and the contents is a JSON object containing the name – value pair. The table definition ensures that the column is a JSON column and that it can be accessed with the JSON functions.

DROP TABLE vc;
CREATE TABLE vc
 (
   id NUMBER NOT NULL,
   name VARCHAR(255),
   value generated always AS (
     '{' || '"name"' || ':' || '"' || name || '"' || '}') 
     virtual CONSTRAINT value_ensure_json 
     CHECK (value IS JSON (STRICT WITH UNIQUE KEYS)) );
INSERT INTO vc (id, name) VALUES (1, 'abc');
INSERT INTO vc (id, name) VALUES (2, 'def');
SELECT * FROM vc;

2

Separate Function creating JSON Structure

Functions can be created separately as first class citizens in the database (for the purpose of proper engineering practices as well as functionality reuse); they do not necessarily have to be inline invocations. The following example creates a separate function that creates a JSON object containing a name – value pair.

CREATE OR REPLACE FUNCTION 
  to_JSON(in_name IN VARCHAR2,
          in_value IN VARCHAR2)
  RETURN VARCHAR2 deterministic
  IS
    JSON_value VARCHAR(255);
  BEGIN
    JSON_value := 
      '{' || '"' || in_name || '"' || ':' || 
             '"' || in_value || '"' || '}';
  RETURN JSON_value;
END;
DROP TABLE vc;
CREATE TABLE vc
 (
   id NUMBER NOT NULL,
   name VARCHAR(255),
   value generated always AS (
     to_JSON('name', name)) 
     virtual CONSTRAINT value_ensure_json 
     CHECK (value IS JSON (STRICT WITH UNIQUE KEYS)));
INSERT INTO vc (id, name) VALUES (1, 'abc');
INSERT INTO vc (id, name) VALUES (2, 'def');
SELECT * FROM vc;

3

Function with Selection

Functions have the full range of SQL available and can execute queries in their implementation. The following example queries a second table and creates a JSON object from the result. Note that the query depends on the input parameter. 

CREATE OR REPLACE FUNCTION 
  from_dep(in_value IN VARCHAR2)
  RETURN VARCHAR2 deterministic
IS
  JSON_value VARCHAR(255);
  dep_value  VARCHAR(255);
BEGIN
  EXECUTE IMMEDIATE 'SELECT ' || ' adr ' || 
                    ' FROM ' || ' dep ' 
                    INTO dep_value;
  JSON_value := '{' || '"' || in_value || '"' || ':' || 
                       '"' || dep_value || '"' || '}';
  RETURN JSON_value;
END;
DROP TABLE vc;
CREATE TABLE vc
 (
   id NUMBER NOT NULL,
   name VARCHAR(255),
   value generated always AS 
     (from_dep(name)) 
     virtual CONSTRAINT value_ensure_json 
     CHECK (value IS JSON (STRICT WITH UNIQUE KEYS)) );
DROP TABLE dep;
CREATE TABLE dep
 (
    adr VARCHAR(255));
INSERT INTO dep VALUES ('yahoo');
INSERT INTO vc (id, name) VALUES (1, 'abc');
INSERT INTO vc (id, name) VALUES (2, 'def');
SELECT * FROM vc;

4

Hierarchical JSON Column

The last example shows how a function can create complex hierarchical JSON structures. This allows for denormalizing a relational model into a JSON document representation as needed. This is the use case where a client requires a self-contained JSON object that contains all necessary data so that the client does not have to perform repeated queries to obtain all data.

CREATE OR REPLACE FUNCTION 
  from_dep(v_id IN NUMBER)
  RETURN VARCHAR2 deterministic
IS
  JSON_value VARCHAR(255);
  dep_value  VARCHAR(255);
  CURSOR dep_cur
    IS
      SELECT * FROM dep d WHERE d.vc = v_id;
  l_dep_adr dep%ROWTYPE;
  is_first NUMBER;
BEGIN
  is_first := 0;
  JSON_value := '[';
  OPEN dep_cur;
  LOOP
    FETCH dep_cur INTO l_dep_adr;
    EXIT
      WHEN dep_cur%NOTFOUND;
    IF (is_first = 0) 
    THEN
      JSON_value := JSON_value || 
        '{' || '"' || 'adr' || '"' || ':' || 
               '"' || l_dep_adr.adr || '"' || '}';
      is_first := 1;
    ELSE
      JSON_value := JSON_value || ',' || 
        '{' || '"' || 'adr' || '"' || ':' || 
               '"' || l_dep_adr.adr || '"' || '}';
    END IF;
  END LOOP;
  CLOSE dep_cur;
  JSON_value := JSON_value || ']';
  RETURN JSON_value;
END;
DROP TABLE vc;
CREATE TABLE vc
 (
   id NUMBER NOT NULL,
   name VARCHAR(255),
   value generated always AS 
     (from_dep(id)) 
     virtual CONSTRAINT value_ensure_json 
     CHECK (value IS JSON (STRICT WITH UNIQUE KEYS)) );
DROP TABLE dep;
CREATE TABLE dep
 (
   vc NUMBER, adr VARCHAR(255));
INSERT INTO vc (id, name) VALUES (1, 'abc');
INSERT INTO vc (id, name) VALUES (2, 'def');
INSERT INTO dep VALUES (1, 'yahoo');
INSERT INTO dep VALUES (2, 'yahaa');
INSERT INTO dep VALUES (2, 'yahee');
SELECT * FROM vc;

5

Summary

Virtual columns are an interesting mechanism when it comes to denormalizing a relational data model into a hierarchical JSON document model. This blog gave some examples upon which more complex functionality can be built. It shows that Oracle 12c can not only query JSON, but also generate JSON structures from relational data supporting a large array of use cases.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Oracle 12c – SQL for JSON (Part 6): Un-nesting

JSON supports nested structures and the particular shape of the nesting is decided by the application systems or by a data modeler for different reasons. How can nested JSON structures be queried with SQL?

Nested JSON Structures

Nested JSON structures are based on two types: JSON objects and JSON arrays. Objects can have several members; each member is either of scalar type, object type or array type. An array can contain scalar values, objects or arrays. This nesting of objects with objects and arrays provides the mechanism to build (theoretically unlimited deep) nested structures.

Nested structures are used in most cases to model a part-of relationship or an association relationship. As the nested structures are trees, there is a 1:n relationship between the higher and the lower nodes (without cycles as JSON types do not include reference types).

Folklore says that nesting is sufficient to ‘avoid’ joins in context of JSON databases, but professionals know a lot better.

In the following, an example data set is introduced and a few queries afterwards that specifically deal with nested structures in order to query those in context of SQL. A key JSON function in context of Oracle 12c is json_table() [https://docs.oracle.com/database/121/ADXDB/json.htm#ADXDB6246] combined with un-nesting.

Example Data Set

DROP TABLE reseller;
CREATE TABLE reseller
  (
    reseller_id   NUMBER NOT NULL,
    reseller_name VARCHAR(255),
    suppliers CLOB CONSTRAINT suppliers_ensure_json 
    CHECK (suppliers IS JSON (STRICT WITH UNIQUE KEYS)));
INSERT INTO reseller VALUES
(1, 'abc corp.',
  '{ "suppliers": [
    {"supplier_name": "Sign Corp", "parts": [
      {"partNo": 1, "price": 100, "quantity": 2},
      {"partNo": 2, "price": 200, "quantity": 4}]},
    {"supplier_name": "Stand Corp.", "parts": [
      {"partNo": 10, "price": 1000, "quantity": 6},
      {"partNo": 11, "price": 1100, "quantity": 8},
      {"partNo": 12, "price": 1200, "quantity": 10}]}]}');
INSERT INTO reseller VALUES
(2, 'xyz corp.',
  '{ "suppliers": [
    {"supplier_name": "Sign Corp", "parts": [
      {"partNo": 1, "price": 100, "quantity": 12},
      {"partNo": 3, "price": 300, "quantity": 14}]},
    {"supplier_name": "Stand Corp.", "parts": [
      {"partNo": 10, "price": 1000, "quantity": 16},
      {"partNo": 11, "price": 1100, "quantity": 18},
      {"partNo": 12, "price": 1200, "quantity": 20},
      {"partNo": 13, "price": 1300, "quantity": 22},
      {"partNo": 14, "price": 1400, "quantity": 24}]}]}');

Simply Query: Listing all Suppliers

The first query is going to list the distinct set of supplier names. Supplier names are part of objects inside arrays. Therefore the query will reach into the tree and iterate over the suppliers (using the [*] notation that says to iterate over all array elements) in order to pick out the supplier name (accomplished by the PATH expression). A column is declared that captures all supplier names and a distinct projection ensures that each supplier name is included only once into the result set.

SELECT DISTINCT s.*
FROM reseller r,
     json_table(r.suppliers,
                '$.suppliers[*]'
                COLUMNS (supplier_name VARCHAR2(32 CHAR)
                         PATH '$.supplier_name')) AS s;

Result:
1
Changing [*] to [1] only retrieves the first supplier and since the first supplier is the same across the data set of JSON documents only one supplier name appears in the result set.

SELECT DISTINCT s.*
FROM reseller r,
     json_table(r.suppliers,
                '$.suppliers[1]'
                COLUMNS (supplier_name VARCHAR2(32 CHAR)
                         PATH '$.supplier_name')) AS s;

Result:
2

Deeper Reach: Listing all Parts and Prices

The next query lists all parts and their prices. It creates two columns, one for the part number and one for the price. However, since each supplier can supply more than one part, the query has to iterate over two arrays (using the [*] notation), once over suppliers, and once over all parts within each supplier. The result is ordered in order to provide a consistent layout. 

SELECT DISTINCT s.*
FROM reseller r,
     json_table(r.suppliers,
                '$.suppliers[*].parts[*]'
                COLUMNS (part_no NUMBER
                         PATH '$.partNo',
                         part_price NUMBER
                         PATH '$.price')) AS s
ORDER BY s.part_no;

Result:
3
It would be more interesting to see the supplier name also for each of the parts. This is where un-nesting is applied.

Un-nesting: Listing all Parts and Prices for each Supplier and for each Reseller

The next query lists the parts and their prices for each supplier, and for each reseller. This allows to see the reseller, the suppliers they use, as well as the parts that are supplied.

SELECT r.reseller_name,
       s.*
FROM reseller r,
     json_table(r.suppliers,
                '$.suppliers[*]'
                COLUMNS (supplier_name VARCHAR2(32 CHAR)
                         PATH '$.supplier_name',
                         NESTED path '$.parts[*]'
                         COLUMNS (part_no VARCHAR2(32 CHAR)
                                  PATH '$.partNo',
                                  part_price NUMBER
                                  PATH '$.price'))) AS s
ORDER BY r.reseller_name,
         s.supplier_name,
         s.part_no;

Result:
4
Note the keyword NESTED that transforms a hierarchical relationship into a relational representation by repeating the column value on the higher level. In this example, for each part_no and part_price, the supplier_name is provided.

Computation on Un-nested JSON Data

The following query shows the inventory and inventory value for each reseller and supplier. The value is dynamically computed by the query.

SELECT r.reseller_name,
       s.*,
       s.part_price * s.part_quantity AS inventory_value
FROM reseller r,
     json_table(r.suppliers,
                '$.suppliers[*]'
                COLUMNS (supplier_name VARCHAR2(32 CHAR)
                         PATH '$.supplier_name',
                         NESTED path '$.parts[*]'
                         COLUMNS (part_no VARCHAR2(32 CHAR)
                                  PATH '$.partNo',
                                  part_price NUMBER
                                  PATH '$.price',
                                  part_quantity NUMBER
                                  PATH '$.quantity'))) AS s
ORDER BY r.reseller_name,
         s.supplier_name,
         s.part_no;

Result:
5

Aggregation on Un-nested JSON Data

The following query shows the total inventory of each reseller by aggregation in the query.

SELECT r.reseller_name,
       SUM(s.part_price * s.part_quantity) AS inventory_value
FROM reseller r,
     json_table(r.suppliers,
                '$.suppliers[*]'
                COLUMNS (supplier_name VARCHAR2(32 CHAR)
                         PATH '$.supplier_name',
                         NESTED path '$.parts[*]'
                         COLUMNS (part_no VARCHAR2(32 CHAR)
                                  PATH '$.partNo',
                                  part_price NUMBER
                                  PATH '$.price',
                                  part_quantity NUMBER
                                  PATH '$.quantity'))) AS s
GROUP BY r.reseller_name
ORDER BY r.reseller_name;

Result:
6

Changing to Seller Perspective

The following query shows the quantity and sales of each type of part to resellers. The perspective changes from a reseller perspective so far to a seller perspective. In SQL this is straight-forward, however, in context of nested JSON structures this would mean processing of the 1:n relationship in the opposite direction.

SELECT s.supplier_name,
       r.reseller_name,
       s.part_no,
       s.part_price,
       s.part_quantity,
       s.part_price * s.part_quantity AS sales_value
FROM reseller r,
     json_table(r.suppliers,
                '$.suppliers[*]'
                COLUMNS (supplier_name VARCHAR2(32 CHAR)
                         PATH '$.supplier_name',
                         NESTED path '$.parts[*]'
                         COLUMNS (part_no VARCHAR2(32 CHAR)
                                  PATH '$.partNo',
                                  part_price NUMBER
                                  PATH '$.price',
                                  part_quantity NUMBER 
                                  PATH '$.quantity'))) AS s
ORDER BY s.supplier_name,
         r.reseller_name,
         s.part_no;

Result:
7

Aggregation of Seller Data

And, finally, this query shows the total sales by each supplier to each reseller.

SELECT s.supplier_name,
       r.reseller_name,
       SUM(s.part_price * s.part_quantity) AS sales_value
FROM reseller r,
     json_table(r.suppliers,
                '$.suppliers[*]'
                COLUMNS (supplier_name VARCHAR2(32 CHAR)
                         PATH '$.supplier_name',
                         NESTED path '$.parts[*]'
                         COLUMNS (part_no VARCHAR2(32 CHAR)
                                  PATH '$.partNo',
                                  part_price NUMBER
                                  PATH '$.price',
                                  part_quantity NUMBER
                                  PATH '$.quantity'))) AS s
GROUP BY s.supplier_name,
         r.reseller_name
ORDER BY s.supplier_name;

Result:
8

Summary

As shown, the JSON function json_table() supports transforming hierarchically structured data into relationally structured data so that it is possible to execute expressive SQL queries, not only retrieving data, but also aggregating data on different levels of the hierarchy.

Once the hierarchical structure is un-nested, the data can be queried from different perspectives, aka, the 1:n relationship in the nested structure can be processed in both directions.

This is another example of the quite powerful and convenient integration of JSON into the world of relational databases and SQL. From an application development perspective this means that JSON documents can be used and at the same time made accessible to the full power of SQL.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Oracle 12c – SQL for JSON (Part 5): Property Existence

JSON structures do not have to follow a global schema in Oracle 12c; each JSON structure in a JSON column can have its own schema, with varying properties. This blog discusses the function JSON_EXISTS() that checks the existence of a property in a JSON structure and its application in SQL queries.

Sample Data Set

A sample data set for this blog is as follows. Note that not all JSON structures follow the same schema.

DROP TABLE people;
CREATE TABLE people
  (
    id NUMBER,
    person CLOB CONSTRAINT person_ensure_json 
    CHECK (person IS JSON (STRICT WITH UNIQUE KEYS)));
INSERT INTO people VALUES
(1, '{ "name": "Bob", "city": "SF"}');
INSERT INTO people VALUES
( 2, '{ "name": "Jake", "city": "PA"}');
INSERT INTO people VALUES
( 3, '{ "name": "Alice", "city": "NYC"}');
INSERT INTO people VALUES
( 4, '{ "name": "Jenn", "city": "Tokyo"}');
INSERT INTO people VALUES
( 5, '{ "name": "Frank", "city": "Berlin"}');
INSERT INTO people VALUES
( 6, '{ "name": "Jup", "city": "Paris"}');
INSERT INTO people VALUES
( 7, '{ "name": "Jib"}');
DROP TABLE cities;
CREATE TABLE cities
  (
    id NUMBER,
    city CLOB CONSTRAINT city_ensure_json 
    CHECK (city IS JSON (STRICT WITH UNIQUE KEYS)));
INSERT INTO cities VALUES
(101, '{"city": {"name": "SF", "population": 837442, 
                 "area": 121.4}, 
        "state": "CA", "country": "US"}');
INSERT INTO cities VALUES
(102, '{"city": {"name": "PA", "population": 64403, 
                 "area": 66.787}, 
        "state": "CA", "country": "US"}');
INSERT INTO cities VALUES
(103, '{"city": {"name": "NYC", "population": 8405837, 
                 "area": 783.8}, 
        "state": "NY", "country": "US"}');
INSERT INTO cities VALUES
(104, '{"city": {"name": "Tokyo", "population": 9071577, 
                 "area": 622.99}, 
        "state": null, "country": "Japan"}');
INSERT INTO cities VALUES
(105, '{"city": {"name": "Paris", "population": 10869000, 
                 "area": 2845}, 
        "state": null, "country": "France"}');
INSERT INTO cities VALUES
(106, '{"city": {"name": "Berlin", "population": 3517424, 
                 "area": 891.75}, 
        "state": null, "country": "Germany"}');
INSERT INTO cities VALUES
(107, '{"city": {"name": "Asilomar", "population": 100, 
                 "area": 2}}');

Absent JSON Property

The default SQL query behavior for absent JSON properties is discussed briefly first in order to establish the base semantics. 

  • Projection
    • Absent JSON properties are returned as SQL NULL
    • SELECT DISTINCT c.city.country FROM cities c;
    • Absent properties can therefore be checked for by using SQL NULL in a predicate
    • SELECT DISTINCT c.city.city.name, c.city.country FROM cities c WHERE c.city.country IS NULL;
  • Selection
    • Absent JSON properties do not match in predicates
    • SELECT DISTINCT c.city.city.name FROM cities c WHERE c.city.country = ‘US’;
  • Cartesian Product (join)
    • Absent JSON properties do not contribute to the cartesian product
    • SELECT cl.city.city.name, cr.city.city.name FROM cities cl, cities cr WHERE cl.city.country = cr.city.country;

The base semantics of absent JSON properties is consistent with their representation as SQL NULL.

Reasons for Absent JSON Properties

Why would JSON properties be present in some JSON structures of a JSON column, and absent from others? There are many possible legitimate reasons, some are discussed next:

  • Life Cycle Implementations (Application Design)
    • The life cycle of a concept (e.g., person) is implemented by adding property values over time. Properties are stored corresponding to the life cycle. For example, initially a person is added by name, with city and other information added later as they become known.
  • Schema Evolution
    • A property that was not required in the past (from a data model perspective) is added due to data model changes. Adding a new property to the already existing JSON structures might or might not be done and if not, different JSON documents have a different structure.
  • Errors
    • An error in the application system omits writing a property that should have been written. A real bug.
  • Intentional Design Decision
    • An application system was designed to only write properties that have values and to omit properties that do not have a value (yet). A corresponding design for deletion is that if a value of a property is deleted, the whole property is removed instead of being set to e.g. JSON null.

There is a design philosophy that emphasizes that even through JSON supports varying structures, they ideally should follow the same schema. This would mean that all JSON structures in a JSON column have the exact same set of properties. However, another design philosophy says that the dynamic nature of JSON structures should be taken advantage of avoiding the classical schema migration problem: a JSON structure should only contain those properties that make sense at this state of the structure life cycle.

Checking for Absent JSON Properties: JSON_EXISTS()

Since each JSON structure in a JSON column can have different schema it might be important to explicitly check for properties in SQL queries for query reliability and also documentation reasons.

Oracle 12c implements the function JSON_EXISTS() [https://docs.oracle.com/database/121/ADXDB/json.htm#ADXDB6246] that allows to check for existence of a property. For example:

SELECT DISTINCT c.city.city.name AS name 
FROM cities c 
WHERE json_exists(c.city, '$.country');

1

This query only selects those city names of cities that have their country specified.

More queries using JSON_EXISTS() follow:

CREATE OR REPLACE VIEW all_cities 
AS 
SELECT DISTINCT c.city.city.name AS name 
FROM cities c 
WHERE json_exists(c.city, '$.country');
SELECT p.person 
FROM people p, all_cities all_c 
WHERE p.person.city = all_c.name 
  AND json_exists(p.person, '$.city');

2

Details on JSON_EXISTS()

JSON_EXISTS() supports error handling clauses. For example, when a query accesses a JSON column with incorrect JSON structures, a choice is to ignore rows with incorrect JSON structures, include them into the result, or throw an error when processing the query. An example follows:

DROP TABLE incorrect_json;
CREATE TABLE incorrect_json
  ( id NUMBER, data CLOB);
INSERT INTO incorrect_json VALUES
(1, '{ "name": "item1", "value": "value1"}');
INSERT INTO incorrect_json VALUES
(2, '{ "name": "item2", "value": "value2_broken_json}');
SELECT * 
FROM incorrect_json ij 
WHERE json_exists(ij.data, '$.value' false ON error);

3

SELECT * 
FROM incorrect_json ij 
WHERE json_exists(ij.data, '$.value' true ON error);

4

SELECT * 
FROM incorrect_json ij 
WHERE json_exists(ij.data, '$.value' error ON error);

5

JSON_EXISTS() expects a column that can store json structures as strings; when applied to a non-string column, an error indicates this:

SELECT * 
FROM incorrect_json ij 
WHERE json_exists(ij.id, '$.missing_property');

6

When applied to a non-JSON column, the error message looks different:

DROP TABLE incorrect_json2;
CREATE TABLE incorrect_json2
  ( id VARCHAR(10), data CLOB);
INSERT INTO incorrect_json2 VALUES
('one', '{ "name": "item1", "value": "value1"}');
INSERT INTO incorrect_json2 VALUES
('two', '{ "name": "item2", "value": "value2_broken_json}');
SELECT * 
FROM incorrect_json2 ij 
WHERE json_exists(ij.id, '$' error ON error);

7

SELECT * 
FROM incorrect_json2 ij 
WHERE json_exists(ij.id, '$' false ON error);

8

SELECT * 
FROM incorrect_json2 ij 
WHERE json_exists(ij.id, '$' true ON error);

9

Summary

Varying schema in JSON structures can be explicitly explored using the JSON_EXISTS() function and therefore is a way to incorporate schema variation into SQL queries. This makes a powerful combination.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Oracle 12c – SQL for JSON (Part 4): Basic Views

With JSON support available in Oracle 12c, it is possible to define views on JSON data.

Background

Complex JSON documents do not have to be queried completely every time, and views can provide an easy way to define an important subset of the JSON documents’ content. The relational database management system Oracle 12c implements the view mechanism for SQL and that is available for JSON columns as well.

In the following a few JSON document view examples are discussed on and implemented. Views are defined as well as accessed, directly queried as well as part of other SQL statements.

Example Data Set

First, an example data set is introduced around people living in cities and data about these cities.

DROP TABLE people;
CREATE TABLE people
  (
    id NUMBER,
    person CLOB CONSTRAINT person_ensure_json 
    CHECK (person IS JSON (STRICT WITH UNIQUE KEYS)));
INSERT INTO people VALUES
( 1, '{ "name": "Bob", "city": "SF"}');
INSERT INTO people VALUES
( 2, '{ "name": "Jake", "city": "PA"}');
INSERT INTO people VALUES
( 3, '{ "name": "Alice", "city": "NYC"}');
INSERT INTO people VALUES
( 4, '{ "name": "Jenn", "city": "Tokyo"}');
INSERT INTO people VALUES
( 5, '{ "name": "Frank", "city": "Berlin"}');
INSERT INTO people VALUES
( 6, '{ "name": "Jup", "city": "Paris"}');
DROP TABLE cities;
CREATE TABLE cities
  (
    id NUMBER,
    city CLOB CONSTRAINT city_ensure_json 
    CHECK (city IS JSON (STRICT WITH UNIQUE KEYS)));
INSERT INTO cities VALUES
(101, '{"city": {"name": "SF", "population": 837442, 
                 "area": 121.4}, 
        "state": "CA", "country": "US"}');
INSERT INTO cities VALUES
(102, '{"city": {"name": "PA", "population": 64403, 
                 "area": 66.787}, 
        "state": "CA", "country": "US"}');
INSERT INTO cities VALUES
(103, '{"city": {"name": "NYC", "population": 8405837, 
                 "area": 783.8}, 
        "state": "NY", "country": "US"}');
INSERT INTO cities VALUES
(104, '{"city": {"name": "Tokyo", "population": 9071577, 
                 "area": 622.99}, 
        "state": null, "country": "Japan"}');
INSERT INTO cities VALUES
(105, '{"city": {"name": "Paris", "population": 10869000, 
                 "area": 2845}, 
        "state": null, "country": "France"}');
INSERT INTO cities VALUES
(106, '{"city": {"name": "Berlin", "population": 3517424, 
                 "area": 891.75}, 
        "state": null, "country": "Germany"}');

Views

A few example views and possible uses are discussed in the following. The results of the queries are displayed in graphical representation provided by SQLDeveloper.

Simple View

CREATE OR REPLACE VIEW all_cities
AS
SELECT DISTINCT c.city.city.name 
FROM cities c;
SELECT * FROM all_cities;

1

SELECT p.person
FROM people p, all_cities all_c
WHERE p.person.city = all_c.city;

 2

SELECT p.person.name, all_c.city
FROM people p, all_cities all_c
WHERE p.person.city = all_c.city;

3

View Definition with Predicate

CREATE OR REPLACE VIEW us_cities
AS
SELECT DISTINCT c.city.city.name 
FROM cities c 
WHERE c.city.country = 'US';
SELECT * FROM us_cities;

 4

SELECT p.person.name, us_c.city
FROM people p, us_cities us_c
WHERE p.person.city = us_c.city;

5

View Definition with Computation

CREATE OR REPLACE VIEW city_stats
AS
SELECT 
  c.city.city.name 
    AS name, 
  ROUND(c.city.city.population / c.city.city.area, 2) 
    AS population_density
FROM cities c;
SELECT * FROM city_stats;

 6

SELECT p.person.name, c_s.name, c_s.population_density
FROM people p, city_stats c_s
WHERE p.person.city = c_s.name
ORDER BY c_s.population_density DESC;

7

Summary

The benefit of having JSON support in a relational database management system is that mechanisms and operations beyond querying are available with views being one. It is clear that the ability to define views is important, from a data model and data semantics perspective as well as from a data security and developer convenience viewpoint.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Oracle 12c – SQL for JSON (Part 2): Basic Queries

This blog provides a small tour of basic SQL queries that operate on JSON in context of the Oracle Database 12c Release 1 (12.1.0.2.0).

Sample Data Set

The following very basic data set is used as an example in this blog. It is kept simple in order to be able to focus on the queries without having to deal with complex JSON objects at the same time.

First, a simple table is created that contains a JSON column. Next, some rows containing JSON objects are inserted.

DROP TABLE demo;
CREATE TABLE demo
( id NUMBER,
  player CLOB 
    CONSTRAINT player_ensure_json 
      CHECK (player IS JSON (STRICT WITH UNIQUE KEYS)));
INSERT INTO demo 
VALUES (1, '{"person": "Bob", "score": 10}');
INSERT INTO demo 
VALUES (2, '{"person": "Bob", "score": 20}');
INSERT INTO demo 
VALUES (3, '{"person": "Jake", "score": 100}');
INSERT INTO demo 
VALUES (4, '{"person": "Jake", "score": 200}');
INSERT INTO demo 
VALUES (5, '{"person": "Alice", "score": 1000}');

With the sample data set in place, we can now construct a complex query in several steps.

Selection and Projection

The most basic query selecting the complete data set is

SELECT * FROM demo d;

A basic projection extracting only the person from the JSON objects is

SELECT d.player.person FROM demo d;

A basic selection restricting the JSON objects is

SELECT d.player.person
FROM demo d
WHERE d.player.person IN ('Jake', 'Bob');

The syntax for accessing properties in JSON objects is in principle

<table alias>.<JSON column>.<path to JSON object key>

with variations on JSON array index references if required (http://docs.oracle.com/database/121/ADXDB/json.htm#ADXDB6246).

A more complex selection with an additional restriction is

SELECT d.player.person
FROM demo d
WHERE d.player.score > 0
  AND d.player.person IN ('Jake', 'Bob');

Ordering

Results can be ordered, for example, in the following way

SELECT d.player.person
FROM demo d
WHERE d.player.score > 0
AND d.player.person IN ('Jake', 'Bob');
ORDER BY d.player.person DESC;

Grouping

Results can be grouped also as a preparation for aggregation

SELECT d.player.person
FROM demo d
WHERE d.player.score > 0
  AND d.player.person IN ('Jake', 'Bob');
GROUP BY d.player.person
ORDER BY d.player.person DESC;

Aggregation

Different aggregation functions can be used to do some basic analysis

SELECT d.player.person,
  SUM(d.player.score),
  AVG(d.player.score),
  MIN(d.player.score),
  COUNT(*)
FROM demo d
WHERE d.player.score > 0
  AND d.player.person IN ('Jake', 'Bob');
GROUP BY d.player.person
ORDER BY d.player.person DESC;

Final Result

The final result is show here in table representation (copied from SQLDeveloper)

result

Inspiration

This example was inspired, in fact, by http://www.querymongo.com. There, the MySQL Query

sql_query

is translated to one of MongoDB’s query interfaces to

mongo_query

(web site accessed on 10/21/2014).

Summary

In summary, SQL functionality is available not only for the relational model in the Oracle Database 12c, but also for JSON-based data.

This makes the Oracle database a quite powerful JSON processing environment as querying JSON data is possible through the declarative SQL language.

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Oracle 12c – SQL for JSON (Part 1): Introduction to Native Support

The Oracle Database 12c Release 1 (12.1.0.2.0) [http://www.oracle.com/technetwork/database/enterprise-edition/overview/index.html] introduces native JSON support and SQL access to JSON. This blog post gives a first introduction.

SQL access to JSON: Native Support

SQL access to JSON is a significant development in itself, but native support in context of a relational database it is actually quite huge and exciting. This blog post series will provide a discussion of the various aspects over several installments, including the mixed use of SQL on JSON with SQL on relational tables.

But first things first.

2-Second Overview

This is a 2-second overview showing how to create a table that can store JSON data, how to insert a row containing a JSON object and how to query it with a simple query.

CREATE TABLE supplier
( id NUMBER NOT NULL
    CONSTRAINT supplier_pk PRIMARY KEY,
  supplier_doc CLOB
    CONSTRAINT supplier_doc_ensure_json 
      CHECK (supplier_doc IS JSON));
INSERT INTO supplier
VALUES (125,
'{
  supplierId: 125,
  "supplierName": "FastSupplier"}');
SELECT * FROM supplier;
        ID SUPPLIER_DOC
---------- -------------------------------------------------
       125 {supplierId: 125, "supplierName": "FastSupplier"}

That was easy 🙂

Creating a Table storing JSON

JSON data are stored in columns of regular tables. A constraint placed on a JSON column enforces JSON compliance:

CREATE TABLE supplier
( id NUMBER NOT NULL
    CONSTRAINT supplier_pk PRIMARY KEY,
  supplier_doc CLOB
    CONSTRAINT supplier_doc_ensure_json 
      CHECK (supplier_doc IS JSON));

Any attempt to insert invalid JSON data fails because of this constraint. Other columns are regular relational columns and they can be defined and constrained as necessary.

A JSON column stores JSON objects as well as JSON arrays as both are valid top-level JSON structures. Trying to insert scalars will fail. The following two insert statements are valid:

INSERT INTO supplier
VALUES (125,
'{
  supplierId: 125,
  "supplierName": "FastSupplier"}');
INSERT INTO supplier
VALUES (128,
'["empty_list_of_supplier"]');

While the JSON [json.org] standard allows duplicate members (aka, keys), many implementations only tolerate or even outright reject it. To avoid storing JSON objects with duplicate keys, the constraint of a JSON column can be extended:

CREATE TABLE supplier
( id number NOT NULL
    CONSTRAINT supplier_pk PRIMARY KEY,
  supplier_doc CLOB
    CONSTRAINT supplier_doc_ensure_json 
      CHECK (supplier_doc IS JSON (WITH UNIQUE KEYS)));

The following insert will fail with the additional constraint “WITH UNIQUE KEYS” (because of a duplicate keys), but would succeed otherwise.

INSERT INTO supplier
VALUES (126,
'{
  "supplierId": 126,
  "supplierName": "FastSupplier",
  "supplierName": "FS"}');

JSON has a defined syntax, however, many implementations relax it by e.g. allowing to state member names without quotes. To enforce a strict syntax, the constraint on a JSON table can be extended:

CREATE TABLE supplier
( id number NOT NULL
    CONSTRAINT supplier_pk PRIMARY KEY,
  supplier_doc CLOB
    CONSTRAINT supplier_doc_ensure_json 
      CHECK (supplier_doc IS JSON (STRICT WITH UNIQUE KEYS))

The following insert will fail (because one key is not enclosed in quotes), but succeed without the “STRICT” constraint:

INSERT INTO supplier
VALUES (125,
'{
  supplierId: 125,
  "supplierName": "FastSupplier"}');

With the various constraints and their combinations it is possible to restrict the flavor of JSON that is stored in the database. From an architecture perspective this means that the database can be the central point of enforcing conformity.

Since JSON data is stored in colums it is possible to create several columns in a table that contain JSON data. Here is an example of a table with two columns:

CREATE TABLE supplier
( id number NOT NULL
    CONSTRAINT supplier_pk PRIMARY KEY,
  supplier_doc CLOB
    CONSTRAINT supplier_doc_ensure_json 
      CHECK (supplier_doc IS JSON (STRICT WITH UNIQUE KEYS)),
  history_doc CLOB
    CONSTRAINT history_doc_ensure_json 
      CHECK (history_doc IS JSON (STRICT WITH UNIQUE KEYS)));

From a data modeling perspective this opens up a whole new dimension. A brief discussion follows later in this blog post.

Inserting JSON into a Table

Inserting JSON data into a table that has at least one JSON column is rather straight forward, as the previous examples have shown.

The insert statement must have a valid JSON object or JSON array in the position of the JSON column(s), or SQL NULL as the value of a JSON column can be unknown.

Querying JSON with SQL

Oracle Database 12c supports a number of functions to query and to manipulate JSON data. In the following only a first impression is given and the full set will be discussed in additional separate blog posts.

The most basic query was already introduced:

SELECT * FROM supplier;

Selecting scalars from JSON:

SELECT s.supplier_doc.supplierName FROM supplier s;

The select clause is what one would expect: the table name followed by the column name followed by the key name using dot notation. The return value is a table with a single column containing strings.

This query selects scalar value across different columns:

SELECT s.id, s.supplier_doc.supplierName FROM supplier s;

In order to be able to construct more interesting queries, a more complex JSON object is used (see the end of this blog post for its details – it is not included here as it is quite large).

Querying a JSON object looks like this:

SELECT s.supplier_doc.businessAddress FROM supplier s;

This returns a table with one column containing a string representing a JSON object. For those objects that do not have a ‘businessAddress’, a SQL null is returned.

Querying a JSON array is done like this:

select s.supplier_doc.shippers from supplier s;

Like previously, the dot-notation path leads to the JSON member that contains an array as value.

The queries included so far give a first impression of how to query JSON data in Oracle 12c. Upcoming blog posts will provide much more details on the query capabilities, amongst other discussions.

Table Design

With the introduction of JSON it is now possible to have a data model design that combines relational and JSON data modeling instead of having just a relational data model or just a JSON data model. A single table supports the combination of relational and JSON data types.

Some important design questions are:

  • Should top level scalar data items be separate columns, or top-level keys in a JSON object, or both?
    • In the above examples, a primary key column ‘id’ was designed, as well as a key ‘supplierId’ in the JSON object. The ‘id’ column enforces a primary key and supports access to the supplier identifier without accessing the JSON document. The JSON document, however, contains the supplier identifier to be self-contained.
  • Should all data be in one JSON document or is it more appropriate to have separate JSON objects in different columns?
    • In one of the above table creation statements two JSON columns where defined, one containing the supplier data, and another one containing history about the supplier. This supports the separation of data that should not be combined, e.g., for security or privacy reasons. An internal supplier history is not part of the operational supplier data and should be kept separate.

More design principles and best practices will emerge over time in this context in addition to the ones mentioned here so far.

Documentation

The documentation can be found here: [http://docs.oracle.com/database/121/ADXDB/json.htm#ADXDB6246]

Installation Notes

The Oracle Database 12c Release 1  (12.1.0.2.0) is available on Windows as well as Linux and Solaris at the time of this blog post [http://www.oracle.com/technetwork/database/enterprise-edition/downloads/index.html].

In case you want to run the Linux version on Windows, install Virtual Box (version 4.3.15 build 95923 (!) [https://forums.virtualbox.org/viewtopic.php?f=6&t=62615]) and then create a virtual machine with Oracle Linux 6. Once the VM is up and running, install the Oracle Database 12c Release 1 (12.1.0.2.0) in the VM and you are ready to go.

Example Large JSON Object

INSERT INTO supplier
VALUES (123,
'{
  "supplierId": 123",
  "supplierName": "FastSupplier",
  "rating": 5,
  "shippers": [{
    "shipperName": "TopSpeed",
    "address": {
      "street": "Sunrise",
      "streeNumber": 17,
      "city": "Sun City",
      "state": "CA",
      "zip": 12347
      }
    },
    {
    "shipperName": "PerfectPack",
    "address": {
      "street": "Solid Road",
      "streeNumber": 1771,
      "city": "Granite City",
      "state": "CA",
      "zip": 12348
      }
    },
    {
    "shipperName": "EconomicWay",
    "address": {
    "street": "Narrow Bridge",
    "streeNumber": 1999,
    "city": "Central City",
    "state": "CA",
    "zip": 12345
    }
  }],
  "businessAddress": {
    "street": "Main Street",
    "streeNumber": 25,
    "city": "Central City",
    "state": "CA",
    "zip": 12345
  },
  "warehouseAddress": {
    "street": "Industrial Street",
    "streeNumber": 2552, 
    "city": "Central City",
    "state": "CA",
    "zip": 12346
    }
  }');

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.