There is an almost infamous bug in context of MongoDB, called SERVER-831. This blog puts SERVER-831 in context of JSON SQL.
Upfront
In order to not loose the context, the bug and one of the use cases were captured as PDFs and are attached here:
- https://jira.mongodb.org/browse/SERVER-831 (reference to PDF: server-831-positional-operator-matching-nested-arrays)
- https://gist.github.com/durran/342738 (reference to PDF: embedded_update)
Anatomy of SERVER-831
Server 831 has two main aspects: a selection combined with an update. In the following only the selection is discussed; the update part follows down the road when update in context of JSON SQL will be introduced.
Basically, the selection selects based on the value of array elements without knowing the precise array index where the values are located. The array index is not part of the selection criteria, only the content of array elements.
Data Set
These are the documents in the collection server831 used for illustration in this blog:
{"a":[5,4,3,2,1]} {"a":[5,10,15,20,25]} {"a":[1,2,3,4,5]} {"a":[{"_id":7},{"_id":8}]} {"a":[{"_id":8},{"_id":7}]} {"a":[null,[0,0,7],null]} {"a":[true,false],"b":true} {"a":[true,false],"b":false} {"a":[true,[null],false]}
Predicating Array Elements by Array Index
So far it was possible to create a selection based on the array index. For example, the following query returns all documents that contain an array “a” where the value at index 0 is 5.
select {*} from server831 where a.[0] = 5
returns
{"a":[5,4,3,2,1]} {"a":[5,10,15,20,25]}
Predicating Array Elements by Content
Sometimes there is the situation where the array index is unknown and the client wants to retrieve all documents where at least one array element has a specific value. Referring to every array element is denoted as [*] in the JSON SQL syntax (instead of referring to a specific array index).
For example, the following query selects all documents where at least one of the array elements of “a” has the value 5.
select {*} from server831 where a.[*] = 5
results in
{"a":[5,4,3,2,1]} {"a":[5,10,15,20,25]} {"a":[1,2,3,4,5]}
More interesting queries are the following examples that reach into the value(s) of array elements:
select {*} from server831 where a.[*]._id = 7
returns
{"a":[{"_id":7},{"_id":8}]} {"a":[{"_id":8},{"_id":7}]}
Not only scalar literals can be used, but non-scalar literals as well.
select {*} from server831 where a.[*] = [0,0,7]
returns
{"a":[null,[0,0,7],null]}
Selection does not have to be based on literals:
select {*} from server831 where a.[*] = b
returns
{"a":[true,false],"b":true} {"a":[true,false],"b":false}
The latter allows comparison of values inside documents.
Any number of levels can be referenced, for example
select {*} from server831 where a.[*].[*] = null
returns
{"a":[true,[null],false]}
SERVER-831 Use Case
The document provided in one of the use cases in SERVER-831 is as follows (with cleaned up syntax errors):
{ "_id": 1, "name": "Dave Gahan", "medications": [ { "_id": 23, "name": "Dilaudid", "type": "Rx", "prescriptions": [ { "_id": 13, "quantity": 60, "started": "2009-01-01" }, { "_id": 77, "quantity": 45, "started": "2009-02-01" } ] }, { "_id": 41, "name": "Oxycodone", "type": "Rx" } ]}
The corresponding query in form of JSON SQL is (assuming the data is stored in collection uc831):
select {*} from uc831 where _id = 1 and medications.[*]._id = 23 and medications.[*].prescriptions.[*]._id = 77
Summary
Selection based on array element values without the knowledge of the array index is an extremely powerful query feature of JSON SQL and is most likely useful in many different use cases.
The syntax of JSON SQL had to be modified only minimally without any restriction or loss of query expressiveness.
Go [ JSON | Relational ] SQL!
Disclaimer
The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.