MongoDB introduced support for JSON Schema through $jsonSchema. Let’s explore this new functionality a bit in this blog.
$jsonSchema
The functionality is introduced here: https://docs.mongodb.com/master/reference/operator/query/jsonSchema/#op._S_jsonSchema It states “$jsonSchema can be used in a document validator, which enforces that inserted or updated documents are valid against the schema.”
A first item to note is that this approach supports BSON types (http://bsonspec.org/, https://docs.mongodb.com/master/reference/operator/query/type/), not just JSON structures (https://www.json.org/), using a specific property “bsonType” that is not part of the JSON Schema standard (http://json-schema.org/).
A second observation is that the schema is specified inline as part of the collection creation command. The server does not accept a reference to a separate JSON Schema file, and the “$ref” keyword of JSON Schema is not supported by $jsonSchema.
JSON Schema Validator Example
Let’s use the example of the first blog in this series, create a schema for it and use that as a constraint for the “orders” collection. Then documents are added to the collection (with a few errors along the way). For reference, the version used is MongoDB server version 3.6.0.
> mongo
> use schema_exploration
> db.createCollection("orders", { "validator": { "$jsonSchema": { "bsonType": "object", "required": ["orderId", "orderDate", "orderLineItems"], "properties": { "orderId": { "bsonType": "int", "description": "Order Identifier: must be of type int and is required" }, "orderDate": { "bsonType": "date", "description": "Order Date: must be of type date and is required" }, "orderLineItems": { "bsonType": "array", "items": { "bsonType": "string" }, "description": "Order Line Items: must be of type array and is required" } } } } })
{ "ok" : 1 }
A quick note: “bsonType” can be used at all levels of the schema to refer to BSON types, not just at the top level.
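Although the server expects the schema inline, the mongo shell is a JavaScript environment, so the schema can be held in a variable and reused across commands. The following is a minimal sketch of that idea; the variable names orderSchema and createOptions are made up for illustration, and the descriptions are left out for brevity.

```javascript
// Schema kept in a variable (name chosen for illustration only).
var orderSchema = {
  bsonType: "object",
  required: ["orderId", "orderDate", "orderLineItems"],
  properties: {
    orderId: { bsonType: "int" },
    orderDate: { bsonType: "date" },
    orderLineItems: { bsonType: "array", items: { bsonType: "string" } }
  }
};

// Options document for createCollection. The schema is still sent inline
// to the server; it is only assembled on the client side.
var createOptions = { validator: { $jsonSchema: orderSchema } };

// In the mongo shell, with a database selected:
// db.createCollection("orders", createOptions)
```

Keeping the schema in one place avoids repeating it when the validator is changed later on.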
> db.orders.insert({ "orderId": NumberInt(1), "orderDate": new Date("2017-09-30"), "orderLineItems": [{ "itemId": 55, "numberOrdered": 30 }, { "itemId": 56, "numberOrdered": 31 }] })
WriteResult({ "nInserted": 0, "writeError": { "code": 121, "errmsg": "Document failed validation" } })
Along the way I ran into a validation issue because I constrained the array elements to strings rather than objects, as used in the example of the first blog in this series. So I made a schema definition mistake.
Note that the response in the shell does not indicate what the problem was, which makes debugging hard, especially when large and complex schemas are involved.
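Since the error message names no field, one way to narrow down such a failure is to test the properties one at a time. The sketch below is a hypothetical helper (splitSchemaByProperty is not a MongoDB function) that breaks a schema into one small schema per property. It is plain JavaScript and should run in the mongo shell as well; the scratch collection mentioned in the comments is an assumption.

```javascript
// Hypothetical helper: break a schema into one small schema per property,
// so that each property can be checked on its own.
function splitSchemaByProperty(schema) {
  var required = schema.required || [];
  return Object.keys(schema.properties || {}).map(function (name) {
    var single = { bsonType: schema.bsonType, properties: {} };
    single.properties[name] = schema.properties[name];
    // Carry over the "required" constraint only for this property.
    if (required.indexOf(name) !== -1) {
      single.required = [name];
    }
    return { property: name, schema: single };
  });
}

// Possible use in the mongo shell (assumes the rejected document was
// inserted into a collection "scratch" that has no validator):
// splitSchemaByProperty(orderSchema).forEach(function (part) {
//   var failing = db.scratch.find({ $nor: [{ $jsonSchema: part.schema }] }).count();
//   print(part.property + ": " + failing + " failing document(s)");
// });
```

A property that reports failing documents is the one to inspect more closely; for nested schemas the same splitting could be repeated one level down.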
> db.orders.insert({ "orderId": NumberInt(1), "orderDate": new Date("2017-09-30"), "orderLineItems": ["a", "b"] })
WriteResult({ "nInserted": 1 })
Once I realized the mistake I had made, I inserted a document complying with the (still incorrect) schema in order to make sure I had identified the issue correctly.
JSON Schema Validator Update
Obviously, after defining a wrong schema, the correct schema should be used as validator.
This is the correct schema:
{ "bsonType": "object", "required": ["orderId", "orderDate", "orderLineItems"], "properties": { "orderId": { "bsonType": "int", "description": "Order Identifier: must be of type int and is required" }, "orderDate": { "bsonType": "date", "description": "Order Date: must be of type date and is required" }, "orderLineItems": { "bsonType": "array", "items": { "bsonType": "object", "properties": { "itemId": { "bsonType": "int" }, "numberOrdered": { "bsonType": "int" } } }, "description": "Order Line Items: must be of type array and is required" } } }
And this is the command to update the validator:
> db.runCommand({ "collMod": "orders", "validator": { "$jsonSchema": { "bsonType": "object", "required": ["orderId", "orderDate", "orderLineItems"], "properties": { "orderId": { "bsonType": "int", "description": "Order Identifier: must be of type int and is required" }, "orderDate": { "bsonType": "date", "description": "Order Date: must be of type date and is required" }, "orderLineItems": { "bsonType": "array", "items": { "bsonType": "object", "properties": { "itemId": { "bsonType": "int" }, "numberOrdered": { "bsonType": "int" } } }, "description": "Order Line Items: must be of type array and is required" } } } }, "validationLevel": "strict" })
{ "ok" : 1 }
Some background on the command used is here: https://docs.mongodb.com/master/reference/command/collMod/.
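Because the schema appears twice above (once on its own and once inside the command), it may be convenient to assemble the command from a schema object. A minimal sketch, assuming a made-up helper name buildCollModCommand and a variable correctedSchema that holds the corrected schema:

```javascript
// Hypothetical helper: assemble the collMod command from a schema object.
function buildCollModCommand(collectionName, schema) {
  return {
    collMod: collectionName,            // the command name has to be the first key
    validator: { $jsonSchema: schema },
    validationLevel: "strict",          // check all inserts and updates
    validationAction: "error"           // reject documents that fail validation
  };
}

// In the mongo shell, with correctedSchema holding the corrected schema:
// db.runCommand(buildCollModCommand("orders", correctedSchema))
```

The values “strict” and “error” are spelled out here only to make the settings visible.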
Following is an attempt to add one more of the (now mismatching) documents:
> db.orders.insert({ "orderId": NumberInt(1), "orderDate": new Date("2017-09-30"), "orderLineItems": ["a", "b"] })
WriteResult({ "nInserted": 0, "writeError": { "code": 121, "errmsg": "Document failed validation" } })
As it should be, the insert fails.
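And here is the insert of a now correct document. Note that the numbers are wrapped in NumberInt, since the shell treats plain numbers as doubles, which would not match the “int” type in the schema: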
> db.orders.insert({ "orderId": NumberInt(1), "orderDate": new Date("2017-09-30"), "orderLineItems": [{ "itemId": NumberInt(55), "numberOrdered": NumberInt(20) }, { "itemId": NumberInt(56), "numberOrdered": NumberInt(21) }] });
WriteResult({ "nInserted": 1 })
Collection Inconsistency: Mismatch of Schema and Documents
There is an interesting issue appearing at this point. The new schema does not match all existing documents in the collection. Or the other way around: the collection now contains documents that do not match that schema.
> db.orders.find()
{ "_id": ObjectId("5a2022c3fb460d15db9ec73e"), "orderId": 1, "orderDate": ISODate("2017-09-30T00:00:00Z"), "orderLineItems": ["a", "b"] }
{ "_id": ObjectId("5a202322fb460d15db9ec741"), "orderId": 1, "orderDate": ISODate("2017-09-30T00:00:00Z"), "orderLineItems": [{ "itemId": 55, "numberOrdered": 20 }, { "itemId": 56, "numberOrdered": 21 }] }
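MongoDB did not flag that there are documents in the collection that do not match the new schema, even though the validation level “strict” was used. The validation level governs which inserts and updates are checked; it does not trigger a check of the documents already stored when the validator is changed.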
Adding a validation action with value of “error” does not change the situation, either.
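Since the server does not report existing mismatches, they can be looked up explicitly: $jsonSchema is also a query operator, and combined with $nor it selects the documents that do not satisfy a schema. A minimal sketch, with nonCompliantQuery being a made-up helper name:

```javascript
// Hypothetical helper: build a query that matches documents violating a schema.
function nonCompliantQuery(schema) {
  return { $nor: [{ $jsonSchema: schema }] };
}

// In the mongo shell, reading the current validator from the collection:
// var validator = db.getCollectionInfos({ name: "orders" })[0].options.validator;
// db.orders.find(nonCompliantQuery(validator.$jsonSchema))
```

Run against the “orders” collection from this session, such a query would be expected to list the document whose line items are strings.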
Implication to Semantics
Given that the schema of a collection can be changed at any time, and given that MongoDB does not fail the schema update based on mismatching documents already in the collection, examining the schema is insufficient to understand the structure of the documents in a collection.
So a collection with a schema does not ensure that all documents in that collection are schema compliant. It rather ensures that, from the point in time the schema was added or updated, newly written documents have to comply. Documents already in the collection are not checked retroactively (with the validation level “strict” they are validated again only when they are updated).
Summary
The notion of “schema” in the context of MongoDB is very different from the notion of “schema” in the context of relational database management systems. In MongoDB the documents in the collection do not have to comply with the schema; they only have to comply at the time of insertion or update.
There will be more exploration coming up in the next blog on this topic in order to further understand the semantics of “schema” in context of MongoDB.
Go [ JSON | Relational ] SQL!
Disclaimer
The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.