Document-oriented NoSQL Databases: Oracle NoSQL’s Shift in March 2014

I believe Oracle NoSQL expanded from a key/value store into a document-oriented NoSQL database in March 2014. If you concur, then this was a major shift in the document-oriented segment of the NoSQL arena.

Document-Oriented NoSQL Databases

In a narrow interpretation, document-oriented NoSQL databases use JSON or BSON as the data representation at the query interface. In a wider interpretation, any scalar data type and any composite data type (like maps or arrays) is available at the query interface, without those necessarily complying with the JSON encoding or interpretation.

Some of the well-known document-oriented databases are MongoDB, CouchDB and RethinkDB, amongst many others. The site http://db-engines.com/en/ranking/document+store has a categorization and ranking readily available.

In addition to storing documents (that is, complex data structures with scalar, array and map data types), another important criterion is the ability to query based on document content, e.g., by selecting documents where specific properties have specific values or, in the general case, satisfy complex predicates.
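To make content-based querying concrete, here is a minimal sketch in Python. It does not use any real database API: the in-memory list `documents` and the helpers `get_path` and `find` are hypothetical names introduced only for illustration, and the sample data is made up.

```python
# Hypothetical in-memory "collection" of documents (nested dicts and lists).
documents = [
    {"_id": 1, "name": "Ada", "address": {"city": "London"}, "tags": ["math", "engines"]},
    {"_id": 2, "name": "Grace", "address": {"city": "New York"}, "tags": ["compilers"]},
    {"_id": 3, "name": "Alan", "address": {"city": "London"}, "tags": ["logic", "math"]},
]


def get_path(doc, path):
    """Follow a dotted path such as 'address.city'; return None if it is absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(docs, predicate):
    """Return every document for which the predicate holds (a full scan)."""
    return [doc for doc in docs if predicate(doc)]


# Simple case: a specific property has a specific value.
in_london = find(documents, lambda d: get_path(d, "address.city") == "London")

# General case: a predicate combining a nested property and an array member.
london_logicians = find(
    documents,
    lambda d: get_path(d, "address.city") == "London" and "logic" in d["tags"],
)
```

The point of the sketch is only that the predicate looks inside the document; a real document store would evaluate such predicates on the server, typically with the help of indexes rather than a full scan.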

Key/Value Stores

Key/value stores (k/v stores) are different from document-oriented databases. Key/value stores persist a value for a given key. Retrieval is based on the key, and values cannot be used in query predicates. Some k/v stores have structured keys or composite keys that support querying a range or set of values.
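As a rough illustration of key-only retrieval with composite keys, consider the following Python sketch. The class `KeyValueStore` and its methods `put`, `get` and `multi_get` are hypothetical and are not the API of any particular product; keys are modeled as tuples of strings and values as opaque bytes.

```python
import bisect


class KeyValueStore:
    """Toy k/v store: values are opaque, all lookups go through the key."""

    def __init__(self):
        self._keys = []    # sorted list of composite keys (tuples of strings)
        self._values = {}  # composite key -> opaque value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        """Single-key lookup; returns None if the key is unknown."""
        return self._values.get(key)

    def multi_get(self, prefix):
        """Return all (key, value) pairs whose key starts with the given prefix."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break  # keys are sorted, so no later key can match
            result.append((key, self._values[key]))
        return result


store = KeyValueStore()
store.put(("users", "ada", "email"), b"ada@example.org")
store.put(("users", "ada", "phone"), b"555-0100")
store.put(("users", "grace", "email"), b"grace@example.org")

ada_entries = store.multi_get(("users", "ada"))
```

Note that nothing in this sketch can answer "which user has the phone number 555-0100" without reading every value; that is exactly the limitation described above.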

Oracle NoSQL, for example, supports structured and composite keys (http://docs.oracle.com/cd/NOSQL/html/GettingStartedGuide/singleget.html, http://docs.oracle.com/cd/NOSQL/html/GettingStartedGuide/multiget.html).

Aerospike is another example of a k/v store (https://docs.aerospike.com/display/V3/Key-Value+Store+Guide).

K/V Stores: Major Shift

Oracle NoSQL extended its data model from a pure k/v data model to a document-oriented data model in March 2014.

The available simple data types are (cited from http://docs.oracle.com/cd/NOSQL/html/GettingStartedGuideTables/tablesapi.html):

  • Double
  • Float
  • Integer
  • Long
  • Java byte array
  • Java String

And the available complex data types are (cited from http://docs.oracle.com/cd/NOSQL/html/GettingStartedGuideTables/tablesapi.html):

  • Array (An array of values, all of the same type)
  • Enum (An enumeration, represented as an array of strings)
  • Fixed Binary (A fixed-sized binary type used to handle binary data where each record is the same size. It uses less storage than an unrestricted binary field, which requires the length to be stored with the data)
  • Map (An unordered map type where all entries are constrained by a single type)
  • Records
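To show how these simple and complex types can combine into one document structure, here is a sketch that writes a record definition in the Avro schema notation (Avro is the schema format this article mentions for Oracle NoSQL further below). The schema is built as a Python dictionary and serialized to JSON. All record and field names are hypothetical, and whether Oracle NoSQL accepts exactly this schema is an assumption, not something taken from its documentation.

```python
import json

# Hypothetical record definition in Avro schema notation.
user_schema = {
    "type": "record",
    "name": "User",
    "fields": [
        # Simple types
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
        {"name": "visits", "type": "long"},
        {"name": "rating", "type": "float"},
        {"name": "balance", "type": "double"},
        {"name": "avatar", "type": "bytes"},
        # Complex types
        {"name": "status",
         "type": {"type": "enum", "name": "Status", "symbols": ["ACTIVE", "SUSPENDED"]}},
        {"name": "checksum",
         "type": {"type": "fixed", "name": "Md5", "size": 16}},
        {"name": "phones",
         "type": {"type": "array", "items": "string"}},
        {"name": "scores",
         "type": {"type": "map", "values": "long"}},
        {"name": "address",
         "type": {"type": "record", "name": "Address",
                  "fields": [{"name": "city", "type": "string"}]}},
    ],
}

schema_json = json.dumps(user_schema, indent=2)
field_names = [field["name"] for field in json.loads(schema_json)["fields"]]
```

The array and map entries illustrate the constraint from the list above: all elements of an array, and all values of a map, share a single type.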

In addition, indexes can be created on the ‘value’ part, aka, the documents: http://docs.oracle.com/cd/NOSQL/html/GettingStartedGuideTables/createindex.html. And based on indexes, the database can be queried without having to specify a key: http://docs.oracle.com/cd/NOSQL/html/GettingStartedGuideTables/indexread.html.

The latter functionality extends Oracle NoSQL from a k/v store into a document-oriented NoSQL database. Clearly, secondary indexes are not yet a very expressive query language; however, they are a major first step towards one.
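The idea of a secondary index on the value part can be sketched in a few lines of Python. This is a conceptual illustration only: the class `IndexedStore` and its methods are hypothetical and do not reflect how Oracle NoSQL implements or exposes its indexes.

```python
from collections import defaultdict


class IndexedStore:
    """Toy store with a primary key plus secondary indexes on document fields."""

    def __init__(self):
        self._rows = {}     # primary key -> document (a dict)
        self._indexes = {}  # field name -> {field value -> set of primary keys}

    def create_index(self, field):
        """Build an index over one field, covering the rows stored so far."""
        index = defaultdict(set)
        for key, doc in self._rows.items():
            if field in doc:
                index[doc[field]].add(key)
        self._indexes[field] = index

    def put(self, key, doc):
        """Store a document and keep every existing index up to date."""
        old = self._rows.get(key)
        for field, index in self._indexes.items():
            if old is not None and field in old:
                index[old[field]].discard(key)
            if field in doc:
                index[doc[field]].add(key)
        self._rows[key] = doc

    def get(self, key):
        """Classic k/v access by primary key."""
        return self._rows.get(key)

    def find_by_index(self, field, value):
        """Document-style access: no primary key needed, only an indexed value."""
        index = self._indexes[field]
        return [self._rows[k] for k in sorted(index.get(value, ()))]


store = IndexedStore()
store.put(1, {"name": "Ada", "city": "London"})
store.put(2, {"name": "Grace", "city": "New York"})
store.create_index("city")
store.put(3, {"name": "Alan", "city": "London"})

londoners = [doc["name"] for doc in store.find_by_index("city", "London")]
```

The sketch also shows why an index alone is not a query language: `find_by_index` supports equality on one indexed field, whereas arbitrary predicates over document content would need more machinery.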

What does it mean?

The interesting aspect is that Oracle, as a major database vendor, moved with a separate and dedicated NoSQL product first into the k/v store space, and now additionally into the document-oriented NoSQL space. There are many ways to read this move, of course, but from a technical viewpoint it emphasizes the importance of a structured data model in the NoSQL space. In addition, Oracle NoSQL allows a per-document Avro schema, making it a database with a truly per-document schema.

Aerospike, for example, also supports complex data types and secondary indexes, even though it is also widely seen as a k/v store (e.g., http://db-engines.com/en/ranking/key-value+store).

Summary

Clearly, the document-oriented segment of NoSQL databases is getting more and more crowded and is attracting the attention of major players in the database space.

What hasn’t been done yet is a comparison of the data model and query interface expressiveness of the major players in the document-oriented NoSQL database segment. If such a comparison reveals that the expressiveness is the same or almost the same, then the systems are interchangeable from a functional viewpoint and will distinguish themselves ‘only’ in terms of non-functional properties like latency, throughput, transactions, cost, support, and so on.
