Trending: Multi-Interface and Multi-Data-Model Databases

An interesting development, especially in the NoSQL database space, is the development towards multi-interface and multi-data-model databases, and sometimes both at the same time. While it provides flexibility, it also brings challenges.

Multi-Data-Model Support

In the relational database space, supporting different data models concurrently is not a novelty. Relational databases started off with the relational data model implementation, and later on some of the systems extended the relational model mainly by objects, XML or JSON.

Some databases in the NoSQL space are starting to evolve, too, in this manner and are providing more than one data model concurrently. Some interesting examples are discussed next.

One example in the NoSQL space is Oracle NoSQL [http://www.oracle.com/technetwork/database/database-technologies/nosqldb/]. This system supports a key/value data model whereby the key is used to identify values that are not interpreted by the database itself. In addition, values can be of complex types that are actually interpreted by the database, e.g., in secondary indexes.

Aerospike is another example in the NoSQL space [http://www.aerospike.com/]. Aerospike provides a data model consisting of basic and complex types. In addition, it supports language-native formats as well as large data types that have a specific operational characteristics and data type operations tuned for scale.

Like some relational databases extended data models over time to support specific use cases in a more direct way, some NoSQL databases are also going down that path to more directly support specific application developer needs.

Multi-Interface Support

From an application development perspective a single query API is certainly preferable that provides the complete query expressiveness required. However, especially in the new area of NoSQL databases, it is not clear yet what a good query API actually looks like. This can be observed by different systems providing different query API alternatives.

MongoDB [http://www.mongodb.org/] has a document query interface based on query patterns in form of JSON documents (“Query Documents”). In addition, it provides a map/reduce interface and aggregation pipelines. There are three different APIs that an application developer can choose, and, in addition, they overlap in their functionality. This means that, depending on the query, it can be expressed in all three of them.

Aerospike [http://www.aerospike.com/] provides different language drivers in addition to an Aerospike Query Language.

Cloudant [http://www.cloudant.com], in contrast, supports a REST-api as well as a query interface based on query documents (similar to MongoDB).

Not strictly an external interface, but very important for specific use cases, is the ability to add functionality dynamically to the database in order to move some processing from the application systems into the database itself: user defined functions. For example, MongoDB allows adding functionality through JavaScript functions, whereas Aerospike supports two different types of Lua functions: record user defined functions operate on single records, whereby stream user defined functions support distributed processing.

The Good

Unquestionably, the good part about multi-interface and multi-data-model databases is that an application developer can choose the best combination of data model and access interface for a particular development task. The impedance mismatch between the problem and the solution can be minimized with an appropriate choice.

This also means that developers need to understand the pros and cons of every combination and that requires to go through a learning curve. Going through that learning curve might pay off big time.

In addition, application development teams will have to manage a wider range of implementation variations in terms of application design and engineering, but also in terms of bug fixing and application code maintenance.

The Tricky

The tricky part of multi-interface and multi-data-model databases is that all combinations can be used concurrently, in production as well as post-production (e.g. analytics). Unit and functional tests as well as performance and scale tests become a lot more complicated as they have to test the concurrent execution of various combinations.

Furthermore, many queries can be expressed in different interfaces as those tend to overlap in query expressiveness. So an application developer needs to clearly understand the pros and cons of the query execution that underlies a specific query interface.

Hopefully the query semantics is the same for all combinations (meaning, for example, that predicates are evaluated the same way) and that concurrent use of data models and interfaces does not negatively impact the various clients in terms of concurrency, scale and performance. Any bug introduced through a discrepancy might be very difficult to reproduce and fix.

Summary

While multi-interface and multi-data-model databases are a powerful technology, there is considerable impact to the application system engineering activities in terms of knowledge acquisition, development, test and maintenance.

While database vendors certainly strive to have all combinations work in harmony, there might be edge cases where one combination does not give the same result as a different one. From an application development perspective test coverage should ensure semantic equivalence of the used combinations so that misinterpretations or wrong results are avoided.

Advertisements