As outlined in Part 1 of this series (https://realprogrammer.wordpress.com/2013/11/02/schema-free-database-part-1-an-oxymoron/), a ‘schema-free database’ is an oxymoron and in fact the notion of schema is changing from a more restrictive to a more flexible interpretation in context of NoSQL database technology.
So it is only consequential to ask the question the other way around (as a thought experiment): is it possible to build a relational database management system that does not enforce a schema, and if so, how would such a system look like on an abstract level?
Yes, it is possible to have a non-schema-enforcing RDBMS. Let’s discuss two variations next.
Definition of No-Schema-Enforcing Relational Database Management System
What functionality would be altered in order to provide a no-schema-enforcing RDBMS? If it were possible to create a table without specifying columns (aka, only a table name), and then to insert, update and delete rows, then a ‘schema-free’ RDBMS would be in place. This would mean in detail:
- Rows do not have to comply to a schema when inserted into a table. Different rows in the same table could have different attributes (columns) and the same attributes of different rows could have different domains (flexible type system).
- By defining a table without specifying columns (names and domains), a table would not define a primary key, either (neither a simple, nor a composite key). Applications inserting or updating rows can behave nicely and add properties with values that comply to the primary key semantics, but the RDBMS would not be aware of it and consequently would not enforce primary key compliance.
- By the same token, foreign keys would not be enforced by the RDBMS for the same reasons.
- Since no primary key enforcement is in place, duplicate rows will not be prevented by the RDBMS and any supervision is left to the application systems.
- Indexes are independent of schema specification and assuming that indexes are maintained on tables, not all rows might be present in an index if the attributes defined by the index are not contained in a row.
Surprisingly (or not), defining a no-schema-enforcing RDBMS is pretty straight forward.
Variation on No-Schema-Enforcing RDBMS
An interesting variation of a no-schema-enforcing RDBMS could be that a schema, primary keys, foreign keys, etc., are specified as usual, however, without being actively enforced; instead, warnings are given by the RDBMS. For example, a row not complying to the schema can actually be inserted, but the result would not be a ‘OK’, but a warning indicating a schema violation.
This can be described as a ‘middle ground’ in widening the schema interpretation where the RDBMS is aware of a schema and warns of violations without rejecting the various DML operations.
Characterization of No-Schema-Enforcing RDBMS
Could a no-schema-enforcing RDBMS (any of the variations) be a useful database management system? Yes, as it would be the equivalent (on the relational model) to NoSQL databases (on JSON/BSON model or key value model).
For use cases where the flexible schema interpretation is key, such a no-schema-enforcing RDBMS could fit the bill (possibly better) than a NoSQL database system if the use case is fundamentally relational in nature (as opposed to e.g. hierarchical or key/value) and if SQL as the query language is important.
There are additional areas in a RDBMS that will have to change their behavior in a no-schema-enforcing implementation. Only briefly (and not exhaustively), these are
- Triggers. Triggers are specified on tables and state changes of rows. If particular attributes are referenced inside the trigger, then not every update, insert, read or delete will execute the trigger logic.
- Stored procedures. Stored procedures often have parameters of specific types and assume a specific set of attributes when processing rows. In a no-schema-enforcing situation the stored procedure has to be able to deal with variations of rows.
- Functions and function extensions. Functions have to be changed similarly to stored procedures. Not only from the viewpoint of parameters, but also the processing logic.
- Aggregation. Aggregation will have to change in various ways as the various aggregation functions cannot assume that all attributes are of the same type. Neither can they assume that all attributes are actually present in all rows of a table.
In principle, every concept and every implementation aspect of a RDBMS needs to be re-examined wrt. a wider and more flexible interpretation of ‘schema’. NoSQL systems, by their definition and approach, started with a wider interpretation and consequently made all the conceptual and implementation decisions. They are one source of approach in this regard.
If you plan to explore or to build a no-schema-enforcing RDBMS, please contact me.