Relational Database Management System (RDBMS)

Why a blog on Relational Database Management System (RDBMS)? Isn’t there enough said about this type of database? Yes, there is. This post is to call out a few aspects of the relational model that I want to refer to in context of RDBMSs for subsequent discussions. In addition, it re-establishes terminology.

Normalization

Briefly, normalization attempts to remove data duplication and data dependencies so that an update of a data item has to be done only once for the whole data set to be consistent again (http://en.wikipedia.org/wiki/Database_normalization).

Universal Relation

The Universal Relation is mostly an assumption that states: it is possible to store all data in one big, wide table. For a discussion, see Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems, Volume II. Computer Science Press 1989, ISBN 0-7167-8162-X.

In a way, the universal relation is the ‘opposite’ of a perfectly normalized relational schema. It also assumes that the semantically same attributes are named the same way in different relations. As a consequence, many columns might have null values. But for the thought experiment, this is perfectly valid.

Part-Of Relationship

The part-of relationship consists between an owning object and an owned one whereby the owned object’s life cycle is tightly bound to the owning one: if the owning object is deleted, the owned one will be deleted, too. The owned object exists in context of the owning object and does not have a life on its own.

In the relational model there is a choice how this can be modeled. In principle, the owned objects can be in their own relation, separate from the owning objects in their own relation. In addition, there is a foreign key relationship from the owning to the owned relationship.

Alternatively, the owned objects can be in the same relation as the owning objects. Both would be in the same row. This is basically the case when the owned object is a scalar domain value. In this case it is almost automatically in the same relation as the owning data.

De-Normalization

De-normalization is about combining data that would otherwise be in separate relations when being fully normalized (http://en.wikipedia.org/wiki/Denormalization). This introduces redundancy and dependencies in the model for the sake of access improvement and access speed.

In a sense, the universal relation is the ultimate form of de-normalization.

Schema

In the relational database management system world the data are stored as rows in tables. A table provides a fixed structure whereby every column defines one and only one data type. Any row stored in a table, therefore, must comply to the types of the columns. The only exception is ‘null’ since ‘null’ can be of any data type. The significance is that tables are well-defined in terms of data types and rows have to comply to the table definition. This makes it a strictly typed system: the set of all table definitions constitute the schema.

Schema changes are possible in most RDBMS implementations. It is possible to add, to change or to delete columns. One consequence is that all rows in the table have to change so that their values comply to the new column definitions. Adding and removing tables is possible, too. The ‘hard’ part about changing a table is that all rows have to be changed accordingly; it is not possible to only have the change apply to future rows or when rows are updated.

Summary

This blog recalled a few important concepts from the relational model and relational database management systems. These will be referred to and used in future blogs.

Advertisement

“A Manager Decided …”

The term ‘A Manager Said’, or ‘A Manager Asked’, or ‘A Manager Decided’ is going to stand for decisions that are being made without knowing the rationale behind it. It is going to be used in the blogs of this site in order to indicate ‘facts’ that conceptual engineering and software engineering has to deal with – if these teams like it or not.

While it has a Dilbert flavor to it, sometimes decisions are made for reasons beyond the control (or understanding :-)) of the engineering team. So, the approach taken in the blogs is: so be it, let’s move forward on this basis.

Document-oriented Database (NoSQL Database Category)

‘Document-oriented database’ refers to databases that store their data in terms of documents. More specifically, structured documents and, by going with the times,  documents in JSON (http://www.json.org/) format at that.

Some examples are MongoDB (http://www.mongodb.org/) or RavenDB (http://ravendb.net/).

These databases belong to a category of databases called ‘NoSQL Databases’ (http://en.wikipedia.org/wiki/NoSQL). This category emerged over the last few years and in this blog site is used in that sense.

Databases that do not prescribe the content format or enforce a specific data model can store JSON by definition; so database systems that did not decide on a particular data representation upfront but left it open, can be considered document-oriented databases in case their payloads are documents.