SQL for JSON Rationalization Part 1: Setting the Scene

Is it possible to technically rationalize SQL for JSON, aka, extend the SQL syntax and semantics to support JSON structures, without flame wars breaking out in the discussions? Let’s see.

Goal: Extend SQL to Support JSON Data Structures

The goal of the forthcoming blog series is to extend SQL with support for JSON data structures (aka, “JSON documents” or simply “documents”).

There is no consensus in the community that this is the way to go at all; many dispute that extending SQL is the right direction and instead propose different approaches for querying JSON documents: specialized query APIs, Object-Relational-Mapper technology, or representing queries as JSON documents themselves, to name a few alternatives.

This is fine, as everybody is free to discuss their point of view in their favorite forum – no issues from my side:-)

However, this blog series is about extending SQL with support for JSON data structures, and no further justification is necessary as this is a purely technical discussion. The attempt is made to impact SQL as little as possible (a minimalist approach) while providing full JSON support.

Approaches to Extending SQL

There are different starting points of how to extend SQL to support JSON data structures:

  • start with a non-SQL approach and extend/adjust to SQL along the way
  • start with SQL in the first place

The latter approach is taken here. Given a SQL syntax (not necessarily a specific syntax from a specific standard or a specific system implementation) the discussion will revolve around SQL syntax and SQL semantics extension.

However, the discussion will not revolve around the efficiency of SQL execution at this point, nor around the added cost of JSON data structure support, if any. This blog series is laser-focused on syntax and semantics (only).

Direction From Here

The direction from here is (a) forward and (b) along the various SQL operations, for example

  • Projection
  • Selection
  • Cartesian Product
  • Join

Organizing the discussion along SQL syntax and semantics provides the framework for the extension to support JSON data structures.

In addition, the JSON definition itself will be discussed and some missing pieces that are required in the context of SQL will be rationalized. Operations or semantics that are not defined by the JSON standard are introduced as necessary, for example

  • the starting index of an array (zero or one?)
  • the definition of normalization and equality
  • the difference between JSON null and the absence of a value (or absence of a pair)
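For a taste of why these definitions matter, a small Python sketch follows (Python’s json module stands in here for a JSON-supporting SQL engine; it is only an illustration, not a proposal):

```python
import json

# JSON distinguishes an explicit null from an absent pair: Python's json
# module maps null to None, while an absent pair is simply a missing key.
doc_with_null = json.loads('{"a": null}')
doc_without_a = json.loads('{}')

has_a_as_null = "a" in doc_with_null and doc_with_null["a"] is None
has_a_at_all = "a" in doc_without_a

# Equality also needs a definition: two textually different documents can be
# equal after normalization, since pair order in a JSON object carries no meaning.
normalized_equal = json.loads('{"x": 1, "y": 2}') == json.loads('{"y": 2, "x": 1}')
```

A SQL extension has to decide, for instance, whether a selection predicate treats an explicit null and an absent pair the same way; the sketch shows that the two cases are distinguishable.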

Finally, the JSON terminology is incomplete and missing parts will be fit in for the purpose of this blog, for example

  • to name the string and value in a JSON pair
  • to name the elements of an array

This is not meant to be a proposal for refining the current JSON definition (in or outside of a standard). JSON is taken as is even though many wish to change it; however, in order to focus the discussion, no changes to JSON will be directly proposed.

Summary

Extending SQL “properly” to support JSON is going to be fun and I am looking forward to interesting discussions along the ride.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Flashback/Look-ahead – SQL: The Kingmaker of NoSQL Databases?

Over 1 ½ years ago I suggested that SQL will turn out to be an important factor for the success of NoSQL databases (https://realprogrammer.wordpress.com/2014/03/04/sql-the-kingmaker-of-nosql-databases/). Did that happen?

NoSQL / OldSQL / NewSQL: SQL Language Support on the Rise

When surveying a subset of the various database systems it becomes apparent that support for relational SQL as well as JSON-supporting SQL is increasing. Right now this does not necessarily mean that there is agreement on the particular language syntax, its semantic interpretation, or its execution style (declarative vs. functional vs. procedural). However, there is an (implicit) agreement that a (declarative) query language is necessary or advantageous, that SQL is a guiding principle, and that JSON needs to be supported to varying extents. A (not necessarily complete) list of different systems follows that have a (JSON-supporting) SQL language implemented in one form or another:

From the above selection it becomes clear that a lot of NoSQL, OldSQL, and NewSQL systems invest in SQL as well as JSON-supporting SQL and that the number of systems that support (JSON-supporting) SQL is definitely increasing.

Given that implementing the efficient execution of SQL is a significant engineering effort (and investment), JSON-supporting SQL is obviously seen as an important feature.

As an aside, an intriguing question is: why do database systems that fall into the category of “NoSQL” databases start providing SQL? It appears that the “No” in “NoSQL” is being re-interpreted from “no SQL” to “no relational model” once JSON-supporting SQL arrives. One possible answer to the question is the increasing complexity of the queries being issued. While simple interfaces are good to get going initially, over time, as applications become more involved, complex queries reflect the increasing domain complexity, and that is going to be reflected in the query language as well (for a taste, see here: https://realprogrammer.wordpress.com/2014/10/27/oracle-12c-sql-for-json-part-2-basic-queries/ – toward the end of the blog). Another (additional) possible answer is that existing JDBC-compliant tools (e.g., reporting tools) that can process relational data should also be able to connect.

JSON-supporting SQL Language: Discussions are Ongoing

Discussions about the benefit of JSON-supporting SQL in the context of NoSQL databases come up repeatedly (e.g., LinkedIn group discussions: https://www.linkedin.com/groups/2085042 (NoSQL Group)). Interestingly enough, most of the time the discussion is not separated or structured by syntax vs. semantics vs. functionality. SQL was developed in the context of the relational model, and that does not preclude complex data structures like JSON; the opposite is true. Object/relational functionality as well as the NF2 work deal with complex structures, including arrays and maps. So it would be prudent, in my mind, to separate the discussion into

  • Data structure support (aka, JSON)
  • Query functionality
  • Query language syntax

I expect the discussion of JSON-supporting SQL to lead toward a declarative language with SQL syntax, based on the relational model, paired with complex data types (e.g., JSON) and schema-per-object or schema-per-row semantics.

The interesting aspect of the SQL syntax and interpretation is that systems that also support the relational model extend it to support JSON; these systems will extend SQL, not replace it (as can be seen already).

SQL Support for NoSQL Databases: Best Approach Forward?

In my mind the best approach forward would be standing on the shoulders of giants: take SQL and extend it to support JSON in a very careful but very expressive way, learning from the object/relational work as well as the NF2 work that took place a long time back. Stay true to the lessons about the benefits of a declarative language and the underlying possibility of optimization, be it functional or non-functional. In short, add the domain type JSON and cater for a schema-per-document model.
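One concrete, already-shipping instance of this “extend SQL, don’t replace it” approach is SQLite’s JSON function family, reachable from Python’s bundled sqlite3 module (the table and column names below are made up for the example; the sketch assumes a SQLite build with the JSON functions enabled, which is the default in recent versions):

```python
import sqlite3

# A relational table whose column holds JSON documents (schema-per-row):
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, doc TEXT)")
conn.execute("INSERT INTO orders (doc) VALUES (?)",
             ('{"customer": "jones", "total": 150}',))
conn.execute("INSERT INTO orders (doc) VALUES (?)",
             ('{"customer": "smith", "total": 80}',))

# Plain SQL, extended with a JSON path function in projection and selection:
rows = conn.execute(
    "SELECT json_extract(doc, '$.customer') FROM orders "
    "WHERE json_extract(doc, '$.total') > 100").fetchall()
# rows is [('jones',)]
```

Note that projection, selection, and the rest of the SQL machinery are untouched; only the path access into the document is new, which is exactly the minimalist direction argued for above.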

If JSON support is one of the common functional denominators of most of the databases then the distinction is not really achieved by variations of (JSON-supporting) SQL, but by variations of the non-functional support like scale, throughput, low latency, safety, reliability, etc.

I believe that JSON-supporting SQL will be converging, while the non-functional features will set the various systems further and further apart in order to provide the designed-for sweet spot and establish their competitive advantage.

Summary

Clearly, the jury is still out; however, compared to 1 ½ years ago, whispers are emerging that JSON-supporting SQL is becoming an important feature of NoSQL, OldSQL and NewSQL databases.

Clearly, the pendulum swings toward the SQL language grammar and functionality that supports JSON. And that is a good thing in my opinion for the various obvious reasons.

Clearly, discussions keep coming up about JSON-supporting SQL, the various styles and language proposals. This is another reason to believe that SQL support in NoSQL databases and in context of JSON is important.

Let’s check in on this topic again sometime next year. It is always interesting to follow such a hotly debated topic and establish checkpoints along the way.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Software Design in Context of Database Failover (Part 5): Architecture Using Non-Transactional Persistent Resource Managers

Non-transactional persistent resource managers are tricky in the context of database fail-over. A short discussion shines a light on some of the problems.

Non-Transactional Persistent Resource Managers

File systems and queuing systems are in general non-transactional persistent resource managers that are typically used outside transaction control, in many cases because the technology chosen does not support transactions in the first place (some implement transaction functionality, many do not; here I focus on those that do not). Files can be created, deleted, or have their contents manipulated; queue elements can be enqueued into or dequeued from queues (in FIFO order, priority order, or any other supported queuing policy).

Because operations like inserting, updating or removing data items from non-transactional persistent resource managers are executed outside transaction control, they (a) cannot be coordinated in the same transactions of transactional resource managers, and (b) in case of a failure, the state of the data items is unclear since the non-transactional resource manager does not provide any transactional guarantees in failure situations (see the appendix below for the taxonomy on data consistency).

For example, when a data item is enqueued in a queue and in that moment the queuing system fails (by itself or because the underlying hardware failed), then there is no defined outcome for that enqueue operation. It could be that the data item is properly enqueued and all is fine. It is possible that the data item was not enqueued at all. Worse, it might be possible that the data item was enqueued, but not all supporting data structures were fully updated inside the queuing system, leaving the system itself in an inconsistent state still requiring recovery actions.

Best Effort Consistency

From a software architecture viewpoint, non-transactional persistent resource managers are best-effort systems, and this has to be kept in mind when they are used and deployed. With the rather reliable hardware servers of today, this limiting property can easily be forgotten or pushed aside, as the mean time between failures is rather long.

However, when data consistency is critical, and non-transactional resource managers have to be used, effort can be put in place to mitigate at least some of the potential data consistency problems.

Data Consistency through Data Duplication

A fairly straightforward approach to integrating non-transactional resource managers is to store the data items twice: once in the non-transactional resource manager, and once in a transactional resource manager. Any update or delete must take place in both systems (however, as discussed, it cannot be done inside a single transaction across both).

For example, every item enqueued in a non-transactional queue could also be inserted into a relational table. If an item is dequeued, it is also removed from the transactional table. The logic could first enqueue the data item into the queue, and when successful, insert it into the table. Upon dequeue, the same applies: the item is first dequeued from the queue, and if successful, removed from the table. The invariant is that a queued data item is only considered consistent if it is enqueued in the queue and also present in the table at the same time.

Studying only the enqueue case, there are several failure cases:

  • Enqueue fails.
  • Enqueue succeeds, but system fails before the insert into the table took place.
  • Enqueue succeeds, but insert in the table fails.

In the first case, the enqueue fails and the insert into the table will never take place, so the invariant is fulfilled. In the second case the item is in the queue, but not in the table; the invariant is violated. The system can now decide how to deal with the situation after the failure: it can remove the queue item, or it can insert the item into the table. The same situation occurs in the third case.

Underlying this approach is the requirement that it is possible to check for the presence of the data item in the non-transactional as well as the transactional resource manager in order to determine if the data item is present after a failure. For example, in the second case above, after the system comes back up, the following check has to occur: (a) for every item in the queue an equivalent item must be in the table, and (b) for each item in the table there must be an item in the queue. If there is a data item in one, but not the other, resource manager, the invariant is violated and the inconsistency has to be removed by either removing or adding the data item so that none or both of the resource managers have the data item (and therefore the invariant holds).
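The reconciliation check can be sketched in a few lines of Python (a hypothetical model where data items are plain strings; a real system would compare keys or message ids):

```python
def reconcile(queue_items, table_rows):
    """Return the items satisfying the invariant and the items violating it.

    The invariant: a data item is consistent only if it is present in the
    queue (non-transactional) AND in the table (transactional)."""
    queue_set, table_set = set(queue_items), set(table_rows)
    consistent = queue_set & table_set   # present in both resource managers
    violations = queue_set ^ table_set   # present in exactly one of the two
    return consistent, violations

# After a crash: "a" made it only into the queue, "d" only into the table.
consistent, violations = reconcile(["a", "b", "c"], ["b", "c", "d"])
# Each violating item must either be removed from the manager that has it,
# or added to the one that lacks it, so the invariant holds again.
```

Whether a violating item is removed or re-added is a policy decision of the specific architecture, as discussed below for the fail-over case.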

Database Fail-over Data Loss Recovery

When a database fails over, as discussed in earlier blogs, it is possible that no data loss occurs. In this case the transactional resource manager remains consistent with the non-transactional resource manager.

However, a data loss is possible. This means that one or more data items in the transactional resource manager that were present before the fail-over are not present anymore after the fail-over. In this case either the items need to be added to the transactional resource manager again (based on the content in the non-transactional resource manager), or they have to be removed from the non-transactional resource manager also.

The particular strategy is to be determined in a specific software architecture; however, with data duplication this choice can be made as needed and required by the application system at hand.

Effort vs. Consistency

The outlined approach above (which can be applied to any non-transactional resource manager that can check for the existence of data items) fundamentally makes the non-transactional resource manager pseudo transactional by pairing it up with a transactional resource manager. This is implementation effort made in order to provide data consistency.

Less effort might mean less consistency. For example, it would be sufficient to store the data item in a table only for the duration until the data item is safely enqueued in the queue. Once that is the case, the data item could be removed from the table. While this would ensure consistency during the enqueue operation, it does not ensure consistency during the dequeue operation, since not every type of failure during a dequeue operation would be recoverable.

In a data loss situation because of a database fail-over there would be no way to reconcile the two resource managers if not all data is available in both at the same time. In a particular situation that might be acceptable, but in general it would not.

Effort vs. Performance/Throughput

Additional effort takes additional resources, both in storage space as well as processing time. The price for making the system more consistent is possibly a slower system, with less throughput and higher storage space requirements. Again, this is trading off non-functional against functional properties.

Summary

The outlined issues are present in all systems that deploy non-transactional persistent resource managers. The outlined solution is a generic one and in specific situations alternative, more efficient approaches might be possible in order to guarantee data consistency.

As discussed, the choice of non-transactional persistent resource managers paired with the need for data consistency can be expensive from an engineering and system resource perspective. While the choice is not always a real choice, it is definitely worth evaluating alternatives that provide the required functionality in a transactional resource manager for comparison.

This blog is the last in the series around transactional and non-transactional resource manager use in context of database fail-over. Clearly, the software architecture has to be aware of the fact that a database fail-over is possible and that a data loss might occur along the way due to network errors, increased network latency and system failures. Some of the architectural problems have been discussed, and some solutions presented as starting points for your specific application system development efforts.

Go SQL!

Appendix: Taxonomy

The software architecture taxonomy relevant for database fail-over can be built based on the combinations of resource manager types used. In the following, the various combinations are discussed at a high level (“x” means that the software architecture uses one or more of the indicated resource manager types).

Software Architecture | Transactional Persistent | Non-transactional Persistent | Non-transactional, Non-persistent, Rebuildable | Non-transactional, Non-persistent, Non-rebuildable
Consistent            | x |   |   |
Consistent            | x |   | x |
Possibly consistent   | x | x |   |
Possibly consistent   | x | x | x |
Possibly consistent   | x | x | x | x

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Software Design in Context of Database Failover (Part 4): Architecture on Transactional Persistent Resource Managers in Conjunction with Caches

If transactional resource managers are available, then all is good (see the last blog). But what if this is not the case? This blog starts discussing non-transactional resource managers.

Non-Transactional Resource Managers

There are different types of non-transactional resource managers (see the taxonomy in the appendix of this blog):

  • Non-transactional and persistent. Data is stored on persistent storage, but outside transaction protection.
  • Non-transactional, non-persistent and non-rebuildable. Data is managed in volatile memory, outside transaction protection, and if lost it cannot be rebuilt (it will be truly lost).
  • Non-transactional, non-persistent and rebuildable. Data is managed in volatile memory, outside transaction protection, but can be rebuilt if lost or intentionally deleted.

A non-persistent resource manager is usually called a cache. An example of a cache is a list of values in a Java singleton caching a subset of the data of a database table column. Another example is the user interface that caches a list of entries like sales opportunities and displays them to a user.

Properly implemented caches can be invalidated (aka, emptied out) and rebuilt based on data stored in one or several database tables, as they are non-functional components put in place for performance reasons. Caches that can be invalidated are called “rebuildable caches”.

Caches that cannot be invalidated without losing data are called “non-rebuildable caches”. Those should not be part of the software architecture and have to be avoided, as lost data cannot be recovered. The impact of this is clear: you don’t want your checking account information to be managed in such a cache.

In the following, caches are discussed; the third type of resource manager, non-transactional and persistent, is going to be discussed in the next blog.

Rebuildable Cache and Database Fail-over

A rebuildable cache contains copies of data from the database. If a cache is accessed and it contains the requested data (“cache hit”), then the data is returned to the caller. If the cache does not contain the requested data (“cache miss”) the cache fetches the data first from the database and stores it before passing it on to the caller. In summary, the cache either has the data or knows how to get it from the database.
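A rebuildable cache can be sketched in a few lines of Python (the names are made up for illustration; `load_from_db` stands in for a database fetch):

```python
class RebuildableCache:
    """Minimal rebuildable cache: misses are loaded from the backing store,
    and invalidate() empties the cache so it can rebuild from the database."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._entries = {}

    def get(self, key):
        if key not in self._entries:               # cache miss:
            self._entries[key] = self._load(key)   # fetch from the database
        return self._entries[key]                  # cache hit from here on

    def invalidate(self):
        self._entries.clear()                      # safe: data lives in the db

database = {"k1": "v1"}                 # stand-in for a database table
cache = RebuildableCache(database.get)
value = cache.get("k1")                 # miss, loads "v1"; later calls hit
cache.invalidate()                      # e.g., right after a database fail-over
```

The key property is that `invalidate()` never loses information: every entry can be re-fetched from the database on the next miss.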

In the context of a database fail-over it is possible that the primary and secondary database have the exact same data set at the time of the fail-over [see blog https://realprogrammer.wordpress.com/2015/08/31/software-design-in-context-of-database-failover-part-2-software-architecture-taxonomy/]. In this case the cache contents remain consistent with the database state after a fail-over, as all data in the cache has a corresponding data item in the database.

However, it is possible that a data loss occurs if the secondary database is lagging the primary database at the time of the fail-over. In this case some data might not be available anymore after the fail-over as this data was not replicated before the fail-over occurred.

A cache that has data that was available before the fail-over, but was lost during the fail-over, is therefore inconsistent with the database after the fail-over. A cache hit might occur even though the cached data is not in the database. This would be incorrect behavior, since the premise of a cache is that it caches data consistent with the database state.

The solution for the fail-over case is to invalidate all cache entries (from all caches) right after the database fail-over takes place, before clients continue processing. Since each cache starts empty, many accesses will initially be cache misses until the caches have built up the working set again. However, the caches will be consistent with the database state and the application system accesses consistent data.

From a software architecture approach, it is very important to know all implemented caches and to ensure that each is a rebuildable cache.

Non-rebuildable Cache and Database Failover

Even if not desirable, non-rebuildable caches might exist, and the cache invalidation functionality discussed above is not available for them. If the cache cannot be invalidated, it is impossible to remove its contents after a database fail-over. Therefore it might be that the cache returns data to the client that is no longer present in the database or not consistent with the data in the database. The client would have to be aware of this possibility and be able to deal with such an inconsistency.

A special case needs to be discussed. It is possible that a non-rebuildable cache stores data from the database but is only missing the externally available functionality of cache invalidation. So in principle it could be rebuilt, but the rebuild cannot be externally triggered due to a missing programming interface.

A possible approach in this case is to implement a workaround based on the assumption that the cache has limited capacity and that there is a lot more data in the database than the cache can hold. A brute-force attempt to make the cache consistent again would be to implement a client that, after fail-over, requests every data item that is in the database from the cache. At some point the cache will be full without any capacity left, and as a consequence the cache has to evict entries. As this happens, inconsistent entries will be removed to make room for the requested items that caused a cache miss. Once all data was requested, the cache is consistent again.
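The brute-force flush can be illustrated with a small bounded LRU cache in Python (a toy model; in a real system the eviction policy and capacity are not under the client’s control, which is exactly why this is only a workaround):

```python
from collections import OrderedDict

class BoundedLRUCache:
    """Toy cache with fixed capacity and least-recently-used eviction."""

    def __init__(self, capacity, load_from_db):
        self.capacity = capacity
        self._load = load_from_db
        self.entries = OrderedDict()

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)          # refresh LRU position
        else:
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)   # evict least recently used
            self.entries[key] = self._load(key)
        return self.entries[key]

database = {f"k{i}": i for i in range(10)}
cache = BoundedLRUCache(capacity=3, load_from_db=database.get)
cache.entries["stale"] = "lost-in-fail-over"       # entry the database lost

for key in database:        # the workaround: request every item in the db
    cache.get(key)
# The stale entry has been evicted; the cache is consistent again.
```

The flush only works because the database holds more items than the cache’s capacity; requesting all of them forces every pre-fail-over entry out.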

Despite workarounds in special cases, there is no general approach that can address non-rebuildable caches. They have to be avoided if transactional consistency is important as consistency with these caches cannot be provided.

Summary

If caches are used in the software architecture, the best form is rebuildable caches that can recreate their state from persistent storage (aka, the database) so that a consistent data state can be recreated after a database fail-over with data loss. Those must have external interfaces so that an invalidation can be triggered right after the database fail-over completes.

Non-rebuildable caches might be managed in special cases with a workaround after a fail-over; however, as no general mechanism exists, they should be avoided altogether. Avoiding non-rebuildable caches is a safe architectural approach.

The next blog will discuss the third category of resource managers: non-transactional and persistent resource managers.

Go SQL!

Appendix: Taxonomy

The software architecture taxonomy relevant for database fail-over can be built based on the combinations of resource manager types used. In the following, the various combinations are discussed at a high level (“x” means that the software architecture uses one or more of the indicated resource manager types).

Software Architecture | Transactional Persistent | Non-transactional Persistent | Non-transactional, Non-persistent, Rebuildable | Non-transactional, Non-persistent, Non-rebuildable
Consistent            | x |   |   |
Consistent            | x |   | x |
Possibly consistent   | x | x |   |
Possibly consistent   | x | x | x |
Possibly consistent   | x | x | x | x

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Software Design in Context of Database Failover (Part 3): Architecture on Transactional Persistent Resource Managers

The “Happy Path” architecture used individual consistent database transactions. What about individual (database) transactions that are only consistent as a set (set-consistent individual transactions)?

Overview

This blog is about the first row of the software architecture taxonomy (see a reference to the taxonomy in the appendix at the end of this blog as well as the previous blog [https://realprogrammer.wordpress.com/2015/08/31/software-design-in-context-of-database-failover-part-2-software-architecture-taxonomy/]). There are two relevant aspects in context of transactions:

  • Transactional persistent resource managers
  • Set-consistent individual transactions

Transactional Persistent Resource Managers

In a general software architecture not only relational tables in a relational database management system are accessed, but also queuing systems or file systems, among other storage options. In order to arrive at a consistent system state, all operations must take place within transactions so that the operations either complete successfully and a new consistent state is reached, or fail and the system returns to a previous consistent state. As a consequence, a transactional queuing system or transactional file system is necessary for this type of software architecture if queues or files are used to manage data.

There are many choices for transactional resource managers. In the context of the Oracle database, a queuing system (Advanced Queuing, AQ) [https://docs.oracle.com/database/121/ADQUE/aq_intro.htm#ADQUE0100] is available within the RDBMS that supports queuing functionality. The queue operations can be part of local database transactions. The software architecture therefore can make use of queues as well as relational tables, and operations on both can be within the same or different local database transactions. This means that any mix of relational operations and queue operations is consistent when guarded by transactions.

For illustration, a small code snippet follows that shows a table insert and an AQ enqueue operation. The changes of both operations take effect when successfully committed, or both roll back when aborted. Even though a table and a queue operation are performed, both are governed by a single local database transaction (no two-phase commit protocol required).

DECLARE
  enqueue_options    dbms_aq.enqueue_options_t;
  message_properties dbms_aq.message_properties_t;
  message_handle     RAW(16);
  MESSAGE            aqadm.message_type;
BEGIN
  enqueue_options.VISIBILITY         := DBMS_AQ.ON_COMMIT;
  enqueue_options.SEQUENCE_DEVIATION := NULL;
  -- -----
  message_properties.PRIORITY    := -5;
  message_properties.DELAY       := DBMS_AQ.NO_DELAY;
  message_properties.EXPIRATION  := DBMS_AQ.NEVER;
  message_properties.CORRELATION := 'TEST MESSAGE';
  -- -----
  MESSAGE := aqadm.message_type('blog',
    'failover architecture');
  -- -----
  DBMS_AQ.ENQUEUE(queue_name => 'aqadm.messageQueue',
    enqueue_options    => enqueue_options,
    message_properties => message_properties,
    payload            => MESSAGE,
    msgid              => message_handle);
  -- -----
  INSERT INTO standardtext VALUES ('data guard');
  COMMIT;
END;

As a note, distributed transactions (for example, transactions implementing the XA protocol) are left out of the discussion in this blog, as accomplishing a high-performance system is rather difficult and requires that all transactional resource managers be capable of participating in the two-phase commit protocol.

Set-Consistent Individual Transactions

Not all individual transactions lead to a consistent application system state. Transactions always result in a consistent database state, but that does not mean that the application system state is consistent from the viewpoint of the application.

An example is the purchasing process on a company’s web site. The customer selects products and adds them to a cart, and at some point the checkout process is started by the user. Adding a product to the cart is an individual transaction, and each such transaction leads to a consistent state. The checkout process itself could be a single transaction; however, adding address and credit card information takes time, and that duration is too long to keep one database transaction open. Therefore the checkout process is a set of individual transactions (like adding a shipping address, entering a credit card number, selecting shipping methods), and only after all transactions are committed is the checkout process itself complete.

The company’s goal is to complete checkout processes in their entirety, as only then can it ship goods and charge for them. Any system failure and fail-over must not prevent the checkout process from completing. Therefore the application system is considered to be in a consistent state only when the checkout process is completed in its entirety. In terms of transactions, the whole set of transactions constituting the checkout process must be completed.
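The checkout example can be modeled in a few lines of Python (the step names are invented for illustration; in a real system each step would be a committed database transaction):

```python
# The set of individual transactions that together make a checkout consistent:
REQUIRED_STEPS = {"shipping_address", "credit_card", "shipping_method"}

def checkout_state(committed_steps):
    """Classify a checkout by which of its transactions have committed."""
    present = set(committed_steps) & REQUIRED_STEPS
    if present == REQUIRED_STEPS:
        return "complete"           # the whole set committed: consistent
    return "in-flight" if present else "not-started"

before = checkout_state(["shipping_address", "credit_card", "shipping_method"])
# A fail-over with data loss can lose some of the committed transactions,
# setting a previously complete checkout back to in-flight:
after_loss = checkout_state(["shipping_address"])
```

The application system state is consistent only in the "complete" case; the individual database transactions were each consistent all along.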

What happens if the system fails during the checkout process (meaning some transactions are completed and others have yet to be completed) and must fail over to the standby database? This will be discussed in a moment.

Database Replication For High Availability

High availability in the context of databases is accomplished by database replication from a primary to a secondary database. If the primary database becomes unavailable, the applications switch over to the secondary database. Ideally, the replication is “perfect” in the sense that every successful transaction is transmitted from the primary database to the secondary database instantaneously. If that were possible, no transaction would be lost after the fail-over from the primary to the secondary database.

Oracle provides the Active Data Guard [https://docs.oracle.com/database/121/SBYDB/concepts.htm#SBYDB00010] functionality for replication. It replicates data from the primary (“active”) database to the secondary (“standby”) database by means of different replication policies:

  • Maximum Availability
  • Maximum Performance
  • Maximum Protection

Maximum Protection actually achieves perfect replication. However, it relies on synchronous communication between the primary and the secondary database. As soon as this encounters problems like a significant delay or unavailability, the policy can change dynamically to Maximum Availability. In this case the primary database communicates asynchronously with the secondary database, and if a failure occurs in this situation, data loss can happen if not all data was replicated from the primary to the secondary database at the time of the failure.
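The data loss window of asynchronous replication can be illustrated with a toy Python model (a list stands in for the stream of committed transactions; nothing Oracle-specific is modeled):

```python
# Committed transactions on the primary, in commit order:
primary_log = ["t1", "t2", "t3", "t4", "t5"]

# Asynchronous replication: the standby lags behind the primary.
replicated_up_to = 3                                # standby has applied t1..t3

standby_log = primary_log[:replicated_up_to]        # state after fail-over
lost_on_failover = primary_log[replicated_up_to:]   # t4 and t5 are lost

# With synchronous replication (Maximum Protection) the lag is zero,
# and lost_on_failover would always be empty.
```

The "time shift into the past" discussed below is exactly this: after fail-over the application sees the standby_log state, with the unreplicated tail missing.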

Software Architecture Execution in Context of Database Fail-over

From an architecture viewpoint two data consistency cases have to be considered in case of fail-over:

  • no data loss: committed transactions on the secondary (new primary) are the same as on the primary (new secondary). The application system sees the exact same database state and can continue processing (after it resets its database connections to the new primary).
  • data loss: some committed transactions are not present in the new primary. The application sees the transactions as of some time ago; this is like a time-shift into the past. However, this time-shift does not create an inconsistency from a data perspective.

From the viewpoint of transactions, also two cases have to be considered:

  • individual consistent transactions: the new primary has a consistent database and application system state. The application system can continue processing in both cases: data loss as well as no data loss.
  • set-consistent transactions: this is the more complex case and requires a more differentiated discussion next.

Before a fail-over, a group of set-consistent transactions can be completely finished and a consistent application system state is reached. It is also possible that only some of the transactions in a group of set-consistent transactions are executed (in-flight group of set-consistent transactions).

When a fail-over occurs, then there are different consequences depending on the data loss situation:

  • no data loss: if no data is lost during fail-over, the group of set-consistent transactions sees the same state as before, and processing can continue without interruption on the new primary database.
  • data loss: if data loss occurred, a group of set-consistent transactions that was completed before the fail-over might be set back to an in-flight state. In this case the application system will have to reprocess the lost transactions after the fail-over. A previously in-flight group of set-consistent transactions might be set back as well, and some of its transactions have to be reprocessed, too.
  • pre-group data loss: in the worst case the data loss is complete from the viewpoint of a group of set-consistent transactions, in the sense that all transactions of the group are lost. In this case there is no indication that the group ever existed.

From an architecture viewpoint, the case of an in-flight group of set-consistent transactions is the relevant one to focus on. In this case it is important that the application realizes that there is such a group and continues processing after a fail-over. Aside from the committed transactions, a persistent flag needs to be implemented that records the progress made: which transactions were completed and which were not. This persistent flag allows the application system to realize that it needs to continue processing those transactions of a group of set-consistent transactions that have not been executed yet.

Since transactions are executed sequentially, one after another, the persistent flag only needs to record the last successful transaction, and the flag is written as part of that transaction. So each transaction writes a persistent flag that indicates the last committed transaction. After a fail-over, an application system can select all persistent flags and determine the transactions that need to be executed next (in this sense the application system is re-entrant).
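The persistent-flag technique can be sketched as follows. SQLite stands in for the primary database, and the table name, checkout steps, and helper functions are all hypothetical illustrations, not part of the original text:

```python
import sqlite3

# Hypothetical steps of a checkout group of set-consistent transactions.
STEPS = ["reserve_inventory", "charge_payment", "create_shipment"]

def init(conn):
    # One flag row per group, recording the last committed transaction.
    conn.execute("""CREATE TABLE IF NOT EXISTS checkout_progress (
                        group_id TEXT PRIMARY KEY,
                        last_step TEXT)""")

def run_step(conn, group_id, step):
    # The business change and the progress flag commit in the SAME transaction.
    with conn:  # commits on success, rolls back on error
        # ... business updates for `step` would go here ...
        conn.execute(
            "INSERT INTO checkout_progress (group_id, last_step) VALUES (?, ?) "
            "ON CONFLICT(group_id) DO UPDATE SET last_step = excluded.last_step",
            (group_id, step))

def remaining_steps(conn, group_id):
    # After a fail-over, read the flag to find where to resume (re-entrancy).
    row = conn.execute(
        "SELECT last_step FROM checkout_progress WHERE group_id = ?",
        (group_id,)).fetchone()
    if row is None:
        return STEPS  # pre-group data loss: no record that the group existed
    return STEPS[STEPS.index(row[0]) + 1:]
```

After a fail-over the application selects the flag and continues with the steps not yet executed; an unknown group id yields the full step list, which is exactly the pre-group data loss case discussed below.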

If the data loss is pre-group, then there is no record that the group ever existed, and the group will not be (re-)started on the new primary. The application system either relies on user input (as in the above example: the user is irritated and starts the checkout process again) or, alternatively, a search can be performed for locations where a group could possibly have been started, which would then allow prompting the user to confirm whether a group of set-consistent transactions needs to be restarted (or not). For example, the user could be asked if the checkout process should be started again.

Summary

If individual or set-consistent transactions rely solely on database transactions and on resource managers that are implemented in a database system (like Oracle) then it is possible to implement a system architecture to support fail-over in such a way that application systems can continue processing and deal with possibly lost transactions after fail-over. The reason for this is that application systems always encounter a consistent application system state and take this as the basis for continued processing after a fail-over.

Go SQL!

Appendix: Taxonomy

The software architecture taxonomy relevant for database fail-over can be built based on the combinations of resource manager types used. In the following the various combinations are discussed at a high level (“x” means that the software architecture uses one or more of the indicated resource manager types).

Software Architecture | Transactional Persistent | Non-transactional Persistent | Non-transactional, Non-persistent, Rebuildable | Non-transactional, Non-persistent, Non-rebuildable
Consistent | x | | |
Consistent | x | | x |
Possibly consistent | x | x | |
Possibly consistent | x | x | x |
Possibly consistent | x | x | x | x

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

 

Software Design in Context of Database Failover (Part 2): Software Architecture Taxonomy

In order to discuss software design in context of database fail-over I will present a taxonomy of software architectures that manage data. This is the basis for discussing all possible database fail-over cases and their specific issues. Let’s go.

Motivation

In the previous blog the ideal database software architecture (“Happy Path”; [https://realprogrammer.wordpress.com/2015/07/30/software-design-in-context-of-database-failover-part-1-the-happy-path/]) was discussed; why is this not sufficient to cover all types of software architectures? The reason lies in two main aspects:

  • Asynchronous replication mode. Ideally, replication from a primary database to a secondary database is synchronous and complete, meaning the primary and secondary database are exact replicas of each other at every point in time (aka, at every transaction commit). However, this is not always possible due to network latency, network failures, server outages, etc. More often than not the secondary database lags behind the primary database (if only by a few transactions) and is missing transactions when a failure occurs. After a fail-over in such a situation, applications will not see committed data in the secondary database that they have already seen committed in the primary database (data loss occurred; more precisely, transaction loss).
  • Non-transactional resource management. Many applications manage data outside the scope of transactions. In case of failures this non-transactional data can be out of sync with the transactional data (e.g., the primary database). And in case of data loss during a fail-over, the non-transactional data might be inconsistent with the state of the secondary database as well. Any replication mechanism between databases will not include non-transactional data that is managed outside the database.

A background discussion on resource managers and transactions follows now as preparation for the software architecture taxonomy discussion.

Background: Resource Manager

A resource manager in context of database software architectures is a component that manages state (data); in general, this means creating, updating, deleting, and reading data. For example, a database is a resource manager, and so is a file system. Resource managers are

  • Transactional or non-transactional. Transactional resource managers are able to participate in a transaction protocol (e.g., local transactions, distributed transactions, autonomous transactions or chained transactions, to name a few).
  • Persistent or non-persistent. For example, a database is (usually) persistent, a software cache is (usually) non-persistent.
  • Non-persistent rebuildable or non-persistent not rebuildable. A non-persistent resource manager might be able to rebuild data from persistently stored data.

For example, the file system on Linux is non-transactional, Oracle’s DBFS is transactional [http://docs.oracle.com/cd/E11882_01/appdev.112/e18294/adlob_fs.htm#ADLOB45943].

Resource Managers Are Everywhere

Resource managers are everywhere in a software architecture in general. Not only databases or file systems are resource managers, but also queuing systems, software caches, user interfaces, Java JVM caches, or Java class variables. Everything that manages state is to be considered a resource manager in context of the database fail-over discussion.

Most resource managers are non-transactional (they cannot participate in a transaction protocol), and many are non-persistent. An argument can be made that all resource managers could/should participate in transactions to avoid inconsistencies anywhere in the software stack (transactional memory would be an interesting tool in this regard). That would be ideal as any data state change would be consistent across all resource managers.

Resource Managers and Transactions

However, in reality this is not the case: most resource managers are non-transactional, and consequently the data that they manage is not under transaction control. Access to resource managers happens both within and outside transactions, and not all resource managers can participate in a transaction protocol.

Worse, the same data item might be transactionally committed, and additionally stored in a non-transactional resource manager as a duplicate. In non-failure cases this works out fine, but sometimes, when failures occur, only one or the other update happens, leaving an inconsistent system state behind, as the data item is only in one location and not duplicated in others as expected. If an assumption is made in the software architecture that the data in all resource managers is always consistent, then this situation might cause significant system degradation or errors when failures occur. In the worst case inconsistent application state is left behind.

The same situation occurs if the application writes two data items that are meant to be consistent with each other, one through a transactional resource manager and one through a non-transactional resource manager. If a failure occurs after the first update, the second will not happen and the state is inconsistent.
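This dual-write problem can be illustrated with a small sketch. The schema, file layout, and simulated failure are invented for illustration: a transactional store (SQLite) and a non-transactional store (a plain file) receive the same item, and a failure between the two writes leaves them inconsistent:

```python
import os
import sqlite3
import tempfile

def dual_write(conn, path, value, crash_between=False):
    with conn:  # transactional write: atomic commit or rollback
        conn.execute("INSERT INTO items (value) VALUES (?)", (value,))
    if crash_between:
        raise RuntimeError("simulated failure before the file write")
    with open(path, "a") as f:  # non-transactional write: no atomicity with the DB
        f.write(value + "\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (value TEXT)")
fd, path = tempfile.mkstemp()
os.close(fd)

dual_write(conn, path, "a")            # both stores updated, state consistent
try:
    dual_write(conn, path, "b", crash_between=True)
except RuntimeError:
    pass                               # "failure" strikes between the two writes

db_values = [r[0] for r in conn.execute("SELECT value FROM items")]
file_values = open(path).read().split()
# db_values == ["a", "b"] but file_values == ["a"]: the stores now disagree
os.remove(path)
```

The transactional write committed but the non-transactional duplicate was never made; exactly the inconsistent state described above is left behind.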

System Semantics: Consistency

For the purposes of this blog the system semantics with respect to state consistency can be reduced to two cases:

  • Consistent system: The overall system is consistent when all data in resource managers are consistent with each other (or can be made consistent after a failure before resuming processing).
  • Inconsistent system: Otherwise.

The following system consistency guarantees can be given.

  • Consistent
    • Only transactional and persistent resource managers are used and all changes happen in context of a transaction protocol. This is the Happy Path case from the previous blog as the complete state is consistent at all times.
    • If additionally non-persistent and rebuildable resource managers are used, a consistent state can be re-created by removing all data from the non-persistent and rebuildable resource managers.
  • Possibly consistent
    • Transactional and non-transactional persistent resource managers are used. This might result in consistent state changes, or it might not, depending on the timing of the failure. Because of the persistent nature it might be non-trivial to clean up inconsistent persistent state, and possibly complex compensation functions have to be developed (if that is possible at all).
    • Non-persistent and non-rebuildable resource managers are used in conjunction with persistent resource managers. As before, this might be non-trivial to clean up, and an inconsistent state might be left behind.
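The rebuildable case above can be made concrete with a small sketch. The cache class and schema are hypothetical, but they show why a non-persistent, rebuildable resource manager does not endanger consistency: its state can always be dropped and re-created from the transactional, persistent store:

```python
import sqlite3

class RebuildableCache:
    """In-memory cache (non-persistent) backed by a persistent store."""

    def __init__(self, conn):
        self.conn = conn
        self.data = {}

    def rebuild(self):
        # After a failure or fail-over, drop everything and reload from the
        # persistent store; the cache is never the source of truth.
        self.data = dict(self.conn.execute("SELECT k, v FROM kv"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
with conn:
    conn.execute("INSERT INTO kv VALUES ('order-1', 'shipped')")

cache = RebuildableCache(conn)
cache.data["order-2"] = "stale, never persisted"  # cache drifts from the DB
cache.rebuild()                                   # consistent state restored
# cache.data == {"order-1": "shipped"}
```

A non-rebuildable resource manager, by contrast, has no such recovery path: once its state diverges from the persistent store, the inconsistency cannot be repaired from within the system.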

If several transactional resource managers are used to change state consistently, but cannot be invoked in context of a single transaction in one transaction protocol, then any failure might leave an inconsistent state behind. This must be called out explicitly here in order to avoid the assumption that transactional means consistent in itself.

Taxonomy

The software architecture taxonomy relevant for database fail-over can be built based on the combinations of resource manager types used. In the following the various combinations are discussed at a high level (“x” means that the software architecture uses one or more of the indicated resource manager types).

Software Architecture | Transactional Persistent | Non-transactional Persistent | Non-transactional, Non-persistent, Rebuildable | Non-transactional, Non-persistent, Non-rebuildable
Consistent | x | | |
Consistent | x | | x |
Possibly consistent | | x | |
Possibly consistent | x | x | |
Possibly consistent | x | | | x
Possibly consistent | x | x | x |
Possibly consistent | x | x | x | x

Each of these will be discussed separately in an upcoming blog.
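The taxonomy can be condensed into a small rule paraphrasing the table above (the type names and function are illustrative, not part of any standard API): an architecture is guaranteed consistent only if it uses nothing beyond transactional persistent resource managers and non-persistent rebuildable ones; every other combination is merely possibly consistent.

```python
# Resource manager types from the taxonomy (names are illustrative).
TRANSACTIONAL_PERSISTENT = "transactional-persistent"
NON_TRANSACTIONAL_PERSISTENT = "non-transactional-persistent"
NON_PERSISTENT_REBUILDABLE = "non-persistent-rebuildable"
NON_PERSISTENT_NON_REBUILDABLE = "non-persistent-non-rebuildable"

def consistency_guarantee(used_types):
    # Consistent: only transactional persistent resource managers, optionally
    # plus non-persistent rebuildable ones (their state can be re-created).
    if set(used_types) <= {TRANSACTIONAL_PERSISTENT, NON_PERSISTENT_REBUILDABLE}:
        return "consistent"
    # Any non-transactional persistent or non-rebuildable resource manager
    # makes the guarantee weaker.
    return "possibly consistent"
```

For example, an architecture using only a transactional database plus a rebuildable cache classifies as "consistent", while adding a non-transactional file system drops it to "possibly consistent".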

Fail-over

In terms of database fail-over, only those software architectures that can maintain a consistent state across all data, or can rebuild a consistent state after a failure, can fail over without any state inconsistency. The reason for this is that only consistent state is replicated, and any state outside the database is consistent as well.

Summary

The rationalization in this blog shows that software applications will have to deal with inconsistent data states once non-transactional resource managers are used to manage data. In context of fail-over this becomes an even bigger problem and the next blogs will discuss strategies to deal with data state inconsistencies in software architectures.

This was a high-level discussion only and future blogs will discuss each software architecture in more detail to shine a light on the important and interesting details in case of database fail-over.

Go SQL!


 

Software Design in Context of Database Failover (Part 1): The Happy Path

Oracle Active Data Guard provides a replication and fail-over mechanism between two or more Oracle databases for high availability and disaster recovery. How does (application) software have to be designed in this context?

Oracle Active Data Guard

In a nutshell, Oracle Active Data Guard [http://www.oracle.com/us/products/database/options/active-data-guard/overview/index.html] provides synchronization between a primary database and one or more secondary databases. It continuously replicates the primary database to all secondary databases based on in-memory redo logs and their transmission. There are several protection modes, which are variations of synchronous or asynchronous replication between the primary database and the secondary databases. Secondary databases can be in the same data center (high availability) or different data centers (disaster recovery) and the specific deployment topology depends on the degree of availability and protection the organization implements.

Because the primary database is replicated to secondary databases, clients accessing the primary database can fail over to one of the secondary databases if the primary database becomes unavailable: clients switch over from the primary database to a secondary database for processing. The fail-over process can be triggered manually or automatically, depending on the configuration chosen. In case of synchronous replication the secondary databases are in lockstep with the primary database and contain the exact same state. In case of asynchronous replication the secondary databases might lag a few transactions behind the primary database; their state is then not the same, but it is consistent as of an earlier time.

Synchronous Replication

Active Data Guard’s synchronous replication ensures that the secondary databases are exact replicas of the primary database. In terms of transaction execution this means that a transaction is committed successfully on the primary database only if it is committed on all secondary databases as well. The benefit of exact replication (aka, no data loss) comes at the cost of transaction execution time, as a transaction on the primary database only succeeds after the secondary databases have executed the same transaction. The primary database is therefore dependent on the execution behavior of the secondary databases.

Fail-over Scenario

In the following a partial site failure is assumed: the primary database becomes unavailable, but the applications that are accessing it continue to be available. A (complete) site failure occurs if all systems, the primary database and all applications, are unavailable.

The following shows the various components (a primary database, a secondary database, and one application). The solid arcs represent the call relationships before a fail-over to the secondary database, the dashed arcs represent the call relationships after a fail-over to the secondary database.

[Figure: call relationships between the application, the primary database, and the secondary database, before (solid arcs) and after (dashed arcs) a fail-over]

Software Design for “Happy Path”

In this blog the “Happy Path” is assumed. This means that all database processing that the application performs is done in individual complete OLTP transactions that always commit consistent transactions. After each transaction the database is in a consistent state.

In the case of complete OLTP transactions it is assumed that no data is cached in application layer caches or stored in other non-transactional resource managers, like file systems or queuing systems. This style of OLTP transaction processing application serves as a design baseline for this blog and the future blogs in this series.

For the complete OLTP transaction applications no specific software design rules apply for the fail-over scenario outlined above (except for not using any other non-transactional resource managers like caches, queues or file systems). Since all transactions commit consistent data, and since the replication mode between the primary database and the secondary database is synchronous, the application will find a consistent and complete database state in the secondary database after a fail-over.
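A minimal model of such a complete OLTP transaction can be sketched as follows, with SQLite as a stand-in database and an invented two-account schema. Both updates commit atomically or not at all, so the database is in a consistent state after every transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(amount):
    # One complete OLTP transaction: both updates commit together or not at
    # all, so the invariant (total balance) holds after every transaction.
    with conn:  # commits on success, rolls back on any error
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
            (amount,))
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
            (amount,))

transfer(30)
total = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
# total stays 100 after any number of transfers
```

Because every committed state is consistent, a synchronously replicated secondary database holds exactly such a consistent state at fail-over time; no application-level recovery logic is needed.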

Furthermore, Oracle Active Data Guard provides driver-level functionality that allows an application to fail-over without having to implement application level code that accomplishes the fail-over; instead, the drivers are able to fail-over without the application having to be aware of it.

Summary

For complete OLTP processing no specific software design approaches or software design rules have to be taken into consideration when using the synchronous replication mode in context of Oracle Active Data Guard.

However, the “Happy Path” is not the only possible application type and more complex application software systems in conjunction with other Oracle Active Data Guard replication modes will be discussed here in forthcoming blogs, especially when non-transactional resource managers are involved.

Go SQL!
