Software Design in Context of Database Failover (Part 3): Architecture on Transactional Persistent Resource Managers

The “Happy Path” architecture used individual consistent database transactions. What about individual (database) transactions that are only consistent as a set (set-consistent individual transactions)?


This blog is about the first row of the software architecture taxonomy (see a reference to the taxonomy in the appendix at the end of this blog as well as the previous blog []). There are two relevant aspects in context of transactions:

  • Transactional persistent resource managers
  • Set-consistent individual transactions

Transactional Persistent Resource Managers

In a general software architecture not only relational tables are accessed in a relational database management system but also queuing systems or file systems, among other storage options. In order to arrive to a consistent system state all operations must take place within transactions so that the operations either succeed successfully and a new consistent state is reached or fail and the system returns to a previous consistent state. As a consequence, a transactional queuing system or transactional file system is necessary for this type of software architecture if queues or files are used to manage data.

There are many choices for transactional resource managers. In context of the Oracle database a queuing system (Advanced Queuing, AQ) [] is available within the RDBMS that supports queuing functionality. The queue operations can be part of local database transactions. The software architecture therefore can make use of queues as well as relational tables and operations of both can be within the same or different local database transactions. This means that any operation mix of relational operations as well as queue operations are consistent when guarded by transactions.

For illustration, a small code snippet follows that shows a table insert and an AQ enqueue operation. The changes of both operations occur when successfully committed, or both rollback when aborted. Even though a table and a queue operation are performed, both are governed by a single local database transaction (no two-phase-commit protocol required).

    enqueue_options dbms_aq.enqueue_options_t;
    message_properties dbms_aq.message_properties_t;
    message_handle RAW(16);
    MESSAGE aqadm.message_type;
    enqueue_options.VISIBILITY         := DBMS_AQ.ON_COMMIT;
    enqueue_options.SEQUENCE_DEVIATION := NULL;
    -- -----
    message_properties.PRIORITY    := -5;
    message_properties.DELAY       := DBMS_AQ.NO_DELAY;
    message_properties.EXPIRATION  := DBMS_AQ.NEVER;
    message_properties.CORRELATION := 'TEST MESSAGE';
    -- -----
    MESSAGE := aqadm.message_type('blog', 
      'failover architecture');
    -- -----
    DBMS_AQ.ENQUEUE(queue_name => 'aqadm.messageQueue', 
      enqueue_options => enqueue_options, 
      message_properties => message_properties, 
      payload => MESSAGE , msgid => message_handle);
    -- ----
    INSERT INTO standardtext VALUES('data guard');

As a note, distributed transactions (for example, transactions implementing the XA protocol) are left out of the discussion in this blog as accomplishing a high-performance system is rather difficult and requires that all transactional resource managers are capable to participate in the two phase commit protocol.

Set-Consistent Individual Transactions

Not all individual transactions lead to a consistent application system state. Transactions always result in a consistent database state, but that does not mean that the application system state is consistent from the viewpoint of the application.

An example is the purchasing process on a web site of a company. The customer selects products and adds them to a cart and at some point the checkout process is started by the user. Adding products to the cart are individual transactions and each leads to a consistent state. The checkout process itself could be a single transaction, however, adding address and credit card information takes time and that duration is too long to keep one database transaction open. Therefore, the checkout process is a set of individual transactions (like adding a shipping address, entering a credit card number, selecting shipping methods) and only after all transactions are committed the checkout process itself is complete.

The company’s goal is to complete checkout processes completely as only then it can ship goods and charge for them. Any system failure and fail-over must not prevent the checkout process to complete. Therefore the application system is considered to be in a consistent state only when the checkout process is completed in its entirety. In terms of transactions, the whole set of transactions constituting the checkout process must be completed.

What happens if the system fails during the checkout process (meaning, some transactions are completed and other have yet to be completed) and must fail-over to the standby database? This will be discussed in a moment.

Database Replication For High Availability

High availability in context of databases is accomplished by database replication from a primary to a secondary database. If the primary database becomes unavailable the applications switch over to the secondary database. Ideally, the replication is “perfect” in the sense that every successful transaction is transmitted from the primary database to the secondary database instantaneously. If that is possible, no transaction would be lost after the fail-over from the primary to the secondary database.

Oracle provides the Active Data Guard [] functionality for replication. It replicates data from the primary (“active”) database to the secondary (“standby”) database by means of different replication policies:

  • Maximum Availability
  • Maximum Performance
  • Maximum Protection

Maximum Protection actually achieves perfect replication. However, this relies on a synchronous communication of the primary database with the secondary database. As soon as this encounters problems like a significant delay or unavailability, the policy can change dynamically to Maximum Availability. In this case the primary database communicates asynchronously with the secondary database and in case of a failure in this situation data loss can happen if not all data was replicated from the primary database to the secondary database at the time of the failure.

Software Architecture Execution in Context of Database Fail-over

From an architecture viewpoint two data consistency cases have to be considered in case of fail-over:

  • no data loss: committed transactions on the secondary (new primary) are the same as in the primary (new secondary). The application system sees the exact same database state and can continue processing (after the application reset its database connections to the new primary).
  • data loss: some committed transactions are not present in the new primary. The application sees transaction as of some time ago; this is like a time-shift into the past. However, this time shift does not create an inconsistency from a data perspective.

From the viewpoint of transactions, also two cases have to be considered:

  • individual consistent transactions: the new primary has a consistent database and application system state. The application system can continue processing in both cases: data loss as well as no data loss.
  • set-consistent transactions: this is the more complex case and requires a more differentiated discussion next.

Before a fail-over, a group of set-consistent transactions can be completely finished and a consistent application system state is reached. It is also possible that only some of the transactions in a group of set-consistent transactions are executed (in-flight group of set-consistent transactions).

When a fail-over occurs, then there are different consequences depending on the data loss situation:

  • no data loss: if not data is lost during fail-over, then the group of set-consistent transactions sees the same state as before and processing can continue without interruption on the new primary database.
  • data loss: if data loss occurred then a group of set-consistent transactions that was completed before the fail-over might be set back to an in-flight state. In this case the application system will have to reprocess the lost transactions after the fail-over. A previously in-flight set-consistent transactions group might be set back also and some of the transactions have to be reprocessed in this case also.
  • pre-group data loss: in the worst case the data loss might be complete from the viewpoint of a group of set-consistent transactions in the sense the all transactions of a group of set-consistent transactions are lost. In this case there is no indication that the group ever existed.

From an architecture viewpoint, the case of an in-flight group of set-consistent transactions is the relevant one to focus on. In this case it is important that the application realizes that there is such a group and continues processing after a fail-over. Aside from the committed transactions there needs to be a persistent flag implemented that indicates the progress made and which transactions were completed, and which not yet. This persistent flag then allows the application system to realize that it needs to continue processing those transactions of a group of set-consistent transactions that have not been executed yet.

Since transactions are executed one after another sequentially the persistent flag only needs to record the last successful transaction. And the persistent flag is written as part of that transaction. So each transaction writes a persistent flag that indicates the last committed transaction. An application system can select all persistent flags after a fail-over and determines the transactions that need to be executed next (in this sense the application system is re-entrant).

If the data loss is pre-group, then there is no record of the group having ever existed. In this case the group will not be (re-)started on the new primary as there is no record of it. The application system either relies on user input (like in the above example: the user is irritated and starts the checkout process again) or, alternatively, a search can be performed looking for locations where a group can possibly started and that then would allow to prompt for user confirmation that a group of set-consistent transactions needs to be restarted (or not). For example, the user could be asked if the checkout process should be started.


If individual or set-consistent transactions rely solely on database transactions and on resource managers that are implemented in a database system (like Oracle) then it is possible to implement a system architecture to support fail-over in such a way that application systems can continue processing and deal with possibly lost transactions after fail-over. The reason for this is that application systems always encounter a consistent application system state and take this as the basis for continued processing after a fail-over.


Appendix: Taxonomy

The software architecture taxonomy relevant for database fail-over can be built based on the combinations of resource manager types used. In the following the various combinations are discussion on a high level (“x” means that the software architecture uses one or more of the indicated resource manager types).

Software Architecture Transactional Persistent Non-transactional Persistent Non-transactional and Non-persistent and rebuildable Non-transactional and Non-persistent and Non-rebuildable
Consistent x
Consistent x x
Possibly consistent x x
Possibly consistent x x x
Possibly consistent x x x x


The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.



Software Design in Context of Database Failover (Part 2): Software Architecture Taxonomy

In order to discuss software design in context of database fail-over I will present a taxonomy of software architectures that manage data. This is the basis for discussing all possible database fail-over cases and their specific issues. Let’s go.


In the previous blog the ideal database software architecture (“Happy Path”; []) was discussed; why is this not sufficient to cover all types of software architectures? The reason lies in two main aspects:

  • Asynchronous replication mode. Ideally, replication from a primary database to a secondary database is synchronous and complete, meaning, the primary and secondary database are an exact replica of each other at every point in time (aka, at every transaction commit). However, this is not always possible due to network latency, network failures, server outages, etc. It is more often than not the case that the secondary database is lagging behind the primary database (if only for a few transactions) and is missing transactions when a failure occurs in such a situation. In case of a fail-over in such a situation applications will not see committed data in the secondary database that they have already seen committed in the primary database (a data loss occurred – or better: transaction loss).
  • Non-transactional resource management. Many applications manage data outside the scope of transactions. In case of failures these non-transactional data can be out of sync with the transactional data (e.g., primary database). And in case of data loss during a fail-over the non-transactional data might be inconsistent with the state of the secondary database also. Any replication mechanism between databases will not include non-transactional data that is managed outside the database.

A background discussion on resource managers and transactions follows now as preparation for the software architecture taxonomy discussion.

Background: Resource Manager

A resource manager in context of database software architectures is a component that manages state (data). In general, this is data creation, update, delete and read. For example, a database is a resource manager, and so is a file system. Resource managers are

  • Transactional or non-transactional. Transactional resource managers are able to participate in a transaction protocol (e.g., local transactions, distributed transactions, autonomous transactions or chained transactions, to name a few).
  • Persistent or non-persistent. For example, a database is (usually) persistent, a software cache is (usually) non-persistent.
  • Non-persistent rebuildable or non-persistent not rebuildable. A non-persistent resource manager might be able to rebuild data from persistently stored data.

For example, the file system on Linux is non-transactional, Oracle’s DBFS is transactional [].

Resource Managers Are Everywhere

Resource managers are everywhere in a software architecture in general. Not only databases or file systems are resource managers, but also queuing systems, software caches, user interfaces, Java JVM caches, or Java class variables. Everything that manages state is to be considered a resource manager in context of the database fail-over discussion.

Most resource managers are non-transactional (they cannot participate in a transaction protocol), and many are non-persistent. An argument can be made that all resource managers could/should participate in transactions to avoid inconsistencies anywhere in the software stack (transactional memory would be an interesting tool in this regard). That would be ideal as any data state change would be consistent across all resource managers.

Resource Managers and Transactions

However, in reality, this is not the case: most resource managers are non-transactional and consequently the data that they manage are not under transaction control. In reality, access to resource managers happens within and outside transactions and not all resource managers can participate in a transaction protocol.

Worse, the same data item might be transactionally committed, and additionally stored in a non-transactional resource manager as a duplicate. In non-failure cases this works out fine, but sometimes, when failures occur, only one or the other update happens, leaving an inconsistent system state behind, as the data item is only in one location and not duplicated in others as expected. If an assumption is made in the software architecture that the data in all resource managers is always consistent, then this situation might cause significant system degradation or errors when failures occur. In the worst case inconsistent application state is left behind.

The same situation occurs if the application writes two data items to be consistent, on through a transactional resource manager, and one through a non-transactional resource manager. If a failure occurs after the first update, the second will not happen and the state is inconsistent.

System Semantics: Consistency

For the purposes of this blog the system semantics wrt. state consistency can be reduced to simply two cases:

  • Consistent system: The overall system is consistent when all data in resource managers are consistent with each other (or can be made consistent after a failure before resuming processing)
  • Inconsistent system: Otherwise.

The following system consistency guarantees can be given.

  • Consistent
    • Only transactional and persistent resource managers are used and all changes happen in context of a transaction protocol. This is the Happy Path case from the previous blog as the complete state is consistent at all times.
    • If additionally non-persistent and rebuildable resource managers are used, a consistent state can be re-created by removing all data from the non-persistent and rebuildable resource managers.
  • Possibly consistent
    • Transactional and non-transactional persistent resource managers are used. This might result in consistent state changes, or might not, depending on the timing of the failure. Because of the persistent nature it might be non-trivial to clean-up persistent inconsistent state and possibly complex compensation functions have to be developed (if at all possible) to clean up the inconsistent state.
    • Non-persistent and non-rebuildable resource managers are used in conjunction with persistent resource managers. Like before, this might be non-trivial to clean-up and an inconsistent state might be left behind.

If several transactional resource managers are used to change state consistently, but cannot be invoked in context of a single transaction in one transaction protocol, then any failure might leave an inconsistent state behind. This must be called out explicitly here in order to avoid the assumption that transactional means consistent in itself.


The software architecture taxonomy relevant for database fail-over can be built based on the combinations of resource manager types used. In the following the various combinations are discussion on a high level (X means that the software architecture uses one or more of the indicated resource manager types).

Software Architecture Transactional Persistent Non-transactional Persistent Non-transactional and Non-persistent and rebuildable Non-transactional and Non-persistent and Non-rebuildable
Consistent x
Consistent x x
Possibly consistent x
Possibly consistent x x
Possibly consistent x x
Possibly consistent x x x
Possibly consistent x x x x

Each of these will be discussed separately in an upcoming blog.


In terms of database fail-over, only those software architectures can fail-over without any state inconsistency that can maintain a consistent state across all data or can rebuild a consistent state after a failure. The reason for this is that only consistent state is replicated and any state outside the database is consistent also.


The rationalization in this blog shows that software applications will have to deal with inconsistent data states once non-transactional resource managers are used to manage data. In context of fail-over this becomes an even bigger problem and the next blogs will discuss strategies to deal with data state inconsistencies in software architectures.

This was a high-level discussion only and future blogs will discuss each software architecture in more detail to shine a light on the important and interesting details in case of database fail-over.



The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.