Software Design in Context of Database Failover (Part 1): The Happy Path

Oracle Active Data Guard provides a replication and fail-over mechanism between two or more Oracle databases for high availability and disaster recovery. How does (application) software have to be designed in this context?

Oracle Active Data Guard

In a nutshell, Oracle Active Data Guard [http://www.oracle.com/us/products/database/options/active-data-guard/overview/index.html] provides synchronization between a primary database and one or more secondary databases. It continuously replicates the primary database to all secondary databases based on in-memory redo logs and their transmission. There are several protection modes, which are variations of synchronous or asynchronous replication between the primary database and the secondary databases. Secondary databases can be in the same data center (high availability) or different data centers (disaster recovery) and the specific deployment topology depends on the degree of availability and protection the organization implements.

Because the primary database is replicated to secondary databases, clients accessing the primary database can fail-over to one of the secondary databases if the primary database becomes unavailable. Clients therefore switch over from the primary database to the secondary database for processing. Initiating the fail-over process can be manually or automatically triggered, depending on the configuration chosen. In case of synchronous replication the secondary databases are in lockstep with the primary database and contain the exact same state. In case of asynchronous replication a minor delay might be possible between the primary database and its secondary databases and the secondary databases might lag a few transactions behind; their state is therefore not the same but consistent as of a earlier time.

Synchronous Replication

Active Data Guards’ synchronous replication ensures that the secondary databases are an exact replicas of the primary database. In terms of transaction execution this means that a transaction is committed successfully on the primary database only if it is committed on all secondary databases as well. The benefit of exact replication (aka, no data loss) comes with the cost of transaction execution time as a transaction on the primary database only succeeds after the secondary databases have executed the same transaction. The primary database is therefore dependent on the execution behavior of the secondary databases.

Fail-over Scenario

In the following a partial site failure is assumed. This means that the primary database becomes unavailable, however, the applications that are accessing the primary database continue to be available. A (complete) site failure occurs if all systems, primary database and all applications, are unavailable.

The following shows the various components (a primary database, a secondary database, and one application). The solid arcs represent the call relationships before a fail-over to the secondary database, the dashed arcs represent the call relationships after a fail-over to the secondary database.

failover

Software Design for “Happy Path”

In this blog the “Happy Path” is assumed. This means that all database processing that the application performs is done in individual complete OLTP transactions that always commit consistent transactions. After each transaction the database is in a consistent state.

In the case of complete OLTP transactions it is assumed that no data is cached in application layer caches or stored in other non-transactional resource managers, like file systems or queuing systems. This style of OLTP transaction processing application serves as a design baseline for this blog and the future blogs in this series.

For the complete OLTP transaction applications no specific software design rules apply for the fail-over scenario outlined above (except for not using any other non-transactionl resource managers like caches, queues or file systems). Since all transactions are committing consistent data, and since the replication mode between primary database and secondary database is synchronous, the application will find a consistent and complete database state in the secondary database after a fail-over.

Furthermore, Oracle Active Data Guard provides driver-level functionality that allows an application to fail-over without having to implement application level code that accomplishes the fail-over; instead, the drivers are able to fail-over without the application having to be aware of it.

Summary

For complete OLTP processing no specific software design approaches or software design rules have to be taken into consideration when using the synchronous replication mode in context of Oracle Active Data Guard.

However, the “Happy Path” is not the only possible application type and more complex application software systems in conjunction with other Oracle Active Data Guard replication modes will be discussed here in forthcoming blogs, especially when non-transactional resource managers are involved.

Go SQL!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.