Software Design in Context of Database Failover (Part 4): Architecture on Transactional Persistent Resource Managers in Conjunction with Caches

If transactional resource managers are available, then all is good (see last blog). But what if this is not the case? This blog starts discussing non-transactional resource managers.

Non-Transactional Resource Managers

There are different types of non-transactional resource managers (see the taxonomy in the appendix of this blog):

  • Non-transactional and persistent. Data is stored on persistent storage, but outside transaction protection.
  • Non-transactional, non-persistent and non-rebuildable. Data is managed in volatile memory, outside transaction protection, and if lost it cannot be rebuilt (it will be truly lost).
  • Non-transactional, non-persistent and rebuildable. Data is managed in volatile memory, outside transaction protection, but can be rebuilt if lost or intentionally deleted.

A non-persistent resource manager is usually called a cache. An example of a cache is a list of values in a Java singleton caching a subset of the data of a database table column. Another example is the user interface that caches a list of entries like sales opportunities and displays them to a user.

Properly implemented caches can be invalidated (that is, emptied) and rebuilt from data stored in one or several database tables, because they are non-functional components put in place for performance reasons. Caches that can be invalidated are called “rebuildable caches”.
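As a minimal sketch of the singleton example mentioned earlier in rebuildable form: the name CountryCodeCache and the Supplier standing in for a database query are assumptions made for illustration, not part of any particular system.

```java
import java.util.Collection;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import java.util.function.Supplier;

// Hypothetical singleton caching the values of one database table column.
enum CountryCodeCache {
    INSTANCE;

    private final Set<String> values = new CopyOnWriteArraySet<>();

    // Empties the cache. Nothing is lost, because the database holds the original data.
    void invalidate() {
        values.clear();
    }

    // Refills the cache; the supplier stands in for a query against the database.
    // Note: clear-then-fill is not atomic, readers may briefly see a partial set.
    void rebuild(Supplier<Collection<String>> columnQuery) {
        values.clear();
        values.addAll(columnQuery.get());
    }

    boolean contains(String value) {
        return values.contains(value);
    }

    int size() {
        return values.size();
    }
}
```

The point of the sketch is only that invalidate() can be called at any time without losing data, since rebuild() can always restore the contents from the database.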

Caches that cannot be invalidated without losing data are called “non-rebuildable caches”. They should not be part of the software architecture and have to be avoided, as lost data cannot be recovered. The impact is clear: you don’t want your checking account information to be managed in such a cache.

In the following, caches are discussed; the third type of resource manager, non-transactional and persistent, will be discussed in the next blog.

Rebuildable Cache and Database Fail-over

A rebuildable cache contains copies of data from the database. If a cache is accessed and it contains the requested data (“cache hit”), then the data is returned to the caller. If the cache does not contain the requested data (“cache miss”), the cache first fetches the data from the database and stores it before passing it on to the caller. In summary, the cache either has the data or knows how to get it from the database.
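The hit and miss behavior can be sketched as a small read-through cache. This is an illustration under assumptions: ReadThroughCache is a hypothetical class, and the loader function stands in for whatever database query the application would really issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-through cache; databaseLoader stands in for a database query.
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> databaseLoader;

    ReadThroughCache(Function<K, V> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    // Cache hit: the stored copy is returned.
    // Cache miss: the value is loaded from the database, stored, then returned.
    // If the loader returns null (no such row), nothing is stored.
    V get(K key) {
        return entries.computeIfAbsent(key, databaseLoader);
    }

    // Empties the cache; later reads go back to the database.
    void invalidateAll() {
        entries.clear();
    }

    int size() {
        return entries.size();
    }
}

class ReadThroughDemo {
    public static void main(String[] args) {
        // An in-memory map plays the role of the database table.
        Map<String, String> database = new ConcurrentHashMap<>();
        database.put("opp-1", "Acme renewal");

        ReadThroughCache<String, String> cache = new ReadThroughCache<>(database::get);
        System.out.println(cache.get("opp-1")); // cache miss, loaded from the database
        System.out.println(cache.get("opp-1")); // cache hit, served from memory
    }
}
```

Both calls print the same value; only the first one reaches the stand-in database.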

In the context of a database fail-over it is possible that the primary and secondary database have exactly the same data set at the time of the fail-over [see blog https://realprogrammer.wordpress.com/2015/08/31/software-design-in-context-of-database-failover-part-2-software-architecture-taxonomy/]. In this case the cache contents remain consistent with the database state after a fail-over, as all data in the cache has a corresponding data item in the database.

However, data loss can occur if the secondary database is lagging behind the primary database at the time of the fail-over. In this case some data might no longer be available after the fail-over, because it was not replicated before the fail-over occurred.

A cache holding data that was available before the fail-over, but lost during the fail-over, is therefore inconsistent with the database after the fail-over. A cache hit might occur even though the cached data is no longer in the database. This would be incorrect behavior, since the premise of a cache is that it caches data consistent with the database state.

The solution for the fail-over case is to invalidate all cache entries (in all caches) right after the database fail-over has taken place and before clients continue processing. Since each cache starts empty, many accesses will initially be cache misses until the caches have built up the working set again. However, the caches will be consistent with the database state and the application system accesses consistent data.
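One possible way to organize this is a central registry that knows every cache and empties all of them when a fail-over is signaled. The sketch below rests on assumptions: RebuildableCache, CacheRegistry, SimpleCache and onDatabaseFailover are hypothetical names, and how the application learns about the fail-over is left open. Removing a row from an in-memory map simulates a secondary database that lagged behind.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

// Hypothetical contract that every cache in the architecture implements.
interface RebuildableCache {
    void invalidateAll();
}

// Hypothetical central registry, so that all caches are known in one place.
class CacheRegistry {
    private final List<RebuildableCache> caches = new CopyOnWriteArrayList<>();

    void register(RebuildableCache cache) {
        caches.add(cache);
    }

    // Intended to run once after the fail-over completed and before clients resume.
    void onDatabaseFailover() {
        for (RebuildableCache cache : caches) {
            cache.invalidateAll();
        }
    }
}

// Simple cache whose loader stands in for a query against the active database.
class SimpleCache implements RebuildableCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    private final Function<String, String> loader;

    SimpleCache(Function<String, String> loader) {
        this.loader = loader;
    }

    String get(String key) {
        String cached = entries.get(key);
        if (cached != null) {
            return cached;                 // cache hit
        }
        String loaded = loader.apply(key); // cache miss
        if (loaded != null) {
            entries.put(key, loaded);
        }
        return loaded;
    }

    @Override
    public void invalidateAll() {
        entries.clear();
    }
}

class FailoverInvalidationDemo {
    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("acct-2", "50");

        SimpleCache cache = new SimpleCache(database::get);
        CacheRegistry registry = new CacheRegistry();
        registry.register(cache);

        cache.get("acct-2");        // the entry is now cached
        database.remove("acct-2");  // simulates a row lost in the fail-over

        System.out.println(cache.get("acct-2")); // still 50: inconsistent cache hit
        registry.onDatabaseFailover();
        System.out.println(cache.get("acct-2")); // null: consistent with the database
    }
}
```

A registry of this kind also gives the architecture a single place where all caches are listed, which makes it easier to check that none was forgotten.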

From a software architecture viewpoint it is very important to know all implemented caches and to ensure that each one is a rebuildable cache.

Non-rebuildable Cache and Database Fail-over

Even if not desirable, non-rebuildable caches might exist, in which case the cache invalidation functionality discussed above is not available. If the cache cannot be invalidated, it is impossible to remove its contents after a database fail-over. Therefore the cache might return data to the client that is no longer present in the database or is not consistent with the data in the database. The client would have to be aware of this possibility and be able to deal with such an inconsistency.

A special case needs to be discussed. It is possible that a non-rebuildable cache stores data from the database and is only missing the externally available functionality of cache invalidation. So in principle it could be rebuilt, but the rebuild cannot be triggered externally due to a missing programming interface.

A possible approach in this case is to implement a workaround based on the assumption that the cache has limited capacity and that there is a lot more data in the database than the cache can hold. A brute-force attempt to make the cache consistent again would be to implement a client that, after the fail-over, requests every data item that is in the database from the cache. At some point the cache will be full, and as a consequence it has to evict entries. As this happens, inconsistent entries are removed to make room for those that the client requested and that caused a cache miss. Once all data has been requested, the cache is consistent again.
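The brute-force workaround can be sketched as follows. The sketch makes several assumptions that the text does not state: the cache evicts in least-recently-used order (modeled here with a LinkedHashMap in access order), the class names OpaqueLruCache and BruteForceFlush are hypothetical, and the lost entry belongs to a key that no longer exists in the database. An entry whose key still exists but whose value is outdated would be served as a cache hit during the sweep, so a single pass is not guaranteed to replace it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache without an invalidation interface: callers can only read.
class OpaqueLruCache {
    private final Function<String, String> loader;
    private final Map<String, String> entries;

    OpaqueLruCache(int capacity, Function<String, String> loader) {
        this.loader = loader;
        // accessOrder = true makes the map evict the least recently used entry.
        this.entries = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
    }

    synchronized String get(String key) {
        String cached = entries.get(key);
        if (cached != null) {
            return cached;                 // cache hit
        }
        String loaded = loader.apply(key); // cache miss: ask the database
        if (loaded != null) {
            entries.put(key, loaded);      // may evict the eldest entry
        }
        return loaded;
    }
}

class BruteForceFlush {
    // Requests every key that exists in the database after the fail-over,
    // so that evictions push out entries the database no longer has.
    static void sweep(OpaqueLruCache cache, Iterable<String> allDatabaseKeys) {
        for (String key : allDatabaseKeys) {
            cache.get(key);
        }
    }
}

class BruteForceFlushDemo {
    public static void main(String[] args) {
        Map<String, String> database = new LinkedHashMap<>();
        for (int i = 1; i <= 5; i++) {
            database.put("k" + i, "v" + i);
        }
        database.put("lost", "x");

        OpaqueLruCache cache = new OpaqueLruCache(2, database::get);
        cache.get("lost");        // cached before the fail-over
        database.remove("lost");  // simulates a row lost in the fail-over

        BruteForceFlush.sweep(cache, database.keySet());
        System.out.println(cache.get("lost")); // null: the stale entry was evicted
    }
}
```

The cost of the workaround is visible in the sketch: every row in the database is read once through the cache, which can be expensive for large data sets.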

Despite workarounds in special cases, there is no general approach that can address non-rebuildable caches. They have to be avoided if transactional consistency is important as consistency with these caches cannot be provided.

Summary

If caches are used in the software architecture, the best form is rebuildable caches that can recreate their state from persistent storage (that is, the database), so that a consistent data state can be recreated after a lossy database fail-over. They must have external interfaces so that an invalidation can be triggered right after the database fail-over has completed.

Non-rebuildable caches might be managed in special cases with a workaround after a fail-over. However, as no general mechanism exists, they should be avoided altogether. Avoiding non-rebuildable caches is a safe architectural approach.

The next blog will discuss the third category of resource managers: non-transactional and persistent resource managers.

Go SQL!

Appendix: Taxonomy

The software architecture taxonomy relevant for database fail-over can be built based on the combinations of resource manager types used. In the following the various combinations are discussed at a high level (“x” means that the software architecture uses one or more of the indicated resource manager types).

Software Architecture | Transactional Persistent | Non-transactional Persistent | Non-transactional and Non-persistent and Rebuildable | Non-transactional and Non-persistent and Non-rebuildable
Consistent x
Consistent x x
Possibly consistent x x
Possibly consistent x x x
Possibly consistent x x x x

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.
