Category
5 min read

High Availability in SQL: A Guide to High Availability Databases

We want our applications to never fail. To achieve that, we need to guarantee that the data storage is always on. How to do that with SQL databases? How to make sure that the engine never fails? And how to scale it to get better performance?
Published on
May 9, 2023
Share this post
Contributors
Metis Team
Adam Furmanek
Dev Rel
Start using Metis database guardrail for free. Make your database 3x faster and 50% cheaper!

Introduction

High Availability (HA) means that the database is available always - 24 hours a day, 7 days a week. No matter if we plan the maintenance, no matter what the load is, no matter how many clients we have - the database must be on. SQL High availability (HA) consists of specific technologies, custom solutions, and processes around that.

One might find it surprising. How are we going to take the system down for maintenance and yet keep it running 24/7? There is no magic here, if the system needs to go down, then we need to have some other system that is up and running. Basically, we need to have multiple copies of the database turned on, so when one of the copies goes down, then some other copy can take the load.

However, things may get much harder when we scale things out. We would like to have the data storage as close to the users as possible. We’d like to have the database in the US for the US-based users, and the same for Europe or Australia. If we regionalize our storage like this, then we may face issues when a single entity is modified between continents. We somehow need to decide how to resolve conflicts.

Let’s see how to achieve all of that.

Glossary

Before moving on, let us define some terms.

  • Master node
    Primary node that controls the communication and servers as a source of truth. Also called primary, source, sender, coordinator.
  • Replica
    Secondary node that follows the master but doesn’t decide on anything. Also called secondary, follower, slave, receiver.
  • Failover
    Process in which a new master node is selected.
  • WAL (Write-ahead Logging, WAL)
    Technique in which we first log the changes to the log file, and then we apply them to the real data.
  • Read-only replica
    Node which allows to execute read-only queries and cannot modify the data.
  • Synchronous operation
    Operation that reports its status (success or failure) only after it completes at all nodes.
  • Asynchronous operation
    Operation that returns before all internal tasks are complete.

Techniques

HA can be guaranteed with multiple approaches, like:

  • WAL replication
  • Data storage share
  • 2-Phase Commit (2PC)
  • 3-Phase Commit (3PC)

We’ll cover some of them in this blog post.

WAL replication

WAL replication technique is based on the idea that we can send the Write-ahead Log from one server  to another. This way the sender can decide if a given transaction should be executed, next the WAL is sent to the receiver, and then the receiver can apply logically the same changes to the data storage.

There are multiple ways to replicate WAL. We can stream the logs over a TCP/IP connection. In this approach logs are sent as they appear, which minimizes the latency between applying changes in the primary and the secondary nodes. This requires a direct network connection between nodes, and may get unreliable if the connection breaks.

Another way is to use the so-called log shipping. In this approach log files are sent from the primary to the secondary periodically over any suitable channel. This could be SCP, FTP, shared drive, cloud blob storage, etc. This introduces higher latency, but allows the replica to catch up over a longer period of time.

Another approach is to mirror the data storage to log files consistent. This allows to delegate the log shipping to some other infrastructure, so the database can just “assume” files are up to date. This can be achieved via mirroring or sharing a network storage.

It’s possible to use all these approaches at the same time. This way we can minimize the latency (thanks to log streaming), and provide reliably in case of failures (thanks to log shipping). If we can use only one way, then log shipping provides higher reliability in general.

Depending on the database engine we use, there may be many more flavors of the WAL replication. They can use NAS, shared drives, network connections, middleware agents, etc. For example, see PostgreSQL documentation.

By default the replication works in an asynchronous mode which means that the transaction is committed by the primary server, and then the logs are sent to the replica. WAL replication can be also configured in a synchronous mode in which the log is sent to the replicas, and the transaction is confirmed only after the replicas receive the changes.

Data storage share

Data storage share approach works by sharing the same data storage between database instances. This can be as simple as just connecting to the same storage physically (over a wire or a drive interface), or much more complex with shared DNS name, IP address, operating system, etc.

Not all databases support this approach. Some engines need additional configuration on the operating system level infrastructure, for instance Windows Server Failover Cluster.

This approach is highly configurable and allows for easy and fast failover. However, it may require that only one host is a master, and all other hosts are read-only replicas.

2-Phase Commit and 3-Phase Commit

2-Phase Commit (2PC) and 3-Phase Commit (3PC) are protocols for running distributed transactions. They work on the idea that changes to the database are committed atomically across all databases. General idea is:

  • There is a coordinator that knows what changes to introduce
  • The coordinator opens up a transaction on each of the nodes
  • The coordinator sends the changes
  • Each node prepares the changes and answers to the coordinator whether the transaction can be committed or not
  • The coordinator makes the final decision - either to commit the transaction or roll it back

This protocol ensures that the changes are applied atomically to all the databases. 3PC changes in the way how the catastrophic outage is handled to improve the reliability.

However, both 2PC and 3PC introduce additional roundtrips over the wire that may reduce the performance significantly. However, they provide great features of ACID across distributed databases.

2PC can be used in PostgreSQL with the Prepare Transaction statement. The typical flow is:

BEGIN;
-- Do the things
PREPARE TRANSACTION 't1';
-- Decide if commit or rollback
COMMIT PREPARED 't1' || ROLLBACK PREPARED 't1'

How can Metis help?

Metis provides an observability dashboard that can track and follow your performance. It monitors your configuration, extensions, active queries, indexes, tables, daemon processes and much more.

Metis database observability dashboard

Metis can keep your databases always on and notify you when issues emerge. You can use it to validate your configuration and observe it over time to make sure that things work as expected.

Summary

Data storage is the most crucial component of our systems. We can’t let it go down, we need to make sure it works as expected and is always available. There are multiple ways to guarantee the High Availability, based on log shipping, data sharing, or even distributing transactions. Metis provides the ultimate monitoring solution to make sure your databases are always on. Sign up for free to test Metis now or book a demo with one of our reps for a walkthrough.

This is some text inside of a div block. This is some text inside of a div block. This is some text inside of a div block. This is some text inside of a div block. This is some text inside of a div block.

Ready to take your database to the next level?

Start using Metis and get your database observability set up in minutes