Chapter 8. Database servers
8.1. Introduction to database servers
A database server is a hardware device which has a certain amount of main memory, and a database (DB) application installed. This DB application provides services as a means of writing the cached data from the main memory, which is usually small and expensive, to DB files (database). These services are provided to multiple clients on a network. There can be as many DB servers as a machine’s main memory and storage allows.
Red Hat Enterprise Linux 8 provides the following database applications:
- MariaDB 10.3
- MySQL 8.0
- PostgreSQL 10
- PostgreSQL 9.6
- PostgreSQL 12 - available since RHEL 8.1.1
8.2. Using MariaDB
8.2.1. Getting started with MariaDB
The MariaDB server is an open source fast and robust database server that is based on MySQL technology.
MariaDB is a relational database which converts data into structured information and provides an SQL interface for accessing data. It includes multiple storage engines and plug-ins, as well as geographic information system (GIS) and JavaScript Object Notation (JSON) features.
This section describes how to install the MariaDB server in Installing MariaDB or how to migrate from the Red Hat Enterprise Linux 7 default version, MariaDB 5.5, to the Red Hat Enterprise Linux 8 default version, MariaDB 10.3, in Migrating to MariaDB 10.3, and also how to back up MariaDB data. Performing data backup is one of the prerequisites for MariaDB migration.
8.2.2. Installing MariaDB
To install MariaDB, follow this procedure:
Ensure that all necessary packages for MariaDB server are available on the system by installing the
mariadb
module using a specific stream:# yum module install mariadb:10.3/server
Start the
mariadb
service:# systemctl start mariadb.service
Enable the
mariadb
service to start at boot:# systemctl enable mariadb.service
Note that the MariaDB and MySQL database servers cannot be installed in parallel in Red Hat Enterprise Linux 8.0 due to conflicting RPM packages. Parallel installation of components is possible in Red Hat Software Collections for Red Hat Enterprise Linux 6 and Red Hat Enterprise Linux 7. In Red Hat Enterprise Linux 8, different versions of database servers can be used in containers.
8.2.2.1. Improving MariaDB installation security
To improve security when installing MariaDB, run the following command.
The command launches a fully interactive script, which prompts for each step in the process.
# mysql_secure_installation
The script enables to improve security in the following ways:
- Setting a password for root accounts
- Removing anonymous users
- Disallowing remote (outside the local host) root logins
8.2.3. Configuring MariaDB
8.2.3.1. Configuring the MariaDB server for networking
To configure the MariaDB server for networking, use the [mysqld]
section of the /etc/my.cnf.d/mariadb-server.cnf
file, where you can set the following configuration directives:
bind-address
Bind-address is the address on which the server will listen.
Possible options are: a host name, an IPv4 address, or an IPv6 address.
skip-networking
Possible values are:
0 - to listen for all clients
1 - to listen for local clients only
port
The port on which MariaDB listens for TCP/IP connections.
8.2.4. Backing up MariaDB data
There are two main ways to back up data from a MariaDB database:
- Logical backup
- Physical backup
Logical backup consists of the SQL statements necessary to restore the data. This type of backup exports information and records in plain text files.
The main advantage of logical backup over physical backup is portability and flexibility. The data can be restored on other hardware configurations, MariaDB versions or Database Management System (DBMS), which is not possible with physical backups.
Note that logical backup can be performed if the mariadb.service
is running. Logical backup does not include log and configuration files.
Physical backup consists of copies of files and directories that store the content.
Physical backup has the following advantages compared to logical backup:
- Output is more compact.
- Backup is smaller in size.
- Backup and restore are faster.
- Backup includes log and configuration files.
Note that physical backup must be performed when the mariadb.service
is not running or all tables in the database are locked to prevent changes during the backup.
You can use one of the following MariaDB backup approaches to back up data from a MariaDB database:
- Logical backup with mysqldump
- Physical online backup using the Mariabackup tool
- File system backup
- Replication as a backup solution
8.2.4.1. Performing logical backup with mysqldump
The mysqldump client is a backup utility, which can can be used to dump a database or a collection of databases for the purpose of a backup or transfer to another database server. The output of mysqldump typically consists of SQL statements to re-create the server table structure, populate it with data, or both. Alternatively, mysqldump can also generate files in other formats, including CSV or other delimited text formats, and XML.
To perform the mysqldump backup, you can use one of the following options:
- Back up a selected database
- Back up a subset of tables from one database
- Back up multiple databases
- Back up all databases
8.2.4.1.1. Backing up an entire database with mysqldump
Procedure
To back up an entire database, run:
# mysqldump [options] db_name > backup-file.sql
8.2.4.1.2. Using mysqldump to back up a set of tables from one database
Procedure
To back up a subset of tables from one database, add a list of the chosen tables at the end of the
mysqldump
command:# mysqldump [options] db_name [tbl_name …]
8.2.4.1.3. Using mysqldump to load the dump file back into a server
Procedure
To load the dump file back into a server, use either of these:
# mysql db_name < backup-file.sql
# mysql -e "source /path-to-backup/backup-file.sql" db_name
8.2.4.1.4. Using mysqldump to copy data between two databases
Procedure
To populate databases by copying data from one MariaDB server to another, run:
# mysqldump --opt db_name | mysql --host=remote_host -C db_name
8.2.4.1.5. Dumping multiple databases with mysqldump
Procedure
To dump multiple databases at once, run:
# mysqldump [options] --databases db_name1 [db_name2 …] > my_databases.sql
8.2.4.1.6. Dumping all databases with mysqldump
Procedure
To dump all databases, run:
# mysqldump [options] --all-databases > all_databases.sql
8.2.4.1.7. Reviewing mysqldump options
Procedure
To see a list of the options that mysqldump supports, run:
$ mysqldump --help
8.2.4.1.8. Additional resources
For more information on logical backup with mysqldump, see the MariaDB Documentation.
8.2.4.2. Performing physical online backup using the Mariabackup tool
Mariabackup is a tool based on the Percona XtraBackup technology, which enables performing physical online backups of InnoDB, Aria, and MyISAM tables.
Mariabackup, provided by the mariadb-backup
package from the AppStream repository, supports full backup capability for MariaDB server, which includes encrypted and compressed data.
Prerequisites
The
mariadb-backup
package is installed on the system:# yum install mariadb-backup
Mariabackup needs to be provided with credentials for the user by which the backup will be run. You can either provide the credentials on the command line, as shown in the procedure, or by a configuration file before applying the procedure. To set the credentials using the configuration file, first create the file (for example,
/etc/my.cnf.d/mariabackup.cnf
), and than add the following lines into the[xtrabackup]
or[mysqld]
section of the new file:[xtrabackup] user=myuser password=mypassword
ImportantMariabackup does not read options in the
[mariadb]
section of configuration files. If a non-default data directory is specified on a MariaDB server, you must specify this directory in the[xtrabackup]
or[mysqld]
sections of configuration files, so that Mariabackup is able to find the data directory.To specify such a data directory, include the following line in the
[xtrabackup]
or[mysqld]
sections of MariaDB configuration files:datadir=/var/mycustomdatadir
NoteUsers of Mariabackup must have the
RELOAD
,LOCK TABLES
, andREPLICATION CLIENT
privileges to be able to work with the backup.
To create a backup of a database using Mariabackup, use the following procedure:
Procedure
Run the following command:
$ mariabackup --backup --target-dir <backup_directory> --user <backup_user> --password <backup_passwd>
The
target-dir
option defines the directory where the backup files are stored. If you want to perform a full backup, the target directory must be empty or not exist.The
user
andpassword
options allow you to configure the user name and the password. Do not use these options if you already configured the user name and the password in the configuration file as described in prerequisites.
Additional resources
For more information on performing backups with Mariabackup, see Full Backup and Restore with Mariabackup.
8.2.4.3. Restoring data using the Mariabackup tool
When the backup is complete, you can restore the data from the backup by using the mariabackup
command with one of the following options:
-
--copy-back
-
--move-back
The --copy-back
option allows you to keep the original backup files. The --move-back
option moves the backup files to the data directory, and removes the original backup files.
Prerequisites
Make sure that the
mariadb
service is not running:# systemctl stop mariadb.service
- Make sure that the data directory is empty.
8.2.4.3.1. Restoring data with Mariabackup while keeping the backup files
To restore the data while keeping the original backup files, use the following procedure.
Procedure
Run the
mariabackup
command with the--copy-back
option:$ mariabackup --copy-back --target-dir=/var/mariadb/backup/
Fix the file permissions.
When restoring a database, Mariabackup preserves the file and directory privileges of the backup. However, Mariabackup writes the files to disk as the user and group restoring the database. Consequently, after restoring a backup, you may need to adjust the owner of the data directory to match the user and group for the MariaDB Server, typically
mysql
for both.For example, to recursively change ownership of the files to the
mysql
user and group:# chown -R mysql:mysql /var/lib/mysql/
Start the
mariadb
service:# systemctl start mariadb.service
8.2.4.3.2. Restoring data with Mariabackup while removing the backup files
To restore the data, and not keep the original backup files, use the following procedure.
Procedure
Run the
mariabackup
command with the--move-back
option:$ mariabackup --move-back --target-dir=/var/mariadb/backup/
Fix the file permissions.
When restoring a database, Mariabackup preserves the file and directory privileges of the backup. However, Mariabackup writes the files to disk as the user and group restoring the database. Consequently, after restoring a backup, you may need to adjust the owner of the data directory to match the user and group for the MariaDB Server, typically
mysql
for both.For example, to recursively change ownership of the files to the
mysql
user and group:# chown -R mysql:mysql /var/lib/mysql/
Start the
mariadb
service:# systemctl start mariadb.service
8.2.4.3.3. Additional resources
For more information see Full Backup and Restore with Mariabackup.
8.2.4.4. Performing file system backup
To create a file system backup of MariaDB data files, switch to the root
user, and copy the content of the MariaDB data directory to your backup location.
To back up also your current configuration or the log files, use the optional steps of the following procedure.
Procedure
Stop the
mariadb
service:# systemctl stop mariadb.service
Copy the data files to the required location:
# cp -r /var/lib/mysql /backup-location
Optionally, copy the configuration files to the required location:
# cp -r /etc/my.cnf /etc/my.cnf.d /backup-location/configuration
Optionally, copy the log files to the required location:
# cp /var/log/mariadb/* /backup-location/logs
Start the
mariadb
service:# systemctl start mariadb.service
8.2.4.5. Introduction to replication as a backup solution
Replication is an alternative backup solution for master servers. If a master server replicates to a slave server, backups can be run on the slave without any impact on the master. Master can still run while you shut down the slave, and back the data up from him.
Replication itself is not a sufficient backup solution. Replication protects master servers against hardware failures, but it does not ensure protection against data loss. It is recommended that you use any other backup solution on the replication slave together with this method.
Additional resources
For more information on replication as a backup solution, see MariaDB Documentation.
8.2.5. Migrating to MariaDB 10.3
Red Hat Enterprise Linux 7 contains MariaDB 5.5 as the default implementation of a server from the MySQL databases family. Later versions of the MariaDB database server are available as Software Collections for Red Hat Enterprise Linux 6 and Red Hat Enterprise Linux 7. Red Hat Enterprise Linux 8 provides MariaDB 10.3 and MySQL 8.0.
8.2.5.1. Notable differences between the RHEL 7 and RHEL 8 versions of MariaDB
The most important changes between MariaDB 5.5 and MariaDB 10.3 are as follows:
- MariaDB Galera Cluster, a synchronous multi-master cluster, is a standard part of MariaDB since 10.1.
- The ARCHIVE storage engine is no longer enabled by default, and the plug-in needs to be specifically enabled.
- The BLACKHOLE storage engine is no longer enabled by default, and the plug-in needs to be specifically enabled.
InnoDB is used as the default storage engine instead of XtraDB, which was used in MariaDB 10.1 and earlier versions.
For more details, see Why does MariaDB 10.2 use InnoDB instead of XtraDB?.
-
The new
mariadb-connector-c
packages provide a common client library for MySQL and MariaDB. This library is usable with any version of the MySQL and MariaDB database servers. As a result, the user is able to connect one build of an application to any of the MySQL and MariaDB servers distributed with Red Hat Enterprise Linux 8.
To migrate from MariaDB 5.5 to MariaDB 10.3, you need to perform multiple configuration changes as described in Section 8.2.5.2, “Configuration changes”.
8.2.5.2. Configuration changes
The recommended migration path from MariaDB 5.5 to MariaDB 10.3 is to upgrade to MariaDB 10.0 first, and then upgrade by one version successively.
The main advantage of upgrading one by one version is better adaptation of the database, including both data and configuration to the changes. The upgrade ends on the same major version as is available in RHEL 8 (MariaDB 10.3), which significantly reduces configuration changes or other issues.
For more information about configuration changes when migrating from MariaDB 5.5 to MariaDB 10.0, see Migrating to MariaDB 10.0 in Red Hat Software Collections documentation.
The migration to following successive versions of MariaDB and the required configuration changes is described in these documents:
- Migrating to MariaDB 10.1 in Red Hat Software Collections documentation.
- Migrating to MariaDB 10.2 in Red Hat Software Collections documentation.
- Migrating to MariaDB 10.3 in Red Hat Software Collections documentation.
Migration directly from MariaDB 5.5 to MariaDB 10.3 is also possible, but you must perform all configuration changes that are required by differences described in the migration documents above.
8.2.5.3. In-place upgrade using the mysql_upgrade tool
To migrate the database files to Red Hat Enterprise Linux 8, users of MariaDB on Red Hat Enterprise Linux 7 need to perform the in-place upgrade using the mysql_upgrade
tool. The mysql_upgrade
tool is provided by the mariadb-server-utils
subpackage, which is installed as a dependency of the mariadb-server
package.
To perform an in-place upgrade, you need to copy binary data files to the /var/lib/mysql/
data directory on the Red Hat Enterprise Linux 8 system and use the mysql_upgrade
tool.
You can use this method for migrating data from:
- The Red Hat Enterprise Linux 7 version of MariaDB 5.5
The Red Hat Software Collections versions of:
- MariaDB 5.5 (no longer supported)
- MariaDB 10.0 (no longer supported)
- MariaDB 10.1 (no longer supported)
- MariaDB 10.2
Note that it is recommended to upgrade to MariaDB 10.2 by one version successively. See the respective Migration chapters in the Release Notes for Red Hat Software Collections.
If you are upgrading from the Red Hat Enterprise Linux 7 version of MariaDB, the source data is stored in the /var/lib/mysql/
directory. In case of Red Hat Software Collections versions of MariaDB, the source data directory is /var/opt/rh/<collection_name>/lib/mysql/
(with the exception of the mariadb55
, which uses the /opt/rh/mariadb55/root/var/lib/mysql/
data directory).
Before performing the upgrade, back up all your data stored in the MariaDB databases.
To perform the in-place upgrade, change to the root
user, and use the following procedure:
Ensure that the
mariadb-server
package is installed on the Red Hat Enterprise Linux 8 system:# yum install mariadb-server
Ensure that the mariadb daemon is not running on either of the source and target systems at the time of copying data:
# systemctl stop mariadb.service
-
Copy the data from the source location to the
/var/lib/mysql/
directory on the Red Hat Enterprise Linux 8 target system. Set the appropriate permissions and SELinux context for copied files on the target system:
# restorecon -vr /var/lib/mysql
Start the MariaDB server on the target system:
# systemctl start mariadb.service
Run the
mysql_upgrade
command to check and repair internal tables:# mysql_upgrade
-
When the upgrade is complete, make sure that all configuration files within the
/etc/my.cnf.d/
directory include only valid options for MariaDB 10.3.
There are certain risks and known problems related to in-place upgrade. For example, some queries might not work or they will be run in different order than before the upgrade. For more information on these risks and problems, and for general information about in-place upgrade, see MariaDB 10.3 Release Notes.
8.2.6. Replicating MariaDB with Galera
This section describes how to replicate a MariaDB database using the Galera solution.
8.2.6.1. Introduction to MariaDB Galera Cluster
Galera replication is based on creation of a synchronous multi-master MariaDB Galera Cluster consisting of multiple MariaDB servers.
The interface between Galera replication and a MariaDB database is defined by the write set replication API (wsrep API).
The main features of MariaDB Galera Cluster are:
- Synchronous replication
- Active-active multi-master topology
- Read and write to any cluster node
- Automatic membership control, failed nodes drop from the cluster
- Automatic node joining
- Parallel replication on row level
- Direct client connections (Users can log on to the cluster nodes, and work with the nodes directly while the replication runs.)
Synchronous replication means that a server replicates a transaction at commit time by broadcasting the write set associated with the transaction to every node in the cluster. The client (user application) connects directly to the Database Management System (DBMS), and experiences behavior that is similar to native MariaDB.
Synchronous replication guarantees that a change that happened on one node in the cluster happens on other nodes in the cluster at the same time.
Therefore, synchronous replication has the following advantages over asynchronous replication:
- No delay in propagation of the changes between particular cluster nodes
- All cluster nodes are always consistent
- The latest changes are not lost if one of the cluster nodes crashes
- Transactions on all cluster nodes are executed in parallel
- Causality across the whole cluster
Additional resources
For more detailed information, see the upstream documentation:
8.2.6.2. Components to build MariaDB Galera Cluster
To be able to build MariaDB Galera Cluster, the following packages must be installed on your system:
-
mariadb-server-galera
-
mariadb-server
-
galera
The mariadb-server-galera
package contains support files and scripts for MariaDB Galera Cluster.
MariaDB upstream patches the mariadb-server
package to include the write set replication API (wsrep API). This API provides the interface between Galera replication and MariaDB.
MariaDB upstream also patches the galera
package to add full support for MariaDB. The galera
package contains the Galera Replication Library and the Galera Arbitrator tool. Galera Replication Library provides the whole replication functionality. Galera Arbitrator can be used as a cluster member that participates in voting in split-brain scenarios. However, Galera Arbitrator cannot participate in the actual replication.
Additional resources
For more detailed information, see upstream documentation:
8.2.6.3. Deploying MariaDB Galera Cluster
Prerequisites
All software necessary to build MariaDB Galera Cluster must be installed on the system. To ensure this, install the
galera
profile of themariadb:10.3
module:# yum module install mariadb:10.3/galera
As a result, the following packages are installed:
-
mariadb-server-galera
-
mariadb-server
galera
The
mariadb-server-galera
package pulls themariadb-server
andgalera
packages as its dependency.For more information on components to build MariaDB Galera Cluster, see Section 8.2.6.2, “Components to build MariaDB Galera Cluster”.
-
The MariaDB server replication configuration must be updated before the system is added to a cluster for the first time.
The default configuration is shipped in the
/etc/my.cnf.d/galera.cnf
file.Before deploying MariaDB Galera Cluster, set the
wsrep_cluster_address
option in the/etc/my.cnf.d/galera.cnf
file on all nodes to start with the following string:gcomm://
For the initial node, it is possible to set
wsrep_cluster_address
as an empty list:wsrep_cluster_address="gcomm://"
For all other nodes, set
wsrep_cluster_address
to include an address to any node which is already a part of the running cluster. For example:wsrep_cluster_address="gcomm://10.0.0.10"
For more information on how to set Galera Cluster address, see Galera Cluster Address.
Procedure
Bootstrap a first node of a new cluster by running the following wrapper on that node:
$ galera_new_cluster
This wrapper ensures that the MariaDB server daemon (
mysqld
) runs with the--wsrep-new-cluster
option. This option provides the information that there is no existing cluster to connect to. Therefore, the node creates a new UUID to identify the new cluster.NoteThe
mariadb
service supports a systemd method for interacting with multiple MariaDB server processes. Therefore, in cases with multiple running MariaDB servers, you can bootstrap a specific instance by specifying the instance name as a suffix:$ galera_new_cluster mariadb@node1
Connect other nodes to the cluster by running the following command on each of the nodes:
# systemctl start mariadb
As a result, the node connects to the cluster, and synchronizes itself with the state of the cluster.
Additional resources
For more information, see Getting started with MariaDB Galera Cluster.
8.2.6.4. Adding a new node to MariaDB Galera Cluster
To add a new node to MariaDB Galera Cluster, use the following procedure.
Note that you can use this procedure also to reconnect an already existing node.
Procedure
On the particular node, provide an address to one or more existing cluster members in the
wsrep_cluster_address
option within the[mariadb]
section of the/etc/my.cnf.d/galera.cnf
configuration file :[mariadb] wsrep_cluster_address="gcomm://192.168.0.1"
When a new node connects to one of the existing cluster nodes, it is able to see all nodes in the cluster.
However, preferably list all nodes of the cluster in
wsrep_cluster_address
.As a result, any node can join a cluster by connecting to any other cluster node, even if one or more cluster nodes are down. When all members agree on the membership, the cluster’s state is changed. If the new node’s state is different from that of the cluster, it requests either an Incremental State Transfer (IST) or a State Snapshot Transfer (SST) to make itself consistent with the other nodes.
Additional resources
- For more information, see Getting started with MariaDB Galera Cluster.
- For detailed information on State Snapshot Transfers (SSTs), see Introduction to State Snapshot Transfers.
8.2.6.5. Restarting MariaDB Galera Cluster
If you shut down all nodes at the same time, you terminate the cluster, and the running cluster no longer exists. However, the cluster’s data still exist.
To restart the cluster, bootstrap a first node as described in Section 8.2.6.3, “Deploying MariaDB Galera Cluster”.
If the cluster is not bootstrapped, and mysqld
on the first node is started just with the systemctl start mariadb
command, the node tries to connect to at least one of the nodes listed in the wsrep_cluster_address
option in the /etc/my.cnf.d/galera.cnf
file. If no nodes are currently running, the restart fails.
Additional resources
For more information, see Getting started with MariaDB Galera Cluster.
8.3. Using PostgreSQL
8.3.1. Getting started with PostgreSQL
The PostgreSQL server is an open source robust and highly-extensible database server based on the SQL language. It provides an object-relational database system, which allows to manage extensive datasets and a high number of concurrent users. For these reasons, the PostgreSQL servers can be used in clusters to manage high amounts of data.
The PostgreSQL server includes features for ensuring data integrity, building fault-tolerant environments or building applications. It allows to extend a database with user’s own data types, custom functions, or code from different programming languages without the need to recompile the database.
This section describes how to install PostgreSQL in Installing PostgreSQL or how to migrate to a different version of PostgreSQL in Migrating to a RHEL 8 version of PostgreSQL. One of the prerequisites of migration is performing a data backup.
8.3.2. Installing PostgreSQL
In RHEL 8, the PostgreSQL server is available in several versions, each provided by a separate stream:
- PostgreSQL 10 - the default stream
- PostgreSQL 9.6
- PostgreSQL 12 - available since RHEL 8.1.1
By design, it is impossible to install more than one version (stream) of the same module in parallel. Thus you need to choose only one of the available streams from the postgresql
module. Parallel installation of components is possible in Red Hat Software Collections for RHEL 7 and RHEL 6. In RHEL 8, different versions of database servers can be used in containers.
To install PostgreSQL:
Enable the stream (version) you wish to install:
# yum module enable postgresql:stream
Replace stream with the selected version of the PostgreSQL server.
You can omit this step if you want to use the default stream, which provides PostgreSQL 10.
Ensure that the
postgresql-server
package, available in the AppStream repository, is installed on the required server:# yum install postgresql-server
Initialize the data directory
postgresql-setup --initdb
Start the
postgresql
service:# systemctl start postgresql.service
Enable the
postgresql
service to start at boot:# systemctl enable postgresql.service
For information about using module streams, see Installing, managing, and removing user-space components.
If you want to upgrade from an earlier postgresql
stream within RHEL 8, follow both procedures described in Switching to a later stream and in Section 8.3.5, “Migrating to a RHEL 8 version of PostgreSQL”.
8.3.3. Configuring PostgreSQL
To change the PostgreSQL configuration, use the /var/lib/pgsql/data/postgresql.conf
file. Afterwards, restart the postgresql
service so that the changes become effective:
systemctl restart postgresql.service
Apart from /var/lib/pgsql/data/postgresql.conf
, other files to change PostgreSQL configuration exist:
-
postgresql.auto.conf
-
pg_ident.conf
-
pg_hba.conf
The postgresql.auto.conf
file holds basic PostgreSQL settings similarly to /var/lib/pgsql/data/postgresql.conf
. However, this file is under the server control. It is edited by the ALTER SYSTEM
queries, and cannot be edited manually.
The pg_ident.conf
file is used for mapping user identities from external authentication mechanisms into the postgresql user identities.
The pg_hba.conf
file is used for configuring detailed user permissions for PostgreSQL databases.
8.3.3.1. Initializing a database cluster
In a PostgreSQL database, all data is stored a single directory, which is called a database cluster. You can choose where to store your data but Red Hat recommends to store the data in the default /var/lib/pgsql/data
directory.
To initialize this data directory, run:
postgresql-setup --initdb
8.3.4. Backing up PostgreSQL data
To back up PostgreSQL data, use one of the following approaches:
- SQL dump
- File system level backup
- Ccontinuous archiving
8.3.4.1. Backing up PostgreSQL data with an SQL dump
8.3.4.1.1. Performing an SQL dump
The SQL dump method is based on generating a file with SQL commands. When this file is uploaded back to the database server, it recreates the database in the same state as it was at the time of the dump. The SQL dump is ensured by the pg_dump utility, which is a PostgreSQL client application. The basic usage of the pg_dump
command is such that the command writes its result into the standard output:
pg_dump dbname > dumpfile
The resulting SQL file can be either in a text format or in other different formats, which allows for parallelism and for more detailed control of object restoration.
You can perform the SQL dump from any remote host that has access to the database. The pg_dump utility does not operate with special permissions, but it must have a read access to all tables that you want to back up. To back up the entire database, you must run it as a database superuser.
To specify which database server pg_dump will contact, use the following command-line options:
The
-h
option to define the host.The default host is either the local host or what is specified by the
PGHOST
environment variable.The
-p
option to define the port.The default port is indicated by the
PGPORT
environment variable or the compiled-in default.
Note that pg_dump dumps only a single database. It does not dump information about roles or tablespaces because such information is cluster-wide.
To back up each database in a given cluster and to preserve cluster-wide data, such as role and tablespace definitions, use the pg_dumpall
command:
pg_dumpall > dumpfile
8.3.4.1.2. Restoring database from an SQL dump
To restore a database from an SQL dump:
Create a new database (dbname):
createdb
dbname
Make sure that all users who own objects or were granted permissions on objects in the dumped database already exist.
If such users do not exist, the restore fails to recreate the objects with the original ownership and permissions.
Run the psql utility to restore a text file dump created by the pg_dump utility:
psql dbname < dumpfile
where dumpfile
is the output of the pg_dump
command.
If you want to restore a non-text file dump, use the pg_restore
utility:
pg_restore non-plain-text-file
8.3.4.1.2.1. Restoring a database on another server
Dumping a database directly from one server to another is possible because pg_dump and psql can write to and read from pipes.
To dump a database from one server to another, run:
pg_dump -h host1 dbname | psql -h host2 dbname
8.3.4.1.2.2. Handling SQL errors during restore
By default, psql continues to execute if an SQL error occurs. Consequently, the database is restored only partially.
If you want to change this default behavior, use one of the following approaches:
Make psql exit with an exit status of 3 if an SQL error occurs by setting the
ON_ERROR_STOP
variable:psql --set ON_ERROR_STOP=on dbname < dumpfile
Specify that the whole dump is restored as a single transaction so that the restore is either fully completed or canceled by using
psql
with one of the following options:psql -1
or
psql --single-transaction
Note that when using this approach, even a minor error can cancel a restore operation that has already run for many hours.
8.3.4.1.3. Advantages and disadvantages of an SQL dump
An SQL dump has the following advantages compared to other PostgreSQL backup methods:
- An SQL dump is the only PostgreSQL backup method that is not server version-specific. The output of the pg_dump utility can be reloaded into later versions of PostgreSQL, which is not possible for file system level backups or continuous archiving.
- An SQL dump is the only method that works when transferring a database to a different machine architecture, such as going from a 32-bit to a 64-bit server.
- An SQL dump provides internally consistent dumps. A dump represents a snapshot of the database at the time pg_dump began running.
- The pg_dump utility does not block other operations on the database when it is running.
A disadvantage of an SQL dump is that it takes more time compared to file system level backup.
8.3.4.1.4. Additional resources
For more information about the SQL dump, see PostgreSQL Documentation.
8.3.4.2. Backing up PostgreSQL data with a file system level backup
8.3.4.2.1. Performing a file system level backup
To perform a file system level backup, you need to copy the files that PostgreSQL uses to store the data in the database to another location:
- Choose the location of a database cluster and initialize this cluster as described in Section 8.3.3.1, “Initializing a database cluster”.
Stop the postgresql service:
# systemctl stop postgresql.service
Use any method to make a file system backup, for example:
tar -cf backup.tar /var/lib/pgsql/data
Start the postgresql service:
# systemctl start postgresql.service
8.3.4.2.2. Advantages and disadvantages of a file system level backup
A file system level backup has the following advantage compared to other PostgreSQL backup methods:
- File system level backup is usually faster than SQL dump.
File system level backup has the following disadvantages compared to other PostgreSQL backup methods:
- The backup is architecture-specific and Red Hat Enterprise Linux 7-specific. It can only be used as a backup to return to Red Hat Enterprise Linux 7 if the upgrade was not successful, but it cannot be used with PostgreSQL 10.0.
- The database server must be shut down before data backup and before data restore as well.
- Backup and restore of certain individual files or tables is impossible. The file system backups only work for a complete backup and restoration of an entire database cluster.
8.3.4.2.3. Alternative approaches to file system level backup
Alternative approaches to file system backup include:
- A consistent snapshot of the data directory
- The rsync utility
8.3.4.2.4. Additional resources
For more information about the file system level backup, see PostgreSQL Documentation.
8.3.4.3. Backing up PostgreSQL data by continuous archiving
8.3.4.3.1. Introduction to continuous archiving
PostgreSQL records every change made to the database’s data files into a write ahead log (WAL) file that is available in the pg_wal/
subdirectory of the cluster’s data directory. This log is intended primarily for a crash recovery. After a crash, the log entries made since the last checkpoint can be used for restoring the database to a consistency.
The continuous archiving method, also known as online backup
, combines the WAL files with a file system level backup. If a database recovery is needed, you can restore the database from the file system backup and then replay log from the backed up WAL files to bring the system to the current state.
For this backup method, you need a continuous sequence of archived WAL files that extends back to the start time of your backup at least.
If you want to start using the continuous archiving backup method, make sure to set up and test your procedure for archiving WAL files before taking your first base backup.
You cannot use pg_dump and pg_dumpall dumps as a part of a continuous archiving backup solution. These dumps produce logical backups, not file system level backups. As such, they do not contain enough information to be used by a WAL replay.
8.3.4.3.2. Performing continuous archiving backup
To perform a database backup and restore using the continuous archiving method, follow these steps:
8.3.4.3.2.1. Making a base backup
To perform a base backup, use the pg_basebackup tool, which can create a base backup in the form of either individual files or a tar
archive.
To use the base backup, you need to keep all the WAL segment files generated during and after the file system backup. The base backup process creates a backup history file that is stored into the WAL archive area and is named after the first WAL segment file that you need for the file system backup. When you have safely archived the file system backup and the WAL segment files used during the backup, which are specified in the backup history file, you can delete all archived WAL segments with names numerically less because they are no longer needed to recover the file system backup. However, consider keeping several backup sets to be certain that you can recover your data.
The backup history file is a small text file, which contains the label string that you gave to pg_basebackup, the starting and ending times, and WAL segments of the backup. If you used the label string to identify the associated dump file, then the archived history file is enough to tell you which dump file to restore.
With the continuous archiving method, you need to keep all the archived WAL files back to your last base backup. Thus the ideal frequency of base backups depends on:
- The storage volume available for archived WAL files.
The maximum possible duration of data recovery in situations when recovery is necessary.
In cases with long period since the last backup, the system replays more WAL segments, and the recovery thus takes more time.
For more information about making a base backup, see PostgreSQL Documentation.
8.3.4.3.2.2. Restoring the database using a continuous archive backup
To restore a database using a continuous backup:
Stop the server:
# systemctl stop postgresql.service
Copy the necessary data to a temporary location.
Preferably, copy the whole cluster data directory and any tablespaces. Note that this requires enough free space on your system to hold two copies of your existing database.
If you do not have enough space, save the contents of the cluster’s
pg_wal
directory, which can contain logs that were not archived before the system went down.- Remove all existing files and subdirectories under the cluster data directory and under the root directories of any tablespaces you are using.
Restore the database files from your file system backup.
Make sure that:
-
The files are restored with the correct ownership (the database system user, not
root
) - The files are restored with the correct permissions
-
The symbolic links in the
pg_tblspc/
subdirectory were restored correctly
-
The files are restored with the correct ownership (the database system user, not
Remove any files present in the
pg_wal/
subdirectoryThese files resulted from the file system backup and are therefore obsolete. If you did not archive
pg_wal/
, recreate it with proper permissions.-
Copy the unarchived WAL segment files that you saved in step 2 into
pg_wal/
if there are such files. -
Create the
recovery.conf
recovery command file in the cluster data directory. Start the server:
# systemctl start postgresql.service
The server will enter the recovery mode and proceed to read through the archived WAL files that it needs.
If the recovery is terminated due to an external error, the server can simply be restarted and it will continue the recovery. When the recovery process is completed, the server renames
recovery.conf
torecovery.done
to prevent accidental re-entering the recovery mode later, when the server starts normal database operations.Check the contents of the database to make sure that the database has recovered into the required state.
If the database has not recovered into the required state, return to step 1. If the database has recovered into the required state, allow the users to connect by restoring the
pg_hba.conf
file to normal.
For more information about restoring using the continuous backup, see PostgreSQL Documentation.
8.3.4.3.3. Advantages and disadvantages of continuous archiving
Continuous archiving has the following advantages compared to other PostgreSQL backup methods:
-
With the continuous backup method, it is possible to use a file system backup that is not entirely consistent because any internal inconsistency in the backup is corrected by the log replay. A file system snapshot is not needed;
tar
or a similar archiving tool is sufficient. - Continuous backup can be achieved by continuing to archive the WAL files because the sequence of WAL files for the log replay can be indefinitely long. This is particularly valuable for large databases.
- Continuous backup supports point-in-time recovery. It is not necessary to replay the WAL entries to the end. The replay can be stopped at any point and the database can be restored to its state at any time since the base backup was taken.
- If the series of WAL files are continuously available to another machine that has been loaded with the same base backup file, it is possible to restore the other machine with a nearly-current copy of the database at any point.
Continuous archiving has the following disadvantages compared to other PostgreSQL backup methods:
- Continuous backup method supports only restoration of an entire database cluster, not a subset.
- Continuous backup requires extensive archival storage.
8.3.4.3.4. Additional resources
For more information on continuous archiving method, see PostgreSQL Documentation.
8.3.5. Migrating to a RHEL 8 version of PostgreSQL
Red Hat Enterprise Linux 7 contains PostgreSQL 9.2 as the default version of the PostgreSQL server. In addition, several versions of PostgreSQL are provided as Software Collections for RHEL 7 and RHEL 6.
Red Hat Enterprise Linux 8 provides PostgreSQL 10 (the default postgresql
stream), PostgreSQL 9.6, and PostgreSQL 12.
Users of PostgreSQL on Red Hat Enterprise Linux can use two migration paths for the database files:
Use preferably the fast upgrade method, which is faster than the dump and restore process.
However, in certain cases, the fast upgrade does not work, and you can only use the dump and restore process. Such cases include:
- Cross-architecture upgrades
-
Systems using the
plpython
orplpython2
extensions. Note that RHEL 8 AppStream repository includes only thepostgresql-plpython3
package, not thepostgresql-plpython2
package. - Fast upgrade is not supported for migration from Red Hat Software Collections versions of PostgreSQL.
As a prerequisite for migration to a later version of PostgreSQL, back up all your PostgreSQL databases.
Dumping the databases and performing backup of the SQL files is a necessary part of the dump and restore process. However, you are recommended to do so if performing the fast upgrade as well.
Before migrating to a later version of PostgreSQL, see the upstream compatibility notes for the version of PostgreSQL to which you want to migrate, as well as for all skipped PostgreSQL versions between the one you are migrating from and the target version.
8.3.5.1. Fast upgrade using the pg_upgrade tool
During a fast upgrade, you need to copy binary data files to the /var/lib/pgsql/data/
directory and use the pg_upgrade
tool.
You can use this method for migrating data:
- From the RHEL 7 system version of PostgreSQL 9.2 to the RHEL 8 version of PostgreSQL 10
- From the RHEL 8 version of PostgreSQL 10 to the RHEL 8 version of PostgreSQL 12
If you want to upgrade from an earlier postgresql
stream within RHEL 8, follow the procedure described in Switching to a later stream and then migrate your PostgreSQL data.
For migrating between other combinations of PostgreSQL versions within RHEL, and for migration from the Red Hat Software Collections versions of PostgreSQL to RHEL, use Dump and restore upgrade.
Before performing the upgrade, back up all your data stored in the PostgreSQL databases.
By default, all data is stored in the /var/lib/pgsql/data/
directory on both the RHEL 7 and RHEL 8 systems.
The following procedure describes migration from the RHEL 7 system version of Postgreql 9.2 to a RHEL 8 version of PostgreSQL.
To perform a fast upgrade:
On the RHEL 8 system, enable the stream (version) to which you wish to migrate:
# yum module enable postgresql:stream
Replace stream with the selected version of the PostgreSQL server.
You can omit this step if you want to use the default stream, which provides PostgreSQL 10.
On the RHEL 8 system, install the
postgresql-server
andpostgresql-upgrade
packages:# yum install postgresql-server postgresql-upgrade
Optionally, if you used any PostgreSQL server modules on RHEL 7, install them also on the RHEL 8 system in two versions, compiled both against PostgreSQL 9.2 (installed as the
postgresql-upgrade
package) and the target version of PostgreSQL (installed as thepostgresql-server
package). If you need to compile a third-party PostgreSQL server module, build it both against thepostgresql-devel
andpostgresql-upgrade-devel
packages.Check the following items:
-
Basic configuration: On the RHEL 8 system, check whether your server uses the default
/var/lib/pgsql/data
directory and the database is correctly initialized and enabled. In addition, the data files must be stored in the same path as mentioned in the/usr/lib/systemd/system/postgresql.service
file. - PostgreSQL servers: Your system can run multiple PostgreSQL servers. Make sure that the data directories for all these servers are handled independently.
-
PostgreSQL server modules: Ensure that the PostgreSQL server modules that you used on RHEL 7 are installed on your RHEL 8 system as well. Note that plug-ins are installed in the
/usr/lib64/pgsql/
directory (or in the/usr/lib/pgsql/
directory on 32-bit systems).
-
Basic configuration: On the RHEL 8 system, check whether your server uses the default
Ensure that the
postgresql
service is not running on either of the source and target systems at the time of copying data.# systemctl stop postgresql.service
-
Copy the database files from the source location to the
/var/lib/pgsql/data/
directory on the RHEL 8 system. Perform the upgrade process by running the following command as the PostgreSQL user:
$ /bin/postgresql-setup --upgrade
This launches the
pg_upgrade
process in the background.In case of failure,
postgresql-setup
provides an informative error message.Copy the prior configuration from
/var/lib/pgsql/data-old
to the new cluster.Note that the fast upgrade does not reuse the prior configuration in the newer data stack and the configuration is generated from scratch. If you want to combine the old and new configurations manually, use the *.conf files in the data directories.
Start the new PostgreSQL server:
# systemctl start postgresql.service
Run the
analyze_new_cluster.sh
script located in the PostgreSQL home directory:su postgres -c '~/analyze_new_cluster.sh'
If you want the new PostgreSQL server to be automatically started on boot, run:
# systemctl enable postgresql.service
8.3.5.2. Dump and restore upgrade
When using the dump and restore upgrade, you need to dump all databases contents into an SQL file dump file.
Note that the dump and restore upgrade is slower than the fast upgrade method and it may require some manual fixing in the generated SQL file.
You can use this method for migrating data from:
- The Red Hat Enterprise Linux 7 system version of PostgreSQL 9.2
- Any earlier Red Hat Enterprise Linux 8 version of PostgreSQL
An earlier or equal version of PostgreSQL from Red Hat Software Collections:
- PostgreSQL 9.2 (no longer supported)
- PostgreSQL 9.4 (no longer supported)
- PostgreSQL 9.6
- PostgreSQL 10
- PostgreSQL 12
On Red Hat Enterprise Linux 7 and Red Hat Enterprise Linux 8 systems, PostgreSQL data is stored in the /var/lib/pgsql/data/
directory by default. In case of Red Hat Software Collections versions of PostgreSQL, the default data directory is /var/opt/rh/collection_name/lib/pgsql/data/
(with the exception of postgresql92
, which uses the /opt/rh/postgresql92/root/var/lib/pgsql/data/
directory).
If you want to upgrade from an earlier postgresql
stream within RHEL 8, follow the procedure described in Switching to a later stream and then migrate your PostgreSQL data.
To perform the dump and restore upgrade, change the user to root
.
The following procedure describes migration from the RHEL 7 system version of Postgreql 9.2 to a RHEL 8 version of PostgreSQL.
On your RHEL 7 system, start the PostgreSQL 9.2 server:
# systemctl start postgresql.service
On the RHEL 7 system, dump all databases contents into the
pgdump_file.sql
file:su - postgres -c "pg_dumpall > ~/pgdump_file.sql"
Make sure that the databases were dumped correctly:
su - postgres -c 'less "$HOME/pgdump_file.sql"'
As a result, the path to the dumped sql file is displayed:
/var/lib/pgsql/pgdump_file.sql
.On the RHEL 8 system, enable the stream (version) to which you wish to migrate:
# yum module enable postgresql:stream
Replace stream with the selected version of the PostgreSQL server.
You can omit this step if you want to use the default stream, which provides PostgreSQL 10.
On the RHEL 8 system, install the
postgresql-server
package:# yum install postgresql-server
Optionally, if you used any PostgreSQL server modules on RHEL 7, install them also on the RHEL 8 system. If you need to compile a third-party PostgreSQL server module, build it against the
postgresql-devel
package.On the RHEL 8 system, initialize the data directory for the new PostgreSQL server:
# postgresql-setup --initdb
On the RHEL 8 system, copy the
pgdump_file.sql
into the PostgreSQL home directory, and check that the file was copied correctly:su - postgres -c 'test -e "$HOME/pgdump_file.sql" && echo exists'
Copy the configuration files from the RHEL 7 system:
su - postgres -c 'ls -1 $PGDATA/*.conf'
The configuration files to be copied are:
-
/var/lib/pgsql/data/pg_hba.conf
-
/var/lib/pgsql/data/pg_ident.conf
-
/var/lib/pgsql/data/postgresql.conf
-
On the RHEL 8 system, start the new PostgreSQL server:
# systemctl start postgresql.service
On the RHEL 8 system, import data from the dumped sql file:
su - postgres -c 'psql -f ~/pgdump_file.sql postgres'
When upgrading from a Red Hat Software Collections version of PostgreSQL, adjust the commands to include scl enable collection_name.
For example, to dump data from the rh-postgresql96
Software Collection, use the following command:
su - postgres -c 'scl enable rh-postgresql96 "pg_dumpall > ~/pgdump_file.sql"'