Chapter 6. Using Galera
In a high-availability deployment, Red Hat OpenStack Platform uses the MariaDB Galera Cluster to manage database replication. As described in Section 4.3, “OpenStack Services Configured in Pacemaker”, Pacemaker runs the Galera service using a bundle set resource that manages the master/slave status.
You can use the pcs status command to check if the galera-bundle service is running, and on which controllers it runs:
Docker container set: galera-bundle [192.168.24.1:8787/rhosp12/openstack-mariadb:pcmklatest] galera-bundle-0 (ocf::heartbeat:galera): Master overcloud-controller-0 galera-bundle-1 (ocf::heartbeat:galera): Master overcloud-controller-1 galera-bundle-2 (ocf::heartbeat:galera): Master overcloud-controller-2
6.1. Hostname resolution
When troubleshooting the MariaDB Galera Cluster, you can verify the hostname resolution. By default, the director binds the Galera resource to a hostname rather than to an IP address [1]. Therefore, any problems that prevent hostname resolution, such as misconfigured or failed DNS, might prevent Pacemaker from properly managing the Galera resource.
After you eliminate any hostname resolution problems, you can check the integrity of the cluster itself. To do so, check the write-set replication status on the database of each controller node.
To access MySQL, use the password that was set up by the director during the overcloud deployment. To retrieve the MySQL root password, use the hiera command:
$ sudo hiera -c /etc/puppet/hiera.yaml "mysql::server::root_password" *<MYSQL-HIERA-PASSWORD>*
Write-set replication information is stored on each node’s MariaDB database. Each relevant variable uses the prefix wsrep_. Therefore, you can query this information directly through the database client:
$ sudo mysql -B --password="<MYSQL-HIERA-PASSWORD>" -e "SHOW GLOBAL STATUS LIKE 'wsrep_%';"
+------------------------+-------+
| Variable_name | Value |
+------------------------+-------+
| wsrep_protocol_version | 5 |
| wsrep_last_committed | 202 |
| ... | ... |
| wsrep_thread_count | 2 |
+------------------------+-------+To verify the health and integrity of the MariaDB Galera Cluster, check first whether the cluster is reporting the right number of nodes. Then, check each node if it:
- is part of the correct cluster
- can write to the cluster
- can receive queries and writes from the cluster
- is connected to others within the cluster
- is replicating write-sets to tables in the local database
6.2. Database Cluster Integrity
When investigating problems with the MariaDB Galera Cluster, you can check the integrity of the cluster itself. Verifying cluster integrity involves checking specific wsrep_ database variables on each Controller node. To check a database variable, run:
$ sudo mysql -B --password="<MYSQL-HIERA-PASSWORD>" -e "SHOW GLOBAL STATUS LIKE 'VARIABLE';"
Replace VARIABLE with the wsrep_ database variable you want to check. For example, to view the node’s cluster state UUID:
$ sudo mysql -B --password="<MYSQL-HIERA-PASSWORD>" -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_state_uuid';"
+--------------------------+--------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------+
| wsrep_cluster_state_uuid | e2c9a15e-5485-11e0-0800-6bbb637e7211 |
+--------------------------+--------------------------------------+
The following table lists the different wsrep_ database variables that relate to cluster integrity.
Table 6.1. Database variables to check for cluster integrity
| Variable | Summary | Description |
|---|---|---|
|
| Cluster state UUID | The ID of the cluster to which the node belongs. All nodes must have an identical ID. A node with a different ID is not connected to the cluster. |
|
| Number of nodes in the cluster | You can check this on any single node. If the value is less than the actual number of nodes, then some nodes have either failed or lost connectivity. |
|
| Total number of cluster changes | Determines whether or not the cluster has been split into several components, also known as a partition. This is likely caused by a network failure. All nodes must have an identical value.
In case some nodes are reporting a different |
| wsrep_cluster_status | Primary component status |
Determines whether or not the node can still write to the cluster. If so, then the |
6.3. Database Cluster Node
If you can isolate a Galera cluster problem to a specific node, other wsrep_ database variables can provide clues on the specific problem. You can check these variables in a similar manner as a cluster check, which is described in Section 6.2, “Database Cluster Integrity”.
$ sudo mysql -B --password="<MYSQL-HIERA-PASSWORD>" -e "SHOW GLOBAL STATUS LIKE 'VARIABLE';"
Replace VARIABLE with any of the following values:
Table 6.2. Database variables to check for node integrity
| Variable | Summary | Description |
|---|---|---|
| wsrep_ready | Node ability to accept queries |
States whether the node can accept write-sets from the cluster. If so, then |
| wsrep_connected | Node network connectiviry |
States whether the node has network connectivity to other nodes. If so, then |
| wsrep_local_state_comment | Node state |
Summarizes the node state. If node can still write to the cluster (ie. if
If the node is part of a non-operational component, then |
A wsrep_connected variable with an ON value might also mean that the node is only connected to some nodes but not all nodes. For example, in cases of a cluster partition, the node might be part of a component that cannot write to the cluster. See Section 6.2, “Database Cluster Integrity” for details.
If wsrep_connected is OFF, then the node is not connected to any cluster components.
6.4. Database Replication Performance
If the cluster and its individual nodes are all healthy and stable, you can check replication throughput to benchmark performance. You do this by checking the wsrep_ database variables, similar to Section 6.3, “Database Cluster Node” and Section 6.2, “Database Cluster Integrity”.
$ sudo mysql -B --password="<MYSQL-HIERA-PASSWORD>" -e "SHOW STATUS LIKE 'VARIABLE';"
Replace VARIABLE with any of the following values:
Table 6.3. Database variables to check for cluster performance (replication throughput)
| Variable | Summary |
|---|---|
|
| Average size of the local received queue since last query |
|
| Average send queue length since the last time the variable was queried |
|
| The minimum and maximum sizes the local received queue since either variable was last queried |
|
| Fraction of time that the node paused due to Flow Control since the last time the variable was queried |
|
|
Average distance between the lowest and highest sequence number ( |
Each time any of these variables are queried, a FLUSH STATUS command resets its value. Benchmarking cluster replication involves querying these values multiple times to see variances. These variances can help you judge how much Flow Control is affecting the cluster’s performance.
Flow Control is a mechanism used by the cluster to manage replication. When the local received write-set queue exceeds a certain threshold, Flow Control pauses replication in order for the node to catch up. See Flow Control from the Galera Cluster site for more information.
Check the following variables for optimization of different values and benchmarks:
- wsrep_local_recv_queue_avg > 0.0
-
The node cannot apply write-sets as quickly as it receives them, which then triggers replication throttling. Check
wsrep_local_recv_queue_minandwsrep_local_recv_queue_maxfor a detailed look at this benchmark. - wsrep_local_send_queue_avg > 0.0
-
As the value of
wsrep_local_send_queue_avgrises, so does the likelihood of replication throttling and network throughput issues. This is especially true aswsrep_local_recv_queue_avgrises. - wsrep_flow_control_paused > 0.0
Flow Control paused the node. To determine how long the node was paused, multiply the
wsrep_flow_control_pausedvalue with the number of seconds between querying it. For example, ifwsrep_flow_control_paused = 0.50a minute after last checking it, then node replication was paused for 30 seconds. Ifwsrep_flow_control_paused = 1.0, then the node was paused the entire time since the last query.Ideally,
wsrep_flow_control_pausedshould be as close to0.0as possible.- wsrep_cert_deps_distance
-
In case of throttling and pausing, you can check the
wsrep_cert_deps_distancevariable to see how many write-sets (on average) can be applied in parallel. Then, checkwsrep_slave_threadsto see how many write-sets can actually be applied simultaneously. - wsrep_slave_threads
Configuring a higher
wsrep_slave_threadscan help mitigate throttling and pausing. For example, if thewsrep_cert_deps_distancevalue is20, then doublingwsrep_slave_threadsfrom2to4can also double the amount of write-sets that the node can apply. However,wsrep_slave_threadsshould be set only as high as the number of CPU cores in the node.If a problematic node already has an optimal
wsrep_slave_threadssetting, then consider excluding the node from the cluster as you investigate possible connectivity issues.
