Chapter 10. Diagnosing and Correcting Problems in a Cluster
Cluster problems, by nature, can be difficult to troubleshoot, because a cluster of systems is more complex to diagnose than a single system. However, there are common issues that system administrators are more likely to encounter when deploying or administering a cluster. Understanding how to tackle those common issues can make deploying and administering a cluster much easier.
This chapter provides information about some common cluster issues and how to troubleshoot them. Additional help can be found in our knowledge base and by contacting an authorized Red Hat support representative. If your issue is related to the GFS2 file system specifically, you can find information about troubleshooting common GFS2 issues in the Global File System 2 document.
10.1. Configuration Changes Do Not Take Effect
When you make changes to a cluster configuration, you must propagate those changes to every node in the cluster.
- When you configure a cluster using Conga, Conga propagates the changes automatically when you apply the changes.
- For information on propagating changes to cluster configuration with the ccs command, see Section 6.15, “Propagating the Configuration File to the Cluster Nodes”.
- For information on propagating changes to cluster configuration with command line tools, see Section 9.4, “Updating a Configuration”.
If you make any of the following configuration changes to your cluster, it is not necessary to restart the cluster after propagating those changes for the changes to take effect.
- Deleting a node from the cluster configuration—except where the node count changes from greater than two nodes to two nodes.
- Adding a node to the cluster configuration—except where the node count changes from two nodes to greater than two nodes.
- Changing the logging settings.
- Adding, editing, or deleting HA services or VM components.
- Adding, editing, or deleting cluster resources.
- Adding, editing, or deleting failover domains.
- Changing any
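The node-count exceptions in the list above can be sketched as a small check. This is an illustrative sketch only, not part of any cluster tool; the function name `node_change_needs_restart` and its parameters are hypothetical.

```python
def node_change_needs_restart(old_count: int, new_count: int) -> bool:
    """Return True when adding or deleting nodes crosses the two-node boundary.

    Per the list above, a restart is needed only when the node count moves
    from two nodes to more than two, or from more than two nodes to two.
    Other transitions are assumed here to need no restart.
    """
    grew_past_two = old_count == 2 and new_count > 2
    shrank_to_two = old_count > 2 and new_count == 2
    return grew_past_two or shrank_to_two


if __name__ == "__main__":
    # Example node-count transitions (hypothetical values).
    for old, new in [(3, 4), (2, 3), (3, 2), (4, 3)]:
        print(old, "->", new, "restart needed:", node_change_needs_restart(old, new))
```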
If you make any other configuration changes to your cluster, however, you must restart the cluster to implement those changes. The following cluster configuration changes require a cluster restart to take effect:
- Adding or removing the two_node option from the cluster configuration file.
- Renaming the cluster.
- Adding, changing, or deleting heuristics for quorum disk, changing any quorum disk timers, or changing the quorum disk device. For these changes to take effect, a global restart of the qdiskd daemon is required.
- Changing the rgmanager. For this change to take effect, a global restart of rgmanager is required.
- Changing the multicast address.
- Switching the transport mode from UDP multicast to UDP unicast, or switching from UDP unicast to UDP multicast.
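As a rough planning aid, the rule in this section (a short list of changes needs no restart, anything else does) can be sketched as follows. The change labels, the `NO_RESTART_CHANGES` and `DAEMON_RESTARTS` names, and the `plan_restart` function are all hypothetical. The set covers only part of the no-restart list, and node additions and deletions are left out because they depend on the node count.

```python
# Hypothetical labels for changes that this section says need no restart.
NO_RESTART_CHANGES = frozenset({
    "logging",
    "ha_service",
    "vm_component",
    "cluster_resource",
    "failover_domain",
})

# Changes for which this section names a daemon that needs a global restart.
DAEMON_RESTARTS = {"quorum_disk": "qdiskd"}


def plan_restart(changes):
    """Return (restart_needed, daemons_named) for an iterable of change labels.

    Any label that is not in NO_RESTART_CHANGES is treated as requiring a
    restart, mirroring the "any other configuration changes" rule.
    """
    pending = [c for c in changes if c not in NO_RESTART_CHANGES]
    daemons = sorted({DAEMON_RESTARTS[c] for c in pending if c in DAEMON_RESTARTS})
    return bool(pending), daemons


if __name__ == "__main__":
    print(plan_restart(["logging", "failover_domain"]))
    print(plan_restart(["logging", "quorum_disk"]))
    print(plan_restart(["cluster_rename"]))
```

Treating unknown labels as restart-required keeps the sketch on the cautious side: a change is skipped only when it is explicitly listed as safe.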
You can restart the cluster using Conga, the ccs command, or command line tools:
- For information on restarting a cluster with Conga, see Section 5.4, “Starting, Stopping, Restarting, and Deleting Clusters”.
- For information on restarting a cluster with the ccs command, see Section 7.2, “Starting and Stopping a Cluster”.
- For information on restarting a cluster with command line tools, see Section 9.1, “Starting and Stopping the Cluster Software”.