4.160. rgmanager

An updated rgmanager package that fixes one bug is now available for Red Hat Enterprise Linux 5.
The rgmanager package contains the Red Hat Resource Group Manager, which provides the ability to create and manage high-availability server applications in the event of system downtime.

Bug Fix

BZ#759542
Previously, rgmanager inappropriately called the rg_wait_threads() function during cluster reconfiguration. This could lead to an internal deadlock in rgmanager, which caused the cluster services to become unresponsive. This incorrect call has been removed from the code, and deadlocks no longer occur during cluster reconfiguration.
All users of rgmanager are advised to upgrade to this updated package, which fixes this bug.
An updated rgmanager package that fixes multiple bugs and adds one enhancement is now available for Red Hat Enterprise Linux 5.

Bug Fixes

BZ#690265
When running a Sybase database on a cluster, the cluster defines the ASEHA (Sybase Adaptive Server Enterprise with the High Availability Option) resource agents to manage the Sybase cluster resources. The ASEHAagent resource agent previously specified all resource attributes as unique. As a consequence, it was difficult to have more than one ASEHAagent resource present in the cluster because the Resource Group Manager ignores all resources with conflicting "unique" attributes. This update removes the unique flag from all unnecessary attributes so it is now possible to run multiple ASEHAagent resource agents on one cluster node.
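The effect of the "unique" flag can be illustrated with a minimal Python sketch. This is not rgmanager's resource-loading code: the Resource class, the load_resources() function, and the attribute names sybase_user and server_name are hypothetical, chosen only to show why flagging every attribute as unique prevents a second resource from loading.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    attrs: dict  # attribute name -> value (hypothetical attribute names)


def load_resources(resources, unique_attrs):
    """Return the names of resources that do not collide on a unique attribute.

    `unique_attrs` is the set of attribute names flagged as unique.
    A resource whose unique value was already claimed is ignored.
    """
    seen = set()  # (attribute, value) pairs already claimed
    accepted = []
    for res in resources:
        claims = {(k, v) for k, v in res.attrs.items() if k in unique_attrs}
        if claims & seen:
            continue  # conflicting resource is ignored
        seen |= claims
        accepted.append(res.name)
    return accepted


resources = [
    Resource("ase1", {"name": "ase1", "sybase_user": "sybase", "server_name": "SRV1"}),
    Resource("ase2", {"name": "ase2", "sybase_user": "sybase", "server_name": "SRV2"}),
]

# Every attribute unique: ase2 collides on sybase_user and is dropped.
print(load_resources(resources, {"name", "sybase_user", "server_name"}))  # ['ase1']
# Only the identifying attribute unique: both resources load.
print(load_resources(resources, {"name"}))  # ['ase1', 'ase2']
```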
BZ#700103
Previously, rgmanager did not handle wildcard character matching in the nfsclient.sh script correctly. Therefore, rgmanager was unable to detect the removal of an NFS export from the export table if there was another NFS export that matched the wildcard pattern. Consequently, rgmanager did not restart the appropriate NFS service as expected. This update corrects the wildcard matching logic so that rgmanager now correctly recognizes the removal of matched NFS exports and restarts the relevant NFS service.
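The class of problem described here can be sketched in Python. The sketch below is an illustration only and does not reproduce the logic of nfsclient.sh: the export table layout, the host names, and both function names are assumptions. It shows how a check that interprets the client specification as a pattern can be satisfied by a different export, whereas a literal comparison detects the removal.

```python
from fnmatch import fnmatchcase


def export_present_loose(export_table, path, client):
    """Flawed check: treats `client` as a glob, so another entry can satisfy it."""
    return any(p == path and fnmatchcase(c, client) for p, c in export_table)


def export_present_exact(export_table, path, client):
    """Literal check: only the identical (path, client) entry counts."""
    return (path, client) in export_table


# Hypothetical export table after the wildcard export "*.example.com" was removed;
# an unrelated export to node1.example.com on the same path remains.
after_removal = [("/srv/data", "node1.example.com")]

# The remaining entry matches the pattern, so the removal goes unnoticed.
print(export_present_loose(after_removal, "/srv/data", "*.example.com"))  # True
# The literal comparison reports the export as gone.
print(export_present_exact(after_removal, "/srv/data", "*.example.com"))  # False
```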
BZ#713243
Previously, rgmanager inappropriately called the rg_wait_threads() function during cluster reconfiguration. This could lead to an internal deadlock in rgmanager, which caused the cluster services to become unresponsive. This incorrect call has been removed from the code, and deadlocks no longer occur during cluster reconfiguration.
BZ#722230
Resource Group Manager did not properly handle service status reporting in certain situations within a multi-node cluster with a restricted failover domain defined. Consequently, if a service failover failed because there was an exclusive service running on the only suitable standby node, rgmanager reported the failed service as started on an offline node. This update modifies Resource Group Manager's event handling so a failed service is now correctly reported as stopped in this scenario.
BZ#743442
Resource Group Manager did not handle inter-service dependencies correctly. Therefore, if a service was dependent on another service that was running on the same cluster node, the dependent service became unresponsive during the service failover and remained in the recovering state. With this update, rgmanager has been modified to check a service state during failover and stop the service if it is dependent on the service that is failing over. Resource Group Manager then tries to start this dependent service on other nodes as expected.
BZ#752486
A rare race condition could occur when rgmanager received a request to start a new resource group thread while another thread was exiting. This race condition could cause a Time of Check to Time of Use (TOC/TOU) bug, which under certain circumstances resulted in an attempt to access previously-freed memory. As a consequence, rgmanager terminated unexpectedly with a segmentation fault. To avoid the TOC/TOU problem, rgmanager now checks the status of the resource group thread before attempting to use the thread. This ensures that the thread is referred to correctly and Resource Group Manager thus no longer crashes in this scenario.
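A common way to close a check-then-use window is to perform the check and the use inside one critical section. The following Python sketch shows that general pattern; it is not the rgmanager implementation (which is written in C), and the ThreadRegistry class and its method names are hypothetical.

```python
import threading


class ThreadRegistry:
    """Tracks per-group request queues; a queue exists only while its worker is alive."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queues = {}  # group name -> list of pending requests

    def register(self, group):
        with self._lock:
            self._queues[group] = []

    def retire(self, group):
        """Called by a worker as it exits; its queue must not be used afterwards."""
        with self._lock:
            self._queues.pop(group, None)

    def submit(self, group, request):
        """Check that the worker still exists and enqueue in the same critical section."""
        with self._lock:
            queue = self._queues.get(group)
            if queue is None:
                return False  # worker already exited; caller must start a new one
            queue.append(request)
            return True


registry = ThreadRegistry()
registry.register("service:web")
accepted = registry.submit("service:web", "start")   # worker alive: request queued
registry.retire("service:web")
rejected = registry.submit("service:web", "start")   # worker gone: nothing is touched
```

Because the lookup and the append share the lock with retire(), a request can never be attached to a queue that an exiting worker has already discarded.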
BZ#768146
Resource Group Manager failed to stop a resource if it was located on an unmounted file system. As a result of this failure, rgmanager treated the resource as missing and marked the appropriate service as failed, which prevented the cluster from recovering the service. This update allows rgmanager to ignore this error if the resource has not been previously started with a service. The service can now be properly started on a different host.
BZ#743214
Under certain circumstances, a stopped event could be processed after a service and its dependent services had already been restarted. This forced the dependent services to be restarted erroneously. This update allows rgmanager to ignore stopped events if the dependent services have already been started, and the services are no longer restarted unnecessarily.
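The idea of discarding a stale stopped event can be sketched as follows. This is a simplified model, not rgmanager's event handler: the state strings, the service names, and the dependents_to_restart() function are assumptions made for illustration.

```python
def dependents_to_restart(service, dependents, states):
    """Decide which dependents to restart when a 'stopped' event for `service` arrives.

    `states` maps a service name to its current state string (hypothetical model).
    """
    already_recovered = states.get(service) == "started" and all(
        states.get(dep) == "started" for dep in dependents
    )
    if already_recovered:
        return []  # stale event: everything was restarted before it was processed
    return list(dependents)


# The event arrives late: both services are already running again, so nothing is restarted.
late = dependents_to_restart("db", ["web"], {"db": "started", "web": "started"})
# The event is current: the parent is still stopped, so the dependent is restarted.
current = dependents_to_restart("db", ["web"], {"db": "stopped", "web": "started"})
```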
BZ#769731
Due to changes in the behavior of the LVM commands, failed devices could no longer be removed from a volume group (VG) in the same way as before. As a result, cluster services could not be relocated because the affected VG and its logical volumes (LVs) could not be modified while the failed device was present in the VG. This update adds the command that is now required to remove the failed physical volume from the VG. Services running on affected LVs can now be relocated correctly.
BZ#744283
When running multiple oracledb resource instances at the same time, several instances could attempt to write to a shared log file at the same moment. This caused all but one resource to fail and the log file to become corrupted. With this update, rgmanager uses a separate log file for each oracledb resource instance.
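Deriving the log file name from the instance name is one simple way to give each instance its own file. The Python sketch below illustrates that approach only; the directory, the file name format, and the log_path() function are hypothetical and do not reflect the paths used by the oracledb resource agent.

```python
import os


def log_path(log_dir, instance_name):
    """Build a per-instance log file path (hypothetical naming scheme)."""
    # Replace characters that are unsafe in file names so each instance maps to one file.
    safe = "".join(ch if ch.isalnum() or ch in "-_." else "_" for ch in instance_name)
    return os.path.join(log_dir, f"oracledb-{safe}.log")


# Two instances no longer share a file, so concurrent writes cannot interleave.
path_a = log_path("/var/log/cluster", "ORCL1")
path_b = log_path("/var/log/cluster", "ORCL2")
```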

Enhancement

BZ#747352
The SAPDatabase resource agent shipped with the Red Hat Enterprise Linux High Availability add-on was out of sync with the upstream version. This could cause Resource Group Manager to fail to manage SAP instances properly. This update applies multiple upstream patches, which provide several bug fixes and enhancements, including the following:
  • The scope of the internal rc variable has been corrected in several internal functions.
  • The Oracle recovery method has been changed from "recover automatic database" to "end backup".
  • The process search pattern has been adjusted for DB2 version 9.5.
  • The Oracle listener service is now started only if some database processes have been found.
  • The eval command is no longer used to start a new process when unnecessary.
The updated SAPDatabase resource agent improves the handling of SAP database instances in a Red Hat cluster environment.
All users of rgmanager are advised to upgrade to this updated package, which fixes these bugs and adds this enhancement.