7.30. corosync

Updated corosync packages that fix several bugs and add multiple enhancements are now available for Red Hat Enterprise Linux 6.
The corosync packages provide the Corosync Cluster Engine and C Application Programming Interfaces (APIs) for Red Hat Enterprise Linux cluster software.

Bug Fixes

BZ#783068
Prior to this update, the corosync-notifyd service did not run after restarting the process. This update modifies the init script to wait for the actual exit of previously running instances of the process. Now, the corosync-notifyd service runs as expected after restarting.
BZ#786735
Prior to this update, an incorrect node ID was sent in recovery messages when corosync entered recovery. As a consequence, debugging problems in the source code was difficult. This update sets the correct node ID.
BZ#786737
Upon receiving the JoinMSG message in the OPERATIONAL state, a node enters the GATHER state. However, if the JoinMSG was discarded, the nodes that sent it could not receive a response until the tokens of the other nodes had expired. As a result, nodes that had entered the GATHER state took longer to rejoin the ring. With this update, the underlying source code has been modified to address this issue.
BZ#787789
Prior to this update, when the netfilter firewall blocked incoming and outgoing multicast packets, corosync could become suspended and fail to create a membership, and the cluster could not be used. With this update, corosync no longer depends on the kernel's multicast loop feature for local message delivery; it uses a socketpair UNIX datagram socket instead.
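The idea of delivering local messages over a connected pair of UNIX datagram sockets, rather than through the network stack, can be illustrated with a minimal Python sketch. This is not the corosync implementation (which is written in C); the function name local_loopback_demo is hypothetical and exists only for illustration.

```python
import socket


def local_loopback_demo(payload: bytes) -> bytes:
    """Send one datagram through a UNIX socket pair and return what arrives.

    Hypothetical helper for illustration only; not part of corosync.
    """
    # socketpair() returns two already-connected endpoints. Delivery happens
    # inside the kernel's UNIX-domain socket layer, so no multicast loopback
    # and no IP firewall rule is involved.
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sender.send(payload)
        return receiver.recv(65536)
    finally:
        sender.close()
        receiver.close()
```

Because the two endpoints never touch an IP interface, a firewall rule that drops multicast traffic has no effect on this delivery path.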
BZ#794744
Previously, on InfiniBand devices, corosync autogenerated the node ID even when the configuration file or the cluster manager (cman) had already set one. This update modifies the underlying code to recognize user-set node IDs. Now, corosync autogenerates node IDs only when the user has not entered one.
BZ#821352
Prior to this update, corosync sockets were bound to a peer's IP address instead of the local IP address when the IP address was configured as peer-to-peer (netmask /32). As a consequence, corosync was unable to create memberships. This update modifies the underlying code to use the correct information about the local IP address.
BZ#824902
Prior to this update, the corosync logic always used the first IP address that was found. As a consequence, users could not use more than one IP address on the same network. This update modifies the logic to use the first network address if no exact match was found. Now, users can bind to the IP address they select.
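The selection rule described for BZ#824902 (prefer an exact match, otherwise fall back to the first address on the matching network) can be sketched in Python as follows. This is an illustrative sketch only, assuming addresses are given in "ip/prefix" form; the function and parameter names (pick_bind_address, requested, local_interfaces) are hypothetical and do not come from the corosync source.

```python
import ipaddress


def pick_bind_address(requested, local_interfaces):
    """Return the local IP address to bind to, or None if nothing matches.

    requested        -- address from the configuration (string)
    local_interfaces -- local addresses as "ip/prefix" strings, in order
    Hypothetical helper for illustration only.
    """
    wanted = ipaddress.ip_address(requested)
    ifaces = [ipaddress.ip_interface(item) for item in local_interfaces]

    # First pass: an exact match on a local address wins.
    for iface in ifaces:
        if iface.ip == wanted:
            return str(iface.ip)

    # Second pass: no exact match, so take the first local address whose
    # network contains the requested address.
    for iface in ifaces:
        if wanted in iface.network:
            return str(iface.ip)

    return None
```

With two addresses on the same network, for example 192.168.1.10/24 and 192.168.1.11/24, requesting 192.168.1.11 selects that exact address, while requesting the network address 192.168.1.0 falls back to the first one.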
BZ#827100
Prior to this update, some sockets were not bound to a concrete IP address but listened on all interfaces in the UDPU mode. As a consequence, users could encounter problems when configuring the firewall. This update binds all sockets correctly.
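The difference between a socket bound to a concrete IP address and one listening on all interfaces can be seen in a short Python sketch. It only illustrates the general socket behavior behind BZ#827100; the helper name bind_udp is hypothetical, and the loopback address is used purely so the example runs anywhere.

```python
import socket


def bind_udp(address: str, port: int = 0) -> socket.socket:
    """Create a UDP socket bound to the given local address.

    Hypothetical helper for illustration only. Port 0 lets the kernel
    choose a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((address, port))
    return sock


# Bound to one concrete address: only traffic sent to that address arrives,
# so a firewall rule can be written for a single address and port.
specific = bind_udp("127.0.0.1")

# Bound to the wildcard address: the socket listens on every interface.
wildcard = bind_udp("0.0.0.0")

specific_addr = specific.getsockname()[0]
wildcard_addr = wildcard.getsockname()[0]
specific.close()
wildcard.close()
```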
BZ#847232
Prior to this update, configuration file names that consisted of more than 255 characters could cause corosync to abort unexpectedly. This update returns the complete item value. In case of the old ABI, corosync prints an error. Now, corosync no longer aborts with longer names.
BZ#838524
When corosync was running with the votequorum library enabled, votequorum's register reloaded the configuration handler after each change in the configuration database (confdb). This caused corosync to run slower and to eventually encounter an Out Of Memory error. After this update, a register callback is only performed during startup. As a result, corosync no longer slows down or encounters an Out Of Memory error.
BZ#848210
Prior to this update, corosync-notifyd output was considerably slow and corosync memory usage grew when D-Bus output was enabled. Memory was not freed when corosync-notifyd was closed. This update modifies the corosync-notifyd event handler so that it does not wait when there is nothing to receive from or send to D-Bus. Now, corosync frees memory when the IPC client exits, and corosync-notifyd produces output at the speed of incoming events.
BZ#830799
Previously, the node cluster did not correspond with the CPG library membership. Consequently, the nodes were recognized as unknown, and corosync warning messages were not returned. A patch with an enhanced log from CPG has been provided to fix this bug. Now, the nodes work with CPG correctly, and appropriate warning messages are returned.
BZ#902397
Due to a regression, the corosync utility did not work with IPv6, which caused the network interface to be down. A patch has been provided to fix this bug. Corosync now works with IPv6 as expected, and the network interface is up.
BZ#865039
Previously, during heavy cluster operations, one of the nodes failed, sending numerous instances of the following message to the syslog file:
dlm_controld[32123]: cpg_dispatch error 2
A patch has been applied to address this issue.
BZ#850757
Prior to this update, corosync dropped ORF tokens together with memb_join packets when using CPU timing on certain networks. As a consequence, the RRP interface could be wrongly marked as faulty. This update drops only memb_join messages.
BZ#861032
Prior to this update, the corosync.conf parser failed if the ring number was larger than the allowed maximum of 1. As a consequence, corosync could abort with a segmentation fault. This update adds a check to the corosync.conf parser. Now, an error message is printed if the ring number is larger than 1.
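The kind of range check described for BZ#861032 can be sketched in Python. This is not the corosync.conf parser itself; the names MAX_RING_NUMBER and parse_ring_number and the wording of the error messages are assumptions made for illustration. Only the maximum of 1 comes from the text above.

```python
MAX_RING_NUMBER = 1  # allowed maximum stated in the erratum text


def parse_ring_number(value: str) -> int:
    """Validate a ring number read from a configuration file.

    Hypothetical helper for illustration only. Raises ValueError with a
    readable message instead of letting an out-of-range value through.
    """
    try:
        number = int(value)
    except ValueError:
        raise ValueError(f"ring number {value!r} is not an integer") from None

    # Reject values outside 0..MAX_RING_NUMBER before they are used as an
    # index anywhere else.
    if not 0 <= number <= MAX_RING_NUMBER:
        raise ValueError(
            f"ring number {number} is out of range (0..{MAX_RING_NUMBER})"
        )
    return number
```

Validating the value at parse time turns what would otherwise be an out-of-bounds access later on into an error message the administrator can act on.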
BZ#863940
Prior to this update, when corosync was stopped on multiple nodes, it could, under certain circumstances, abort with a segmentation fault. This update ensures that the corosync service no longer calls callbacks on unloaded services.
BZ#869609
Prior to this update, corosync could abort with a segmentation fault when a large number of corosync nodes were started together. This update modifies the underlying code to ensure that the NULL pointer is not dereferenced. Now, corosync no longer encounters segmentation faults when starting multiple nodes at the same time.
BZ#876908
Prior to this update, running the corosync-objctl command with additional parameters could cause the error "Error reloading DB 11". This update removes the reloading function and instead handles changes to objects in the configuration database (confdb). Now, the logging level can be changed as expected.
BZ#873059
Several typos in the corosync(8) manual page have been fixed. Also, manual pages for confdb_* functions have been added.

Enhancements

BZ#770455
With this update, the corosync log includes the hostname and the process ID of the processes that join the cluster to allow for better troubleshooting.
BZ#794522
This update adds the manual page confdb_keys.8 to provide descriptions for corosync runtime statistics that are returned by corosync-objctl.
BZ#838743
This update adds a new trace level that filters corosync flow messages to improve debugging.
Users of corosync are advised to upgrade to these updated packages, which fix these bugs and add these enhancements.
An updated corosync package that fixes several bugs is now available for Red Hat Enterprise Linux 6.
The Corosync packages provide the Corosync Cluster Engine and C Application Programming Interfaces (APIs) for Red Hat Enterprise Linux cluster software.

Bug Fix

BZ#929101
When running applications which used the Corosync IPC library, some messages in the dispatch() function were lost or duplicated. This update properly checks the return values of the dispatch_put() function, returns the correct remaining bytes in the IPC ring buffer, and ensures that the IPC client is correctly informed about the real number of messages in the ring buffer. Now, messages in the dispatch() function are no longer lost or duplicated.
Users of corosync are advised to upgrade to these updated packages, which fix this bug.