Stopping corosync causes corosync-notifyd to exit and restart, restarting corosync in the process

Issue

  • corosync is still running after executing pcs cluster stop.
  • The pcs cluster stop or systemctl stop corosync command causes corosync-notifyd.service to exit and restart, which causes corosync.service to start again in the process.
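  • A node rejoins the cluster membership immediately after the cluster is stopped on that node.

In the following log excerpts, Node 1 is the node where the cluster was stopped and Node 2 is a peer that observes the membership change.

# Node 1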
Aug 28 07:34:20 fastvm-rhel-7-6-21 systemd: Stopping Corosync Cluster Engine...
Aug 28 07:34:20 fastvm-rhel-7-6-21 corosync[31264]: [MAIN  ] Node was shut down by a signal
Aug 28 07:34:20 fastvm-rhel-7-6-21 corosync: Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
Aug 28 07:34:20 fastvm-rhel-7-6-21 corosync[31264]: [SERV  ] Unloading all Corosync service engines.
...
Aug 28 07:34:20 fastvm-rhel-7-6-21 corosync[31264]: [MAIN  ] Corosync Cluster Engine exiting normally
Aug 28 07:34:20 fastvm-rhel-7-6-21 systemd: corosync-notifyd.service: main process exited, code=exited, status=1/FAILURE
Aug 28 07:34:20 fastvm-rhel-7-6-21 systemd: Unit corosync-notifyd.service entered failed state.
Aug 28 07:34:20 fastvm-rhel-7-6-21 systemd: corosync-notifyd.service failed.
Aug 28 07:34:20 fastvm-rhel-7-6-21 systemd: corosync-notifyd.service holdoff time over, scheduling restart.
Aug 28 07:34:20 fastvm-rhel-7-6-21 systemd: Stopped Corosync Dbus and snmp notifier.
Aug 28 07:34:21 fastvm-rhel-7-6-21 corosync: Waiting for corosync services to unload:.[  OK  ]
Aug 28 07:34:21 fastvm-rhel-7-6-21 systemd: Stopped Corosync Cluster Engine.
Aug 28 07:34:21 fastvm-rhel-7-6-21 systemd: Started Corosync Dbus and snmp notifier.
Aug 28 07:34:21 fastvm-rhel-7-6-21 notifyd[31314]: [error] Failed to initialize the cmap API. Error 2
Aug 28 07:34:21 fastvm-rhel-7-6-21 systemd: corosync-notifyd.service: main process exited, code=exited, status=1/FAILURE
Aug 28 07:34:21 fastvm-rhel-7-6-21 systemd: Unit corosync-notifyd.service entered failed state.
Aug 28 07:34:21 fastvm-rhel-7-6-21 systemd: corosync-notifyd.service failed.
Aug 28 07:34:21 fastvm-rhel-7-6-21 systemd: corosync-notifyd.service holdoff time over, scheduling restart.
Aug 28 07:34:21 fastvm-rhel-7-6-21 systemd: Stopped Corosync Dbus and snmp notifier.
Aug 28 07:34:21 fastvm-rhel-7-6-21 systemd: Starting Corosync Cluster Engine...

# Node 2
Aug 27 22:34:20 fastvm-rhel-7-6-22 corosync[3767]: [TOTEM ] A new membership (192.168.22.22:240167) was formed. Members left: 1
Aug 27 22:34:20 fastvm-rhel-7-6-22 corosync[3767]: [CPG   ] downlist left_list: 1 received
Aug 27 22:34:20 fastvm-rhel-7-6-22 corosync[3767]: [QUORUM] Members[1]: 2
Aug 27 22:34:20 fastvm-rhel-7-6-22 corosync[3767]: [MAIN  ] Completed service synchronization, ready to provide service.
...
Aug 27 22:34:22 fastvm-rhel-7-6-22 corosync[3767]: [TOTEM ] A new membership (192.168.22.21:240172) was formed. Members joined: 1
Aug 27 22:34:22 fastvm-rhel-7-6-22 corosync[3767]: [CPG   ] downlist left_list: 0 received
Aug 27 22:34:22 fastvm-rhel-7-6-22 corosync[3767]: [CPG   ] downlist left_list: 0 received
Aug 27 22:34:22 fastvm-rhel-7-6-22 corosync[3767]: [QUORUM] Members[2]: 1 2
Aug 27 22:34:22 fastvm-rhel-7-6-22 corosync[3767]: [MAIN  ] Completed service synchronization, ready to provide service.
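
The Node 1 sequence above (corosync stopped, a corosync-notifyd restart scheduled, corosync starting again within the same second) can be searched for in a saved syslog file. The following is a minimal sketch of such a check, not part of the original solution: the function name find_unexpected_restarts and the window_seconds threshold are hypothetical, the message strings are copied from the excerpt above, and the placeholder year exists only so that the year-less syslog timestamps can be parsed.

```python
import re
from datetime import datetime

# Message texts copied verbatim from the Node 1 excerpt.
STOPPED = "Stopped Corosync Cluster Engine."
STARTING = "Starting Corosync Cluster Engine..."
NOTIFYD_RESTART = "corosync-notifyd.service holdoff time over, scheduling restart."

# Classic syslog layout: "Aug 28 07:34:20 host tag: message"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+([^:]+):\s(.*)$")


def find_unexpected_restarts(lines, window_seconds=5):
    """Return (stopped_at, starting_at) pairs where systemd started corosync
    within window_seconds of stopping it, and a corosync-notifyd restart was
    scheduled within the same window before the start."""
    findings = []
    stopped_at = None
    last_notifyd_restart = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        stamp, _host, tag, message = match.groups()
        # Only systemd's own messages matter here ("systemd" or "systemd[1]").
        if not tag.startswith("systemd"):
            continue
        # Syslog timestamps carry no year; 2000 is an arbitrary placeholder.
        when = datetime.strptime("2000 " + stamp, "%Y %b %d %H:%M:%S")
        if message == NOTIFYD_RESTART:
            last_notifyd_restart = when
        elif message == STOPPED:
            stopped_at = when
        elif message == STARTING and stopped_at is not None:
            since_stop = (when - stopped_at).total_seconds()
            notifyd_recent = (
                last_notifyd_restart is not None
                and 0 <= (when - last_notifyd_restart).total_seconds() <= window_seconds
            )
            if 0 <= since_stop <= window_seconds and notifyd_recent:
                findings.append((stopped_at, when))
            stopped_at = None
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python check_restart.py /path/to/saved/messages
    with open(sys.argv[1], errors="replace") as handle:
        for stopped, started in find_unexpected_restarts(handle):
            print("corosync stopped at", stopped.strftime("%b %d %H:%M:%S"),
                  "and started again at", started.strftime("%b %d %H:%M:%S"))
```

A match only indicates that the log shows the same pattern as the excerpt. A deliberate start issued by an administrator within the same window would be reported too, so the threshold is best kept small.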

Environment

  • Red Hat Enterprise Linux 7 (with the High Availability Add-on)
