RHEL 7 High Availability cluster nodes frequently getting fenced and lrmd reporting "error: crm_abort: lrmd_ipc_dispatch: Triggered assert at main.c:123 : flags & crm_ipc_client_response" and segfaulting
Issue
- My 3 nodes are all rebooting in a loop and lrmd seems to be constantly segfaulting.
- The nodes in my cluster won't stop fencing each other and I see lrmd reporting "Triggered assert at main.c:123 : flags & crm_ipc_client_response" and segfaulting:
Jul 6 08:57:02 node1 lrmd[4164]: error: crm_abort: lrmd_ipc_dispatch: Triggered assert at main.c:123 : flags & crm_ipc_client_response
Jul 6 08:57:02 node1 lrmd[4164]: error: lrmd_ipc_dispatch: Invalid client request: 0x1219ce0
- I see constant repeating errors from lrmd about notifications failing and crmd crashing after "crit: lrm_connection_destroy: LRM Connection failed":
Jul 6 08:57:12 node1 crmd[33886]: crit: lrm_connection_destroy: LRM Connection failed
Jul 6 08:57:12 node1 crmd[33886]: warning: do_update_resource: Resource pcmk-node1 no longer exists in the lrmd
Jul 6 08:57:12 node1 lrmd[4164]: warning: qb_ipcs_event_sendv: new_event_notification (4164-33886-8): Bad file descriptor (9)
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 crmd[33886]: notice: process_lrm_event: Operation pcmk-node1_stop_0: ok (node=pcmk-node1, call=2, rc=0, cib-update=0, confirmed=true)
Jul 6 08:57:12 node1 attrd[4166]: notice: attrd_peer_remove: Removing all pcmk-node1 attributes for pcmk-node1
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 crmd[33886]: error: do_log: FSA: Input I_ERROR from lrm_connection_destroy() received in state S_NOT_DC
Jul 6 08:57:12 node1 crmd[33886]: notice: do_state_transition: State transition S_NOT_DC -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=lrm_connection_destroy ]
Jul 6 08:57:12 node1 crmd[33886]: warning: do_recover: Fast-tracking shutdown in response to errors
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 crmd[33886]: error: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 attrd[4166]: notice: attrd_peer_remove: Removing all pcmk-slnec1ctl2 attributes for pcmk-slnec1ctl2
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 crmd[33886]: notice: do_lrm_control: Disconnected from the LRM
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 crmd[33886]: notice: terminate_cs_connection: Disconnecting from Corosync
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 crmd[33886]: error: crmd_fast_exit: Could not recover from internal error
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 lrmd[4164]: warning: send_client_notify: Notification of client crmd/8988b67f-ff65-4a39-a330-69efcbf12567 failed
Jul 6 08:57:12 node1 pacemakerd[4050]: error: pcmk_child_exit: The crmd process (33886) exited: Generic Pacemaker error (201)
Jul 6 08:57:12 node1 pacemakerd[4050]: notice: pcmk_process_exit: Respawning failed child process: crmd
Jul 6 08:57:12 node1 crmd[36596]: notice: crm_add_logfile: Additional logging available in /var/log/pacemaker.log
Environment
- Red Hat Enterprise Linux (RHEL) 7 with the High Availability Add On
- One or more stonith devices in the CIB has a name (ID) matching the name of one of the cluster nodes.
- The cluster node name comes either from corosync, as specified in /etc/corosync/corosync.conf, or, if the nodes are specified by IP address in that file, the hostname (uname -n output) of the node is used as the name.
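As an illustration only (the node names, fence agent, and fence parameters below are hypothetical), this condition would look like a corosync.conf nodelist entry and a stonith device created with the same ID:

# /etc/corosync/corosync.conf -- cluster node names taken from the nodelist
nodelist {
    node {
        ring0_addr: node1.example.com
        nodeid: 1
    }
    node {
        ring0_addr: node2.example.com
        nodeid: 2
    }
    node {
        ring0_addr: node3.example.com
        nodeid: 3
    }
}

# Stonith device whose ID collides with the node name "node1.example.com"
# pcs stonith create node1.example.com fence_ipmilan \
#     pcmk_host_list="node1.example.com" ipaddr=192.0.2.10 login=admin passwd=secret

One way to check for this overlap is to compare the cluster node names (crm_node -l, or the nodelist in /etc/corosync/corosync.conf) against the stonith device IDs reported by pcs stonith show.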