lrmd segfaults and the logs report "crm_ipc_read: Connection to lrmd failed" in a RHEL 6 High Availability cluster with pacemaker

Solution In Progress - Updated -

Issue

  • My services failed over in my cluster after the logs indicated that lrmd dumped core
  • lrmd is segfaulting after reporting a long string of garbage characters in the log
Oct 20 04:00:20 [22743] node1       lrmd:     info: log_finished:   finished - rsc:H‰\$ÐH‰l$ØH‰ûL‰d$àL‰l$èH‰ÕL‰t$ðL‰|$øHƒìxH…ÿI‰÷A‰ÌE‰ÆM‰Í„€ action:H‰\$èH‰l$ðH‰ûL‰d$øHƒìH…ÿ„± call_id:-1387311296 pid:51 exit-code:0 exec-time:0ms queue-time:-23040ms
Oct 20 04:00:20 [22743] node1       lrmd:    error: crm_xml_err:    XML Error: string is not in UTF-8
Oct 20 04:00:20 [22744] node1       crmd:    error: crm_ipc_read:   Connection to lrmd failed
Oct 20 04:00:20 [6855] node1 pacemakerd:    error: child_waitpid:   Managed process 22743 (lrmd) dumped core
Oct 20 04:00:20 [6855] node1 pacemakerd:   notice: pcmk_child_exit:     Child process lrmd terminated with signal 11 (pid=22743, core=1)
Oct 20 04:00:20 [22744] node1       crmd:    error: mainloop_gio_callback:  Connection to lrmd[0x184e540] closed (I/O condition=17)
Oct 20 04:00:20 [22744] node1       crmd:     info: lrmd_ipc_connection_destroy:    IPC connection destroyed
Oct 20 04:00:20 [22744] node1       crmd:     crit: lrm_connection_destroy:     LRM Connection failed
Oct 20 04:00:20 [22744] node1       crmd:    error: do_log:     FSA: Input I_ERROR from lrm_connection_destroy() received in state S_NOT_DC
Oct 20 04:00:20 [22744] node1       crmd:   notice: do_state_transition:    State transition S_NOT_DC -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=lrm_connection_destroy ]
Oct 20 04:00:20 [22744] node1       crmd:  warning: do_recover:     Fast-tracking shutdown in response to errors
Oct 20 04:00:20 [22744] node1       crmd:    error: do_log:     FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
Oct 20 04:00:20 [22744] node1       crmd:     info: do_state_transition:    State transition S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE cause=C_FSA_INTERNAL origin=do_recover ]
Oct 20 04:00:20 [22744] node1       crmd:     info: do_shutdown:    Disconnecting STONITH...
Oct 20 04:00:20 [6855] node1 pacemakerd:   notice: pcmk_process_exit:   Respawning failed child process: lrmd
  • lrmd is crashing repeatedly, and its backtrace shows the failure in stonith_dispatch_internal
  • I'm frequently seeing "Connection to lrmd failed" in the logs
Nov 03 06:00:13 [4805] node1       crmd:    error: crm_ipc_read:    Connection to lrmd failed
Nov 03 06:00:13 [4805] node1       crmd:    error: mainloop_gio_callback:   Connection to lrmd[0x965c40] closed (I/O condition=17)
Nov 03 06:00:13 [4786] node1 pacemakerd:    error: child_waitpid:   Managed process 4802 (lrmd) dumped core
Nov 03 06:00:13 [4805] node1       crmd:     info: lrmd_ipc_connection_destroy:     IPC connection destroyed
Nov 03 06:00:13 [4805] node1       crmd:     crit: lrm_connection_destroy:  LRM Connection failed
Nov 03 06:00:13 [4786] node1 pacemakerd:   notice: pcmk_child_exit:     Child process lrmd terminated with signal 11 (pid=4802, core=1)
Nov 03 06:00:13 [4805] node1       crmd:    error: do_log:  FSA: Input I_ERROR from lrm_connection_destroy() received in state S_TRANSITION_ENGINE
Nov 03 06:00:13 [4805] node1       crmd:  warning: do_state_transition:     State transition S_TRANSITION_ENGINE -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=lrm_connection_destroy ]
Nov 03 06:00:13 [4805] node1       crmd:  warning: do_recover:  Fast-tracking shutdown in response to errors
Nov 03 06:00:13 [4786] node1 pacemakerd:   notice: pcmk_process_exit:   Respawning failed child process: lrmd
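
To check whether a node matches the symptoms above, it can help to pull every lrmd crash record out of the cluster log. The following is a minimal sketch, not an official tool: it assumes the log lines have the same layout as the excerpts above, and the function name find_daemon_crashes and the log path passed on the command line are illustrative only.

```python
import re
import sys

# Pattern taken from the pcmk_child_exit lines in the excerpts above.
# Assumption: other pacemaker versions may word this message differently.
SIGNAL_EXIT = re.compile(
    r"pcmk_child_exit:\s+Child process (\w+) terminated with signal (\d+) "
    r"\(pid=(\d+), core=(\d)\)"
)


def find_daemon_crashes(lines, daemon="lrmd"):
    """Return one record per signal-terminated exit of the given daemon."""
    crashes = []
    for line in lines:
        match = SIGNAL_EXIT.search(line)
        if match and match.group(1) == daemon:
            crashes.append({
                # Everything before the first " [" is the timestamp.
                "timestamp": line.split(" [", 1)[0],
                "signal": int(match.group(2)),
                "pid": int(match.group(3)),
                "core": match.group(4) == "1",
            })
    return crashes


if __name__ == "__main__":
    # The log file path is supplied by the caller; no default is assumed.
    # errors="replace" keeps the scan going past the non-UTF-8 bytes that
    # lrmd writes to the log just before it crashes.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for crash in find_daemon_crashes(log):
            print(crash)
```

Repeated records with signal 11 and core set to True indicate the same segfault pattern shown in the excerpts.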

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • pacemaker
