lrmd reporting "could not establish connection with attrd" following an attrd crash and respawn in a RHEL 6 High Availability cluster with pacemaker

Solution In Progress - Updated -

Issue

  • The attrd daemon is crashing with a segmentation fault.
  • lrmd reports that it could not establish a connection with attrd.
  • We're seeing repeated crashes from attrd every time we start pacemaker after updating the package on one node
  • attrd appears to crash and pacemakerd respawns it; shortly afterwards we see "could not establish connection with attrd" errors, and SysInfo resources time out on start or status
Mar 23 08:58:39 node1 kernel: attrd[2911]: segfault at 0 ip 000000325bc13394 sp 00007fffeb51fd70 error 4 in libcrmcommon.so.3.2.0[325bc00000+4e000]
Mar 23 08:58:39 node1 abrtd: Directory 'ccpp-2015-03-23-08:58:39-2911' creation detected
Mar 23 08:58:39 node1 abrt[13484]: Saved core dump of pid 2911 (/usr/libexec/pacemaker/attrd) to /var/spool/abrt/ccpp-2015-03-23-08:58:39-2911 (6914048 bytes)
Mar 23 08:58:39 node1 pacemakerd[2900]:    error: child_waitpid: Managed process 2911 (attrd) dumped core
Mar 23 08:58:39 node1 pacemakerd[2900]:   notice: pcmk_child_exit: Child process attrd terminated with signal 11 (pid=2911, core=1)
Mar 23 08:58:39 node1 pacemakerd[2900]:   notice: pcmk_process_exit: Respawning failed child process: attrd
  • attrd fails to coordinate updates with the CIB
  • attrd segfaulted, and afterwards lrmd started reporting that it could not establish a connection with attrd
Mar 12 08:39:05 node1 lrmd[8363]:   notice: operation_finished: re-sysinfo_start_0:2748821:stderr [ Could not establish attrd connection: Resource temporarily unavailable (11) ]
Mar 12 08:39:05 node1 crmd[8366]:   notice: process_lrm_event: node1-re-sysinfo_start_0:9840 [ Could not establish attrd connection: Resource temporarily unavailable (11)\nCould not establish attrd connection: Resource temporarily unavailable (11)\nCould not establish attrd connection: Resource temporarily unavailable (11)\nCould not establish attrd connection: Resource temporarily unavailable (11)\nCould not establish attrd connection: Resource temporarily unavailable (11)\nCould not update arch=x86_64: Transport endpoint is not c
Mar 12 08:39:25 node1 lrmd[8363]:   notice: operation_finished: re-sysinfo_start_0:2749481:stderr [ Could not establish attrd connection: Resource temporarily unavailable (11) ]
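
To check whether a node's logs show the same pattern, the messages above can be matched with a small script. The following is a minimal sketch in Python, not a Red Hat supplied tool: the function name `summarize_attrd_failures` and the returned keys are hypothetical, and the two patterns are copied from the log excerpts above.

```python
import re

# Patterns copied from the log excerpts above. The names used here are
# illustrative assumptions and are not part of pacemaker itself.
CRASH_RE = re.compile(
    r"pcmk_child_exit: Child process attrd terminated with signal \d+"
)
CONN_RE = re.compile(r"Could not establish attrd connection")


def summarize_attrd_failures(lines):
    """Count attrd crash notices and log lines reporting attrd connection failures."""
    crashes = 0
    connection_failure_lines = 0
    for line in lines:
        if CRASH_RE.search(line):
            crashes += 1
        if CONN_RE.search(line):
            # A single log line may repeat the message several times;
            # it is still counted once per line.
            connection_failure_lines += 1
    return {
        "attrd_crashes": crashes,
        "connection_failure_lines": connection_failure_lines,
    }
```

The function accepts any iterable of log lines, so it could be fed an open syslog file (for example `/var/log/messages`, assuming the cluster daemons log there). A non-zero crash count together with connection failures would match the symptoms described in this section.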

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add On
  • One or more nodes running pacemaker-1.1.12 and one or more running 1.1.10 in the same cluster
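
Whether a cluster matches this environment can be checked by comparing the pacemaker package version reported on each node, for example from `rpm -q pacemaker`. The sketch below assumes that output has already been collected per node; the function names and the release strings in the usage example are made up for illustration.

```python
import re

# Matches the version field of an RPM package string such as
# "pacemaker-<version>-<release>.<arch>".
VERSION_RE = re.compile(r"^pacemaker-(\d+(?:\.\d+)*)-")


def pacemaker_versions(rpm_output_by_node):
    """Map each node name to its pacemaker version, or None if unparseable."""
    versions = {}
    for node, output in rpm_output_by_node.items():
        match = VERSION_RE.match(output.strip())
        versions[node] = match.group(1) if match else None
    return versions


def is_mixed(versions):
    """Return True when the nodes do not all report the same version."""
    return len(set(versions.values())) > 1


# Usage example. The release and architecture fields below are invented;
# only the 1.1.12 and 1.1.10 versions come from the Environment list.
collected = {
    "node1": "pacemaker-1.1.12-0.el6.x86_64",
    "node2": "pacemaker-1.1.10-0.el6.x86_64",
}
versions = pacemaker_versions(collected)
print(versions, "mixed versions:", is_mixed(versions))
```

A result reporting mixed versions, with 1.1.12 on some nodes and 1.1.10 on others, corresponds to the environment listed above.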
