How can I prevent segfaults, general protection faults, 'Watchdog: Daemon died, rebooting...' messages, and unexpected reboots caused by clurgmgrd or rgmanager in a RHEL 4, 5, or 6 cluster?

Solution Verified

Issue

  • Why do I see "Watchdog: Daemon died, rebooting..." on a cluster node?
  • A cluster node is rebooted unexpectedly by the clurgmgrd watchdog.
  • A general protection fault occurs and a watchdog message is logged, after which the cluster node reboots:
Oct 30 18:30:04 node1 kernel: clurgmgrd[9805] general protection rip:3e9e2729ed rsp:43bc4b50 error:0
Oct 30 18:30:04 node1 clurgmgrd[10484]: <crit> Watchdog: Daemon died, rebooting...
  • An update to our /etc/cluster/cluster.conf file results in a segfault and reboot of a server.
Jun 20 10:46:21 node1 clurgmgrd[12354]: <notice> Reconfiguring
Jun 20 10:46:21 node1 kernel: clurgmgrd[11153]: segfault at 00000000000000c0 rip 0000003d5f842b59 rsp 00000000405f5a20 error 4
Jun 20 10:46:21 node1 clurgmgrd[12353]: <crit> Watchdog: Daemon died, rebooting...
  • rgmanager segfaults
  • clurgmgrd crashes
  • rgmanager has a general protection fault
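
To check whether a node has hit any of the symptoms listed above, the syslog can be scanned for the three message shapes shown in the excerpts. The following is only a minimal sketch: the regular expressions are derived from the sample lines in this article, the exact wording may vary between releases, and the names SIGNATURES, classify_line, and scan_lines are hypothetical helpers introduced for illustration, not part of rgmanager.

```python
import re

# Patterns derived from the log excerpts in this article (an assumption:
# other rgmanager or kernel versions may format these messages differently).
SIGNATURES = {
    "general_protection": re.compile(r"kernel: clurgmgrd\[\d+\]:? general protection"),
    "segfault": re.compile(r"kernel: clurgmgrd\[\d+\]:? segfault at"),
    "watchdog_reboot": re.compile(
        r"clurgmgrd\[\d+\]: <crit> Watchdog: Daemon died, rebooting"
    ),
}


def classify_line(line):
    """Return the name of the first signature that matches the line, or None."""
    for name, pattern in SIGNATURES.items():
        if pattern.search(line):
            return name
    return None


def scan_lines(lines):
    """Return a list of (signature_name, line) pairs for every matching line."""
    hits = []
    for line in lines:
        name = classify_line(line)
        if name is not None:
            hits.append((name, line.rstrip()))
    return hits
```

Passing an open log file to scan_lines (for example, the node's syslog file, wherever it is written on the system in question) returns each matching line together with the symptom it corresponds to. A watchdog_reboot hit that closely follows a segfault or general_protection hit matches the sequences shown in the excerpts above.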

Environment

  • Red Hat Enterprise Linux (RHEL) 5 or 6 with the High Availability Add-On
  • Red Hat Cluster Suite (RHCS) 4
  • rgmanager
