Cluster processes such as rgmanager, gfs2_quotad, etc. are reported as "blocked for more than 120 seconds" in /var/log/messages when using fence_kdump in a RHEL 6 or 7 High Availability cluster
Issue
- Why do I see hung tasks in /var/log/messages when using fence_kdump?
- When fence_kdump is configured, it repeatedly times out and never falls back to the other configured fence device (an example two-level fencing configuration is sketched after the log excerpts below):
Apr 4 13:33:14 node1 fenced[6379]: fencing node node2
Apr 4 13:33:14 node1 fence_kdump[27549]: waiting for message from '192.168.2.12'
Apr 4 13:35:14 node1 fence_kdump[27549]: timeout after 120 seconds
Apr 4 13:35:14 node1 fenced[6379]: fence node2 dev 0.0 agent fence_kdump result: error from agent
Apr 4 13:35:14 node1 fenced[6379]: fence node2 failed
Apr 4 13:35:17 node1 fenced[6379]: fencing node node2
- Cluster processes become blocked whenever a node must be fenced:
Apr 4 13:35:54 node1 kernel: INFO: task gfs2_quotad:6826 blocked for more than 120 seconds.
Apr 4 13:35:54 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 4 13:35:54 node1 kernel: gfs2_quotad D 0000000000000005 0 6826 2 0x00000080
Apr 4 13:35:54 node1 kernel: ffff88031dee5a18 0000000000000046 ffff88033585c080 0000000000000002
Apr 4 13:35:54 node1 kernel: 0000000000000000 ffff88031dee59c0 ffffffff81090d8d 0000000000000078
Apr 4 13:35:54 node1 kernel: ffff88031dee1098 ffff88031dee5fd8 000000000000fb88 ffff88031dee1098
Apr 4 13:35:54 node1 kernel: Call Trace:
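
On a RHEL 6 cman cluster, this scenario corresponds to fence_kdump being listed as the first fence method for a node, with a power fence agent as the fallback method. A minimal cluster.conf sketch follows; the node names, device names, and fence_ipmilan parameters are placeholders for illustration, not values taken from the logs above:

<clusternode name="node2" nodeid="2">
    <fence>
        <!-- Method 1: wait for a kdump message from the node being fenced -->
        <method name="kdump">
            <device name="kdump"/>
        </method>
        <!-- Method 2: power fencing, attempted only after the kdump method reports failure -->
        <method name="power">
            <device name="ipmi-node2"/>
        </method>
    </fence>
</clusternode>
<fencedevices>
    <fencedevice agent="fence_kdump" name="kdump"/>
    <fencedevice agent="fence_ipmilan" name="ipmi-node2" ipaddr="192.168.2.112" login="admin" passwd="example"/>
</fencedevices>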
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add-On
- Applicable to both cman and pacemaker/corosync-based clusters
- One or more nodes configured to use fence_kdump as the agent for one of their fence devices (see the pcs sketch after this list)
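
On a RHEL 7 pacemaker cluster, the equivalent configuration uses stonith levels so that fence_kdump is tried before the power fence. A minimal sketch using pcs, assuming hypothetical device names and placeholder fence_ipmilan parameters:

# Level 1: fence_kdump waits (here 120 seconds, matching the timeout in the
# logs above) for a kdump message from the node being fenced
pcs stonith create kdump-fence fence_kdump pcmk_host_list="node1 node2" timeout=120
# Level 2: power fencing as the fallback (placeholder address and credentials)
pcs stonith create ipmi-node2 fence_ipmilan ipaddr=192.168.2.112 login=admin passwd=example pcmk_host_list=node2
pcs stonith level add 1 node2 kdump-fence
pcs stonith level add 2 node2 ipmi-node2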
