Cluster processes like rgmanager, gfs2_quotad, etc. are reported as "blocked for more than 120 seconds" in /var/log/messages when using fence_kdump in a RHEL 6 or 7 High Availability cluster

Solution Unverified - Updated -

Issue

  • Why do I see hung tasks in /var/log/messages when using fence_kdump?
  • When fence_kdump is configured, it repeatedly times out and never falls back to the other configured fence device:
Apr  4 13:33:14 node1 fenced[6379]: fencing node node2
Apr  4 13:33:14 node1 fence_kdump[27549]: waiting for message from '192.168.2.12'
Apr  4 13:35:14 node1 fence_kdump[27549]: timeout after 120 seconds
Apr  4 13:35:14 node1 fenced[6379]: fence node2 dev 0.0 agent fence_kdump result: error from agent
Apr  4 13:35:14 node1 fenced[6379]: fence node2 failed
Apr  4 13:35:17 node1 fenced[6379]: fencing node node2
  • Cluster processes become blocked whenever a node must be fenced:
Apr  4 13:35:54 node1 kernel: INFO: task gfs2_quotad:6826 blocked for more than 120 seconds.
Apr  4 13:35:54 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr  4 13:35:54 node1 kernel: gfs2_quotad   D 0000000000000005     0  6826      2 0x00000080
Apr  4 13:35:54 node1 kernel: ffff88031dee5a18 0000000000000046 ffff88033585c080 0000000000000002
Apr  4 13:35:54 node1 kernel: 0000000000000000 ffff88031dee59c0 ffffffff81090d8d 0000000000000078
Apr  4 13:35:54 node1 kernel: ffff88031dee1098 ffff88031dee5fd8 000000000000fb88 ffff88031dee1098
Apr  4 13:35:54 node1 kernel: Call Trace:
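
To check whether a node shows the same pattern, the two symptoms above can be searched for together. The following is a minimal sketch in Python, not a Red Hat tool: the function name summarize and the regular expressions are assumptions derived only from the log excerpts shown above, and other releases may word these messages differently.

```python
import re

# Patterns derived from the log excerpts in this article (assumption:
# the message wording on your system matches these excerpts).
FENCE_TIMEOUT = re.compile(r"fence_kdump\[\d+\]: timeout after (\d+) seconds")
HUNG_TASK = re.compile(
    r"kernel: INFO: task (\S+):(\d+) blocked for more than (\d+) seconds"
)


def summarize(lines):
    """Return (number of fence_kdump timeouts, list of (task name, pid))."""
    timeouts = 0
    hung_tasks = []
    for line in lines:
        if FENCE_TIMEOUT.search(line):
            timeouts += 1
        match = HUNG_TASK.search(line)
        if match:
            hung_tasks.append((match.group(1), int(match.group(2))))
    return timeouts, hung_tasks


# Sample input copied from the excerpts above.
SAMPLE = [
    "Apr  4 13:33:14 node1 fenced[6379]: fencing node node2",
    "Apr  4 13:33:14 node1 fence_kdump[27549]: waiting for message from '192.168.2.12'",
    "Apr  4 13:35:14 node1 fence_kdump[27549]: timeout after 120 seconds",
    "Apr  4 13:35:14 node1 fenced[6379]: fence node2 failed",
    "Apr  4 13:35:54 node1 kernel: INFO: task gfs2_quotad:6826 blocked for more than 120 seconds.",
]

timeout_count, hung = summarize(SAMPLE)
```

To run it against a live system, pass an open file instead of SAMPLE, for example summarize(open("/var/log/messages")). Hung-task messages that appear shortly after a fence_kdump timeout are consistent with the issue described here.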

Environment

  • Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add-On
  • Applies to both cman-based and pacemaker/corosync-based clusters
  • One or more nodes configured to use fence_kdump as the agent for one of their fence devices
