Multiple nodes in the cluster panicked after hung task warnings

Solution Unverified - Updated -

Issue

  • Multiple servers in a GFS2 cluster crashed.
  • Several nodes panicked after fencing had been failing for several minutes.
  • Several nodes hit a kernel panic after reporting hung task warnings such as:
Jan  5 23:30:45 node1 kernel: INFO: task gfs2_quotad:25161 blocked for more than 120 seconds.
Jan  5 23:30:45 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan  5 23:30:45 node1 kernel: gfs2_quotad   D ffffffff80156347     0 25161   1291         25169 25160 (L-TLB)
Jan  5 23:30:45 node1 kernel:  ffff81105c2cdcc0 0000000000000046 0000000000000000 ffff810838eb3000
Jan  5 23:30:45 node1 kernel:  0000000000000018 000000000000000a ffff81102eefb7e0 ffff81103f88c040
Jan  5 23:30:45 node1 kernel:  00040df462aee4fd 0000000000007eb8 ffff81102eefb9c8 00000028889bb97c
Jan  5 23:30:45 node1 kernel: Call Trace:
Jan  5 23:30:45 node1 kernel:  [<ffffffff889ba00f>] :dlm:dlm_lock+0x117/0x129
[...]
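
To check how many tasks were reported as hung before a panic, the warning lines can be pulled out of the system log. The following is a minimal sketch, not part of the original solution: it assumes the messages have the same layout as the excerpt above, and the function name find_hung_tasks is hypothetical.

```python
import re

# Pattern for the kernel's hung task warning, based on the log excerpt above.
# Assumption: syslog prefixes each message with "kernel: " as shown there.
HUNG_TASK_RE = re.compile(
    r"kernel: INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return (task name, pid, seconds) for each hung task warning in lines."""
    hits = []
    for line in lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


# Two lines copied from the excerpt above; only the first is a warning.
sample = [
    "Jan  5 23:30:45 node1 kernel: INFO: task gfs2_quotad:25161 "
    "blocked for more than 120 seconds.",
    'Jan  5 23:30:45 node1 kernel: "echo 0 > '
    '/proc/sys/kernel/hung_task_timeout_secs" disables this message.',
]

if __name__ == "__main__":
    print(find_hung_tasks(sample))
```

In practice the lines would come from the node's own log file rather than the inline sample; the location of that file depends on the syslog configuration.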

Environment

  • Red Hat Enterprise Linux (RHEL) 5 (Update 5 or later)
  • Red Hat Enterprise Linux (RHEL) 6, 7, 8, 9
  • High Availability Add-On
  • sysctl parameter kernel.hung_task_panic = 1 (/proc/sys/kernel/hung_task_panic contains value 1)
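
Whether a node matches the last condition can be checked by reading the files named above under /proc/sys/kernel. This is a rough sketch under stated assumptions: the helper names read_hung_task_settings and panics_on_hung_task are hypothetical, and the directory is a parameter only so the logic can be exercised against a scratch directory.

```python
from pathlib import Path


def read_hung_task_settings(proc_sys_kernel="/proc/sys/kernel"):
    """Read the hung task sysctls; a missing or unreadable file yields None."""
    settings = {}
    for name in ("hung_task_panic", "hung_task_timeout_secs"):
        path = Path(proc_sys_kernel) / name
        try:
            settings[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # The file may be absent on kernels without hung task detection.
            settings[name] = None
    return settings


def panics_on_hung_task(settings):
    """True when the node is configured to panic on a hung task warning."""
    return settings.get("hung_task_panic") == 1


if __name__ == "__main__":
    current = read_hung_task_settings()
    print(current)
    print("panics on hung task:", panics_on_hung_task(current))
```

A node where panics_on_hung_task returns True matches the environment described here; a value of None means the setting could not be read on that node.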
