Why is rgmanager locked up and blocking a CPU core while 'dlm: connect from non cluster node' appears in the logs?
Issue
- The process table shows the rgmanager (RHEL 6) or clurmgrd (RHEL 5) process in state D (uninterruptible sleep) on one or more cluster nodes, and clustat does not report the status of rgmanager. The message 'kernel: dlm: connect from non cluster node' is logged, and the kernel also reports rgmanager processes as blocked:
kernel: dlm: Using TCP for communications
kernel: dlm: connect from non cluster node
kernel: dlm: connecting to 1
kernel: INFO: task rgmanager:14872 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: rgmanager D 0000000000000001 0 14872 14870 0x00000000
kernel: ffff880037e7fc88 0000000000000082 0000000000000001 ffff880002115f80
kernel: ffff880037e7fc18 ffffffff8105fe81 ffff880037ce8b78 ffff880002115fe8
kernel: ffff880037ce90f8 ffff880037e7ffd8 000000000000f598 ffff880037ce90f8
kernel: Call Trace:
kernel: [<ffffffff8105fe81>] ? dequeue_entity+0x1a1/0x1e0
kernel: [<ffffffff814db955>] schedule_timeout+0x215/0x2e0
kernel: [<ffffffff814dac27>] ? thread_return+0x4e/0x777
kernel: [<ffffffff81265e0f>] ? kobject_uevent_env+0x20f/0x660
kernel: [<ffffffff814db5d3>] wait_for_common+0x123/0x180
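
The symptoms above can be checked for in a saved log excerpt. The following is a minimal Python sketch, not a Red Hat tool: the function name `scan_log` and the regular expression are assumptions made for illustration, based only on the message formats shown in the excerpt above.

```python
import re

# Matches the hung-task report format shown in the excerpt above, e.g.
# "INFO: task rgmanager:14872 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

# The DLM message named in the title of this article.
DLM_REJECT = "dlm: connect from non cluster node"


def scan_log(lines):
    """Return (hung_tasks, dlm_rejects) for an iterable of log lines.

    hung_tasks is a list of (task_name, pid, seconds) tuples;
    dlm_rejects is the number of lines containing the DLM message.
    """
    hung_tasks = []
    dlm_rejects = 0
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            hung_tasks.append(
                (match.group("name"),
                 int(match.group("pid")),
                 int(match.group("secs")))
            )
        elif DLM_REJECT in line:
            dlm_rejects += 1
    return hung_tasks, dlm_rejects


if __name__ == "__main__":
    sample = [
        "kernel: dlm: Using TCP for communications",
        "kernel: dlm: connect from non cluster node",
        "kernel: dlm: connecting to 1",
        "kernel: INFO: task rgmanager:14872 blocked for more than 120 seconds.",
    ]
    hung, rejects = scan_log(sample)
    for name, pid, secs in hung:
        print(f"hung task: {name} (pid {pid}), blocked > {secs}s")
    print(f"DLM connection rejections: {rejects}")
```

To scan a real log instead of the sample list, pass an open file object to `scan_log`; the log path depends on the syslog configuration of the node.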
Environment
- Red Hat Enterprise Linux Server 5 (with the High Availability Add-On)
- Red Hat Enterprise Linux Server 6 (with the High Availability Add-On)
- Cluster nodes are virtual machines.