rgmanager service won't start on some cluster nodes
Issue
- rgmanager won't start on two cluster nodes.
- Although the affected nodes appear online in the cluster, rgmanager does not start correctly on them.
- The following command shows that clurgmgrd is running, but there is no "start" log entry in /var/log/messages:
$ service rgmanager start
clurgmgrd (pid 9265) is running...
- Services cannot start correctly:
$ clusvcadm -e service:nfsclient-STATION_C
Local machine trying to enable service:nfsclient-STATION_C...Could not connect to resource group manager
- The following backtrace is observed for the rgmanager process:
INFO: task clurgmgrd:9875 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
clurgmgrd D ffff810001028e20 0 9875 9874 (NOTLB)
ffff813821b43da8 0000000000000082 ffff81383d600dc0 ffffffff8002cbe3
0000000000000000 0000000000000001 ffff81383fbfe080 ffff8101c4fe8080
00009ec586b27590 000000000002b8a2 ffff81383fbfe268 0000000422ad1178
Call Trace:
[<ffffffff8002cbe3>] mntput_no_expire+0x19/0x89
[<ffffffff8001398c>] filemap_nopage+0x193/0x360
[<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b
[<ffffffff80063c99>] .text.lock.mutex+0xf/0x14
[<ffffffff8863db68>] :dlm:dlm_new_lockspace+0x2c/0x860
[<ffffffff80022325>] __up_read+0x19/0x7f
[<ffffffff8006723e>] do_page_fault+0x4fe/0x874
[<ffffffff886448ae>] :dlm:device_write+0x438/0x5e5
[<ffffffff80016b68>] vfs_write+0xce/0x174
[<ffffffff80017435>] sys_write+0x45/0x6e
[<ffffffff8005d116>] system_call+0x7e/0x83
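To check whether other nodes show the same symptom, the hung-task messages above can be searched for in the system log. The following is a minimal sketch, not an official diagnostic tool: the function name `hung_clurgmgrd_pids` is hypothetical, and it assumes that the kernel messages are written to /var/log/messages in the same format as the backtrace shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs of clurgmgrd tasks that the kernel
# hung-task detector reported as blocked in the given log file.
# Assumption: log lines look like the one in this article, for example
#   INFO: task clurgmgrd:9875 blocked for more than 120 seconds.
hung_clurgmgrd_pids() {
    log_file="${1:-/var/log/messages}"
    # Extract only the PID from matching lines, then drop duplicates,
    # since the kernel can repeat the warning for the same task.
    sed -n 's/.*INFO: task clurgmgrd:\([0-9][0-9]*\) blocked for more than .*/\1/p' "$log_file" | sort -u
}
```

Calling `hung_clurgmgrd_pids /var/log/messages` on each node lists the affected PIDs, if any. Each PID can then be inspected with `ps -o pid,stat,wchan:32,comm -p <pid>`; a `D` in the STAT column indicates uninterruptible sleep, which would be consistent with the blocked task in the backtrace.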
Environment
- Red Hat Enterprise Linux (RHEL) 5 U6
- Red Hat Cluster Suite (RHCS)