rgmanager blocks or is unable to manage services when using Redundant Ring Protocol (RRP) in a RHEL 6 Update 3 or earlier High Availability cluster

Solution Verified - Updated -

Issue

  • When I start rgmanager it just hangs:
May 15 03:54:51 node1 kernel: INFO: task rgmanager:2420 blocked for more than 120 seconds.
May 15 03:54:51 node1 kernel:      Not tainted 2.6.32-431.3.1.el6.x86_64 #1
May 15 03:54:51 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 15 03:54:51 node1 kernel: rgmanager     D 0000000000000002     0  2420   2418 0x00000000
May 15 03:54:51 node1 kernel: ffff880336a93c98 0000000000000082 0000000000000000 ffff880336a93c5c
May 15 03:54:51 node1 kernel: ffff880300000000 ffff88033fc23480 ffff880028296840 0000000000000200
May 15 03:54:51 node1 kernel: ffff8803356425f8 ffff880336a93fd8 000000000000fbc8 ffff8803356425f8
May 15 03:54:51 node1 kernel: Call Trace:
May 15 03:54:51 node1 kernel: [<ffffffff815287c5>] schedule_timeout+0x215/0x2e0
May 15 03:54:51 node1 kernel: [<ffffffff81527920>] ? thread_return+0x4e/0x76e
May 15 03:54:51 node1 kernel: [<ffffffff81285392>] ? kobject_uevent_env+0x202/0x620
May 15 03:54:51 node1 kernel: [<ffffffff81528443>] wait_for_common+0x123/0x180
May 15 03:54:51 node1 kernel: [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
May 15 03:54:51 node1 kernel: [<ffffffff8152855d>] wait_for_completion+0x1d/0x20
May 15 03:54:51 node1 kernel: [<ffffffffa022ef79>] dlm_new_lockspace+0x999/0xa30 [dlm]
May 15 03:54:51 node1 kernel: [<ffffffffa0236ff1>] device_write+0x311/0x720 [dlm]
May 15 03:54:51 node1 kernel: [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
May 15 03:54:51 node1 kernel: [<ffffffff812263d6>] ? security_file_permission+0x16/0x20
May 15 03:54:51 node1 kernel: [<ffffffff81188f88>] vfs_write+0xb8/0x1a0
May 15 03:54:51 node1 kernel: [<ffffffff81189881>] sys_write+0x51/0x90
May 15 03:54:51 node1 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
  • I can't manage any services with clusvcadm or Conga
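
The symptom above has a recognizable signature in the kernel log: a hung-task report for rgmanager whose call trace passes through dlm_new_lockspace. As a rough illustration only, the following Python sketch scans log text for that pattern. The function name find_dlm_hangs and the regular expression are assumptions made for this example and are not part of any Red Hat tooling.

```python
import re

# Matches the kernel hung-task header, e.g.
# "INFO: task rgmanager:2420 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) blocked for more than \d+ seconds"
)


def find_dlm_hangs(log_text, task="rgmanager"):
    """Return (task, pid) pairs for hung-task reports of `task` whose
    following call trace mentions dlm_new_lockspace.

    Hypothetical helper for illustration; it only inspects the text given.
    """
    hits = []
    current = None  # the hung-task report currently being read, if any
    for line in log_text.splitlines():
        match = HUNG_RE.search(line)
        if match:
            # A new report starts; track it only if it is for the task we want.
            if match.group("task") == task:
                current = (match.group("task"), int(match.group("pid")))
            else:
                current = None
            continue
        if current and "dlm_new_lockspace" in line:
            if current not in hits:
                hits.append(current)
            current = None
    return hits
```

It could be run against the contents of /var/log/messages, for example find_dlm_hangs(open("/var/log/messages").read()). A non-empty result only indicates that the log resembles the trace shown here; it does not confirm the root cause.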

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • cman and clusterlib releases 3.0.12.1-23.el6 and later
  • rgmanager releases prior to 3.0.12.1-17.el6
    • See this solution for a similar issue on later releases of rgmanager
  • Cluster configured to use Redundant Ring Protocol (RRP)
    • An <altname/> element is defined for each node in /etc/cluster/cluster.conf
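
To check whether a cluster matches the RRP condition above, the configuration can be inspected for an altname child under every clusternode. The following is a minimal Python sketch, assuming the usual cluster.conf layout of clusternode elements with a name attribute. The helper names and the node names in the sample are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def altnames_by_node(conf_xml):
    """Map each clusternode name to its altname, or None if it has none."""
    root = ET.fromstring(conf_xml)
    result = {}
    for node in root.iter("clusternode"):
        alt = node.find("altname")
        result[node.get("name")] = alt.get("name") if alt is not None else None
    return result


def uses_rrp(conf_xml):
    """True when at least one node exists and every node defines an altname."""
    mapping = altnames_by_node(conf_xml)
    return bool(mapping) and all(alt is not None for alt in mapping.values())


# Hypothetical two-node configuration, trimmed to the relevant elements.
SAMPLE_CONF = """
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <altname name="node1-alt"/>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <altname name="node2-alt"/>
    </clusternode>
  </clusternodes>
</cluster>
"""

if __name__ == "__main__":
    print(altnames_by_node(SAMPLE_CONF))
    print(uses_rrp(SAMPLE_CONF))
```

On a real node, the text of /etc/cluster/cluster.conf would be passed in place of SAMPLE_CONF. The check covers only the configuration; the installed cman, clusterlib, and rgmanager versions still need to be compared against the releases listed above.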
