RHEV Host cannot access one of the Storage Domains attached to the Data Center
Issue
- Host cannot access one of the Storage Domains attached to the Data Center.
- 4 of 6 hosts became non-operational. The events showed:
    Host cannot access one of the Storage Domains attached to the Data Center. Setting Host state to Non-Operational.
- Two VMs were paused and were then manually shut down. The 4 hosts were then rebooted. Activating these hosts in the RHEV-M GUI took much longer than usual, and at one point one of the hosts became non-operational again. Eventually this host became available again, after which all hosts were up and working.
- On two of the hosts the /var/log/messages file contained several instances of "hung task detection", all of them related to sanlock:
    Apr 16 16:33:09 Host-3 kernel: INFO: task sanlock:4362 blocked for more than 120 seconds.
    Apr 16 16:33:09 Host-3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Apr 16 16:33:09 Host-3 kernel: sanlock D 000000000000000c 0 4362 1 0x00000080
    Apr 16 16:33:09 Host-3 kernel: ffff8817c1e25eb8 0000000000000086 0000000000000000 ffff8818688cf378
    Apr 16 16:33:09 Host-3 kernel: 000000000000000e ffffea00a71ca6b8 ffff8817c1e25e78 ffffffff8114ba24
    Apr 16 16:33:09 Host-3 kernel: ffff8817c1dd1098 ffff8817c1e25fd8 000000000000fb88 ffff8817c1dd1098
    Apr 16 16:33:09 Host-3 kernel: Call Trace:
    Apr 16 16:33:09 Host-3 kernel: [<ffffffff8114ba24>] ? free_pages_and_swap_cache+0xb4/0xe0
    Apr 16 16:33:09 Host-3 kernel: [<ffffffff814ea3e3>] io_schedule+0x73/0xc0
    Apr 16 16:33:09 Host-3 kernel: [<ffffffff811be032>] wait_for_all_aios+0xd2/0x110
    Apr 16 16:33:09 Host-3 kernel: [<ffffffff8105fa40>] ? default_wake_function+0x0/0x20
    Apr 16 16:33:09 Host-3 kernel: [<ffffffff811befa7>] io_destroy+0x87/0xe0
    Apr 16 16:33:09 Host-3 kernel: [<ffffffff811bf01b>] sys_io_destroy+0x1b/0x60
    Apr 16 16:33:09 Host-3 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
- These were preceded by NFS timeouts such as:
    Apr 16 16:30:25 Host-3 kernel: nfs: server Host-5.x.x.x not responding, timed out
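
To check whether other hosts show the same pattern, the two log signatures quoted above can be correlated with a short script. The following is a minimal sketch, not a supported Red Hat tool: the function name `correlate`, the 300-second look-back window, and the placeholder year (syslog timestamps carry no year) are all assumptions made for illustration, and the month parsing assumes an English locale.

```python
import re
from datetime import datetime, timedelta

# Patterns modelled on the two log excerpts quoted in this article.
STAMP = re.compile(r"^(?P<stamp>[A-Z][a-z]{2}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) ")
HUNG_TASK = re.compile(
    r"kernel: INFO: task (?P<task>[^\s:]+):(?P<pid>\d+) blocked for more than \d+ seconds"
)
NFS_TIMEOUT = re.compile(r"kernel: nfs: server (?P<server>\S+) not responding")


def correlate(lines, window_seconds=300, year=2000):
    """Pair each hung-task message with NFS timeouts logged on the same
    host during the preceding window (window and year are assumptions)."""
    window = timedelta(seconds=window_seconds)
    timeouts = []  # (host, time, nfs_server) tuples seen so far
    findings = []
    for line in lines:
        stamp = STAMP.match(line)
        if not stamp:
            continue
        # syslog omits the year, so a placeholder year is supplied.
        when = datetime.strptime(f"{year} {stamp.group('stamp')}", "%Y %b %d %H:%M:%S")
        host = stamp.group("host")
        nfs = NFS_TIMEOUT.search(line)
        if nfs:
            timeouts.append((host, when, nfs.group("server")))
            continue
        hung = HUNG_TASK.search(line)
        if hung:
            servers = sorted(
                {s for h, t, s in timeouts
                 if h == host and timedelta(0) <= when - t <= window}
            )
            findings.append({
                "host": host,
                "task": hung.group("task"),
                "pid": int(hung.group("pid")),
                "time": when,
                "nfs_servers": servers,
            })
    return findings


# Sample input: the two signature lines quoted in this article.
sample = [
    "Apr 16 16:30:25 Host-3 kernel: nfs: server Host-5.x.x.x not responding, timed out",
    "Apr 16 16:33:09 Host-3 kernel: INFO: task sanlock:4362 blocked for more than 120 seconds.",
]
for finding in correlate(sample):
    print(finding["host"], finding["task"], finding["pid"], finding["nfs_servers"])
```

On a host, the same function could be given the lines of /var/log/messages instead of the sample list. A hung sanlock task with an NFS timeout shortly before it would match the sequence described in this article, although the match alone does not establish the cause.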
Environment
- Red Hat Enterprise Virtualization (RHEV) 3.1, 3.2
- Red Hat Enterprise Linux (RHEL) 6.3 hosts, with:
    - 2.6.32-279.19.1 kernel
    - vdsm-4.9.6-44.3, vdsm-4.9.6-45.2
    - libvirt-0.9.10-21.el6_3.7, libvirt-0.9.10-21.el6_3.8
