Node evicted after qdiskd on node 1 reports hung read() on Red Hat Enterprise Linux 6
Issue
- The cluster crashed unexpectedly. A node was evicted by qdiskd.
Environment
- Red Hat Enterprise Linux Server 6 (with the High Availability or Resilient Storage Add Ons)
- Red Hat High Availability Cluster with 2 nodes and a quorum disk
- Quorum disk reports a period of long cycles leading up to node eviction.
- Storage errors occur during this period.
- Multiple clusters are affected by this issue at the same time.
- HP OPEN-V SAN, configured with device-mapper-multipath:
# cat /proc/scsi/scsi
...
Host: scsi3 Channel: 00 Id: 00 Lun: 00
Vendor: HP Model: OPEN-V Rev: 6008
Type: Direct-Access ANSI SCSI revision: 03
# multipath -ll
...
mpathe (360060e8005bf80000000bf8000003d0f) dm-0 HP,OPEN-V
size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
|- 3:0:0:3 sde 8:64 active ready running
`- 4:0:0:3 sdi 8:128 active ready running
...
- Observed on cman version cman-3.0.12.1-49.el6.x86_64
- It is currently unknown if other versions are affected.
- The symptoms may differ with versions of cman earlier than cman-3.0.12.1-32.el6 on RHEL 6.
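To tell which side of the cman-3.0.12.1-32.el6 boundary a node is on, the installed version-release string can be compared against that threshold. The following is a minimal sketch, not a Red Hat supplied tool: `ver_lt` is a hypothetical helper name, and it relies on GNU `sort -V`, which approximates RPM's version ordering but is not guaranteed to match it in every case. The `installed` value is the example version from this article. On a live node it could be read with `rpm -q --qf '%{VERSION}-%{RELEASE}\n' cman`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if version-release $1 sorts
# strictly before $2 under GNU sort -V. This approximates, but is not
# identical to, RPM's own version comparison.
ver_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example value taken from this article. On a live node, assume:
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' cman)
installed="3.0.12.1-49.el6"
threshold="3.0.12.1-32.el6"

if ver_lt "$installed" "$threshold"; then
    echo "cman $installed is earlier than $threshold: symptoms may differ"
else
    echo "cman $installed is at or later than $threshold"
fi
```

Only the version and release fields are compared. The architecture suffix (`.x86_64`) is deliberately left out of both strings.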