Cluster node evicted by qdiskd in RHEL
Issue
- How can I diagnose why a cluster node was evicted by qdiskd?
- Is there a simple way to tell whether a node was evicted by QDisk due to a storage issue, network issue, or hard failure of a node?
- A node was evicted by the QDisk daemon:
qdiskd: <notice> Node 4 evicted
- A node reports in its logs that a read or write system call from qdiskd has hung, and it is then evicted:
qdiskd[687]: <warning> qdiskd: read (system call) has hung for 5 seconds
qdiskd[687]: <warning> In 5 more seconds, we will be evicted
- Which does the cluster detect first: a lost token (heartbeat) or lost contact with qdiskd (quorumd)?
- A node is evicted from the cluster when qdiskd misses an update
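As a starting point for triage, the qdiskd messages quoted above can be pulled out of a log with a short script. The following is a minimal sketch, not a supported tool: the function name scan_qdiskd_messages is hypothetical, the patterns are derived only from the sample lines shown in this article, and the "write" variant of the hung message is assumed to mirror the "read" one. Other qdiskd versions may word these messages differently.

```python
import re

# Patterns are based only on the sample messages quoted in this article.
# The "write" alternative is an assumption mirroring the "read" message.
EVICTED = re.compile(r"qdiskd(?:\[\d+\])?: <notice> Node (\d+) evicted")
HUNG = re.compile(
    r"qdiskd(?:\[\d+\])?: <warning> qdiskd: (read|write) "
    r"\(system call\) has hung for (\d+) seconds"
)
WARNING = re.compile(
    r"qdiskd(?:\[\d+\])?: <warning> In (\d+) more seconds, we will be evicted"
)


def scan_qdiskd_messages(lines):
    """Return (kind, detail) tuples for each recognized qdiskd log line."""
    events = []
    for line in lines:
        match = EVICTED.search(line)
        if match:
            # detail: ID of the node that was evicted
            events.append(("evicted", int(match.group(1))))
            continue
        match = HUNG.search(line)
        if match:
            # detail: (system call name, seconds it has been hung)
            events.append(("io_hung", (match.group(1), int(match.group(2)))))
            continue
        match = WARNING.search(line)
        if match:
            # detail: seconds remaining before this node expects eviction
            events.append(("eviction_warning", int(match.group(1))))
    return events
```

An "io_hung" event followed by an "evicted" event for the same node shows that qdiskd I/O stalled before the eviction, which is a reason to look at the storage path first. That reading is a heuristic, not a definitive diagnosis. The function accepts any iterable of lines, such as an open file handle on the system log.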
Environment
- Red Hat Cluster Suite (RHCS) 4
- Red Hat Enterprise Linux 5 with High Availability or Resilient Storage Add-on
- Red Hat Enterprise Linux 6 with High Availability or Resilient Storage Add-on
- Cluster configuration utilizing a quorum device ("quorumd" element is used in /etc/cluster/cluster.conf)
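To confirm that a cluster matches the last environment item, the configuration can be checked for a "quorumd" element. The sketch below is illustrative only: the function name quorumd_settings is hypothetical, and the sample XML, including its attribute values, is invented for the example and not taken from a real cluster. On a live node, the text would come from /etc/cluster/cluster.conf.

```python
import xml.etree.ElementTree as ET


def quorumd_settings(conf_text):
    """Return the attributes of the first <quorumd> element, or None if absent."""
    root = ET.fromstring(conf_text)
    # iter() walks the root element and all of its descendants.
    node = next(root.iter("quorumd"), None)
    return dict(node.attrib) if node is not None else None


# Hypothetical configuration fragment; the values are placeholders.
SAMPLE_CONF = """<cluster name="example" config_version="1">
  <quorumd label="example_qdisk" interval="1" tko="10" votes="1"/>
</cluster>"""

print(quorumd_settings(SAMPLE_CONF))
```

A return value of None means that no quorum device is configured, so the qdiskd eviction scenarios described here would not apply to that cluster.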