RHEL 5.4 cluster node logs "qdiskd: read (system call) has hung for X seconds" when I/O has not actually hung
Issue
- After upgrading our cluster to RHEL 5.4, we are seeing a significant number of the following messages in /var/log/messages:
  Sep 14 06:49:30 localhost qdiskd[687]: <warning> qdiskd: read (system call) has hung for 105 seconds
  Sep 14 06:49:30 localhost qdiskd[687]: <warning> In 105 more seconds, we will be evicted
- We continue to see these messages in /var/log/debug:
  Sep 13 04:27:32 localhost qdiskd[5843]: <debug> Node 3 missed an update (2/70)
  Sep 13 04:28:41 localhost qdiskd[5843]: <debug> Node 3 missed an update (2/70)
- Is it normal for a cluster to log this many qdisk warnings?
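To put a number on "this many", the two message shapes quoted above can be tallied with a short script. The following is only a minimal sketch: the regular expressions are derived solely from the sample lines in this article, the function name `summarize` and the result keys are hypothetical, and other qdiskd message variants are not handled.

```python
import re
from collections import Counter

# Patterns derived only from the sample log lines quoted in this article.
HUNG_RE = re.compile(
    r"qdiskd\[\d+\]: <warning> qdiskd: read \(system call\) "
    r"has hung for (\d+) seconds"
)
MISSED_RE = re.compile(
    r"qdiskd\[\d+\]: <debug> Node (\d+) missed an update"
)

# The four sample lines from the Issue section, one message per line.
SAMPLE = """\
Sep 14 06:49:30 localhost qdiskd[687]: <warning> qdiskd: read (system call) has hung for 105 seconds
Sep 14 06:49:30 localhost qdiskd[687]: <warning> In 105 more seconds, we will be evicted
Sep 13 04:27:32 localhost qdiskd[5843]: <debug> Node 3 missed an update (2/70)
Sep 13 04:28:41 localhost qdiskd[5843]: <debug> Node 3 missed an update (2/70)
"""


def summarize(text):
    """Count qdiskd 'hung' warnings and 'missed an update' messages in text.

    Hypothetical helper: returns a dict with the number of hung warnings,
    the longest reported hang in seconds, and missed updates per node ID.
    """
    hang_seconds = [int(m.group(1)) for m in HUNG_RE.finditer(text)]
    missed = Counter(int(m.group(1)) for m in MISSED_RE.finditer(text))
    return {
        "hung_warnings": len(hang_seconds),
        "longest_hang_seconds": max(hang_seconds, default=0),
        "missed_updates_by_node": dict(missed),
    }


if __name__ == "__main__":
    # To scan a real log, replace SAMPLE with the contents of
    # /var/log/messages or /var/log/debug.
    print(summarize(SAMPLE))
```

Because the patterns are matched with `finditer` over the whole text, the counts are the same whether each message sits on its own line or several messages were joined onto one line.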
Environment
- Red Hat Enterprise Linux (RHEL) 5 with the High Availability Add-On
- Configuration utilizing a quorum disk (<quorumd> in /etc/cluster/cluster.conf)
- cman prior to release 2.0.115-34.el5