Cluster service with fs resource not detecting storage failure for several minutes in RHEL 6
Issue
- When I fail all paths to one of my multipath devices for which there is an
fs resource in a running cluster service, it takes almost 3 minutes before the cluster service detects the failure:
Aug 13 11:57:49 node1 kernel: connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4296127569, last ping 4296132569, now 4296137569
Aug 13 11:57:49 node1 kernel: connection3:0: detected conn error (1011)
Aug 13 11:57:50 node1 iscsid: Kernel reported iSCSI connection 3:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)
Aug 13 11:57:54 node1 kernel: session2: session recovery timed out after 5 secs
Aug 13 11:57:54 node1 kernel: sd 9:0:0:0: [sdd] Unhandled error code
Aug 13 11:57:54 node1 kernel: sd 9:0:0:0: [sdd] Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
Aug 13 11:57:54 node1 kernel: sd 9:0:0:0: [sdd] Unhandled error code
[...]
Aug 13 12:00:27 node1 kernel: Aborting journal on device dm-2-8.
Aug 13 12:00:27 node1 kernel: Buffer I/O error on device dm-2, logical block 2012774400
Aug 13 12:00:27 node1 kernel: lost page write due to I/O error on dm-2
Aug 13 12:00:27 node1 kernel: JBD2: I/O error detected when updating journal superblock for dm-2-8.
Aug 13 12:00:32 node1 kernel: EXT4-fs error (device dm-2): ext4_journal_start_sb: Detected aborted journal
Aug 13 12:00:32 node1 kernel: EXT4-fs (dm-2): Remounting filesystem read-only
Aug 13 12:00:32 node1 rgmanager[29042]: [fs] fs:homefs: is_alive: failed write test on [/nfshome]. Return code: 1
Aug 13 12:00:32 node1 rgmanager[29064]: [fs] fs:homefs: Mount point is not accessible!
- Trying to figure out what can be done in the cluster.conf file to reduce the time it takes the cluster to detect that the storage has failed; it currently takes about 150 seconds
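The delay described above can be measured directly from the syslog timestamps in the excerpt. The following is a minimal sketch in Python, not part of any cluster tooling: the helper names `parse_syslog_time` and `seconds_between` are hypothetical, and because syslog lines do not record the year, an arbitrary placeholder year is assumed (both lines must fall in the same year, and month names are assumed to be in English).

```python
from datetime import datetime


def parse_syslog_time(line, year=2000):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    Syslog omits the year, so a placeholder year is assumed here.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def seconds_between(first_line, second_line):
    """Return the whole seconds elapsed between two syslog lines."""
    delta = parse_syslog_time(second_line) - parse_syslog_time(first_line)
    return int(delta.total_seconds())


# Lines copied from the log excerpt above.
path_failed = (
    "Aug 13 11:57:54 node1 kernel: session2: "
    "session recovery timed out after 5 secs"
)
journal_aborted = (
    "Aug 13 12:00:27 node1 kernel: Aborting journal on device dm-2-8."
)
fs_check_failed = (
    "Aug 13 12:00:32 node1 rgmanager[29042]: [fs] fs:homefs: "
    "is_alive: failed write test on [/nfshome]. Return code: 1"
)

# Time from the iSCSI session failing to the first filesystem error.
print(seconds_between(path_failed, journal_aborted))  # 153
# Time from the iSCSI session failing to rgmanager's failed write test.
print(seconds_between(path_failed, fs_check_failed))  # 158
```

In this excerpt, the first filesystem error appears 153 seconds after the session recovery timeout, and rgmanager reports the failed write test 5 seconds after that.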
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add On
- rgmanager or pacemaker
- One or more cluster services using an fs resource
