Temporary failure or unresponsiveness of an LVM mirror leg or log on a RHEL 4 cluster node blocks I/O indefinitely
Issue
- When a storage device backing a clustered mirrored logical volume fails or becomes temporarily unresponsive, all write I/O to that mirror blocks indefinitely, even after the failed device has been restored.
- On one node of a cluster using cmirror, access to some clvm devices gets stuck for some time and the system shows high iowait.
- dmesg shows dm-cmirror recovery errors:
dm-cmirror: Recovery halted due to error on 1ikx7Vmy
dm-cmirror: LOG INFO:
dm-cmirror: uuid: LVM-MhSYycuOZUvfeEUI244gXVzNADGDFH6ewQVGmghnjgafjrGcGYsYAt6O1ikx7Vmy
dm-cmirror: uuid_ref : 1
dm-cmirror: log type : disk
dm-cmirror: ?region_count: 409600
dm-cmirror: ?sync_count : 409600
dm-cmirror: ?sync_search : 0
dm-cmirror: in_sync : YES
dm-cmirror: suspended : NO
dm-cmirror: recovery_halted : YES
dm-cmirror: server_id : 1
dm-cmirror: server_valid: YES
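
A dump like the one above can be checked for mechanically. The following is a minimal sketch, not a supported tool: the function name `halted_cmirror_logs` is hypothetical, and the field layout is assumed to match the sample output shown here. It scans saved dmesg text and returns the UUID of every log whose recovery_halted field is YES.

```python
import re

# Matches "dm-cmirror: key : value" lines, tolerating the leading "?"
# that some fields carry in the sample dump.
FIELD_RE = re.compile(r"dm-cmirror:\s*\??\s*([A-Za-z_ ]+?)\s*:\s*(\S+)\s*$")


def halted_cmirror_logs(dmesg_text):
    """Return UUIDs of dm-cmirror logs that report recovery_halted : YES."""
    halted = []
    current_uuid = None
    for line in dmesg_text.splitlines():
        match = FIELD_RE.search(line)
        if not match:
            continue  # not a key/value line (e.g. "LOG INFO:")
        key, value = match.group(1), match.group(2)
        if key == "uuid":
            # A new LOG INFO dump starts with its uuid field.
            current_uuid = value
        elif key == "recovery_halted" and value == "YES" and current_uuid:
            halted.append(current_uuid)
    return halted
```

The text could come from `dmesg` output redirected to a file. An empty result only means no halted recovery was found in the text supplied, not that the mirror is healthy.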
Environment
- Red Hat Enterprise Linux (RHEL) 4 Update 7 or earlier
- Red Hat Cluster Suite 4
- Clustered volume group(s) with mirrored logical volumes
  - cmirror-kernel[-variant] prior to release 2.6.9-43.19.el4
  - lvm2-cluster
    - clvmd started
    - locking_type = 3 in /etc/lvm/lvm.conf
    - One or more volume groups with the clustered attribute set
  - cmirror
    - One or more mirrored logical volumes in a clustered volume group
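
The conditions listed above can be checked against text gathered from a node. The sketch below is illustrative only: all function names and the `THRESHOLD_RELEASE` constant are hypothetical, and the version comparison is a simplification that compares the numeric components only, not RPM's own version ordering.

```python
import re

# Release named in this article; earlier cmirror-kernel releases are in scope.
THRESHOLD_RELEASE = "2.6.9-43.19.el4"


def locking_type(lvm_conf_text):
    """Return the integer locking_type from lvm.conf text, or None if unset.

    Commented-out lines are skipped because they start with '#'.
    """
    match = re.search(r"^\s*locking_type\s*=\s*(\d+)", lvm_conf_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def vg_is_clustered(vg_attr):
    """True if the sixth vg_attr character is 'c' (clustered)."""
    return len(vg_attr) >= 6 and vg_attr[5] == "c"


def numeric_parts(version_release):
    """Simplified ordering key: every run of digits, as a tuple of ints."""
    return tuple(int(n) for n in re.findall(r"\d+", version_release))


def cmirror_kernel_in_scope(version_release):
    """True if the given version-release sorts before THRESHOLD_RELEASE."""
    return numeric_parts(version_release) < numeric_parts(THRESHOLD_RELEASE)
```

As inputs, the `vg_attr` string can be taken from `vgs -o vg_name,vg_attr`, and the version-release string from `rpm -q --qf '%{VERSION}-%{RELEASE}\n' cmirror-kernel`.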