Scrubbing a RAID Logical Volume

As of the Red Hat Enterprise Linux 6.5 release, LVM provides scrubbing support for RAID logical volumes. RAID scrubbing is the process of reading all the data and parity blocks in an array and checking to see whether they are coherent.
You initiate a RAID scrubbing operation with the --syncaction option of the lvchange command. You specify either a check or repair operation. A check operation goes over the array and records the number of discrepancies in the array but does not repair them. A repair operation corrects the discrepancies as it finds them.
The format of the command to scrub a RAID logical volume is as follows:
lvchange --syncaction {check|repair} vg/raid_lv


The lvchange --syncaction repair vg/raid_lv operation does not perform the same function as the lvconvert --repair vg/raid_lv operation. The lvchange --syncaction repair operation initiates a background synchronization operation on the array, while the lvconvert --repair operation is designed to repair/replace failed devices in a mirror or RAID logical volume.
In support of the new RAID scrubbing operation, the lvs command now supports two new printable fields: raid_sync_action and raid_mismatch_count. These fields are not printed by default. To display these fields you specify them with the -o parameter of the lvs, as follows.
lvs -o +raid_sync_action,raid_mismatch_count vg/lv
The raid_sync_action field displays the current synchronization operation that the raid volume is performing. It can be one of the following values:
  • idle: All sync operations complete (doing nothing)
  • resync: Initializing an array or recovering after a machine failure
  • recover: Replacing a device in the array
  • check: Looking for array inconsistencies
  • repair: Looking for and repairing inconsistencies
The raid_mismatch_count field displays the number of discrepancies found during a check operation.
The Cpy%Sync field of the lvs command now prints the progress of any of the raid_sync_action operations, including check and repair.
The lv_attr field of the lvs display now provides additional indicators in support of the RAID scrubbing operation. Bit 9 of this field displays the health of the logical volume, and it now supports the following indicators.
  • (m)ismatches indicates that there are discrepancies in a RAID logical volume. This character is shown after a scrubbing operation has detected that portions of the RAID are not coherent.
  • (r)efresh indicates that a device in a RAID array has suffered a failure and the kernel regards it as failed, even though LVM can read the device label and considers the device to be operational. The logical should be (r)efreshed to notify the kernel that the device is now available, or the device should be (r)eplaced if it is suspected of having failed.
For information on the lvs command, see Section 5.8.2, “Object Selection”.
When you perform a RAID scrubbing operation, the background I/O required by the sync operations can crowd out other I/O operations to LVM devices, such as updates to volume group metadata. This can cause the other LVM operations to slow down. You can control the rate at which the RAID logical volume is scrubbed by implementing recovery throttling.
You control the rate at which sync operations are performed by setting the minimum and maximum I/O rate for those operations with the --minrecoveryrate and --maxrecoveryrate options of the lvchange command. You specify these options as follows.
  • --maxrecoveryrate Rate[bBsSkKmMgG]
    Sets the maximum recovery rate for a RAID logical volume so that it will not crowd out nominal I/O operations. The Rate is specified as an amount per second for each device in the array. If no suffix is given, then kiB/sec/device is assumed. Setting the recovery rate to 0 means it will be unbounded.
  • --minrecoveryrate Rate[bBsSkKmMgG]
    Sets the minimum recovery rate for a RAID logical volume to ensure that I/O for sync operations achieves a minimum throughput, even when heavy nominal I/O is present. The Rate is specified as an amount per second for each device in the array. If no suffix is given, then kiB/sec/device is assumed.