SCSI 'cluster reserved disks' are marked offline when an abort is triggered.
Issue
- SCSI disks are marked offline when the following sequence occurs:
1) The disks are clustered (shared between cluster nodes).
2) A cluster node reboots while another system 'owns' the disks through a SCSI reservation.
3) While the rebooting node scans for disks, a SCSI abort occurs, usually because of a dropped frame. The abort has to coincide with this scan, which is why the problem is intermittent.
4) The SCSI mid-layer is hard-coded to issue a Test Unit Ready (TUR) command when recovering from an abort.
5) The TUR handling is hard-coded to interpret a 'SCSI Reservation Error' as a failure.
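The sequence above can be sketched as a small toy model. This is only an illustration of the reasoning, not the kernel's actual code: the names `ScsiStatus`, `tur_succeeded`, and `recover_from_abort` are hypothetical, and the model collapses whatever recovery steps lie between the failed TUR and the offline state into a single step. The status values are the standard SCSI status codes.

```python
from enum import IntEnum


class ScsiStatus(IntEnum):
    """Standard SCSI status codes relevant to this scenario."""
    GOOD = 0x00
    CHECK_CONDITION = 0x02
    RESERVATION_CONFLICT = 0x18


def tur_succeeded(status):
    # Assumption mirroring item 5: only GOOD counts as success, so a
    # reservation conflict is treated the same as any other failure.
    return status == ScsiStatus.GOOD


def recover_from_abort(tur_status):
    """Hypothetical model of abort recovery (item 4).

    After an abort, a Test Unit Ready is issued; its status decides
    whether the disk stays usable or ends up offline.
    """
    if tur_succeeded(tur_status):
        return "running"
    return "offline"


if __name__ == "__main__":
    # Another node holds the reservation, so the TUR sent during
    # recovery comes back with RESERVATION CONFLICT.
    print(recover_from_abort(ScsiStatus.RESERVATION_CONFLICT))
    # Without a reservation held elsewhere, the TUR returns GOOD.
    print(recover_from_abort(ScsiStatus.GOOD))
```

In this model, the disk goes offline only when both conditions line up: an abort forces the recovery path, and the reservation held by the other node makes the TUR return a reservation conflict.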
Environment
- Red Hat Enterprise Linux 5