Cluster fails to unfence a node or start scsi_reserve when using shared devices with partitions in a RHEL High Availability cluster with fence_scsi
Issue
- When starting cman, the service fails at "Unfencing self", and the logs show fence_node: unfence failed
Jul 28 12:09:11 node1 fence_node[31482]: unfence node1.example.com failed
- When attempting to start scsi_reserve on a cluster node, I see errors:
Active clustered Logical Volumes: /dev/vg_ha/lvol1 /dev/vg_gfs/lvol1 /dev/vg_gfs/lvol2 /dev/vg_gfs/lvol3
persistent reservation in: pass through os error: Inappropriate ioctl for device
PR in: command failed
persistent reservation in: pass through os error: Inappropriate ioctl for device
PR in: command failed
No registered devices found.
- I get reservation errors in /var/log/messages when starting a cluster:
May 3 13:19:37 node1 scsi_reserve: [info] registered with device /dev/sdc (key=0x1a740001)
May 3 13:19:37 node1 scsi_reserve: [info] registered with device /dev/sdd (key=0x1a740001)
May 3 13:19:37 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-10 (key=0x1a740001)
May 3 13:19:37 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-12 (key=0x1a740001)
May 3 13:19:37 node1 scsi_reserve: [info] 2 errors during registration
May 3 13:19:37 node1 scsi_reserve: [info] leaving the fence domain
May 3 13:21:25 node1 scsi_reserve: [info] unable to remove registration on /dev/sdc (key=0x1a740001)
May 3 13:21:25 node1 scsi_reserve: [info] unable to remove registration on /dev/sdd (key=0x1a740001)
May 3 13:21:31 node1 scsi_reserve: [info] registered with device /dev/sdc (key=0x1a740001)
May 3 13:21:31 node1 scsi_reserve: [info] registered with device /dev/sdd (key=0x1a740001)
May 3 13:21:31 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-10 (key=0x1a740001)
May 3 13:21:31 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-12 (key=0x1a740001)
May 3 13:21:31 node1 scsi_reserve: [info] 2 errors during registration
May 3 13:21:31 node1 scsi_reserve: [info] leaving the fence domain
May 3 13:38:32 node1 scsi_reserve: [info] removed registration on /dev/sdc (key=0x1a740001)
- Why was my cluster able to successfully use fence_scsi with partitioned devices in RHEL 5.6 and earlier, but no longer can in later releases?
- When the secondary node must fence the primary, the fence operation reports success, but it has not actually worked, and no real fencing or failover takes place.
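When working through a long /var/log/messages, it can help to reduce the scsi_reserve errors to the list of affected devices. The following is a minimal sketch, not part of the cluster tooling: the function name failed_reservation_devices is hypothetical, and it assumes the log lines have the same format as the excerpt above.

```shell
# Hypothetical helper: print each device for which scsi_reserve logged
# "unable to create reservation", once per device.
# Reads the files given as arguments, or standard input if none are given.
failed_reservation_devices() {
    sed -n 's/.*scsi_reserve: \[error\] unable to create reservation on \([^ ]*\).*/\1/p' "$@" \
        | sort -u
}
```

It could be run as failed_reservation_devices /var/log/messages. The device names it prints are whatever scsi_reserve logged, so the output alone does not show whether a device is a partition.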
Environment
- Red Hat Enterprise Linux (RHEL) 5, 6, or 7 with the High Availability Add-On
- Cluster configured with SCSI Persistent Reservation Fencing (fence_scsi)
- Shared devices with partitions
- The fence_scsi fence/stonith device is either:
  - Configured with a devices attribute that includes partitions in the list, or
  - Does not have a devices attribute configured and there are shared, clustered volume groups in this cluster that contain one or more PVs that reside on a partition instead of an entire device
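One way to check whether any PVs reside on a partition is to look for the "partition" attribute that newer kernels expose under /sys/class/block. The following is a rough sketch under several assumptions:

- The helper names is_partition and partitions_only and the SYSFS_ROOT variable are hypothetical.
- The sysfs layout is that of RHEL 6 and 7; RHEL 5 differs.
- Partition mappings created on top of device-mapper devices (for example multipath) do not carry this attribute and will not be detected.

```shell
# Hypothetical helper: succeed if the named block device is a partition,
# judged by the presence of the sysfs "partition" attribute.
# SYSFS_ROOT exists only so the check can be pointed at a test tree;
# it defaults to /sys.
is_partition() {
    # Resolve symlinks such as /dev/disk/by-id/...; fall back to the input.
    resolved=$(readlink -f "$1" 2>/dev/null) || resolved=$1
    name=$(basename "${resolved:-$1}")
    [ -e "${SYSFS_ROOT:-/sys}/class/block/${name}/partition" ]
}

# Hypothetical helper: read device paths (one per line) on standard input
# and print only those that are partitions.
partitions_only() {
    while read -r dev; do
        [ -n "$dev" ] || continue
        if is_partition "$dev"; then
            echo "$dev"
        fi
    done
}
```

For example, pvs --noheadings -o pv_name | partitions_only would list the PVs that this check considers partitions. An empty result is not proof that no partitions are in use, given the device-mapper limitation noted above.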