Cluster fails to unfence a node or start scsi_reserve when using shared devices with partitions in a RHEL High Availability cluster with fence_scsi

Issue

  • When starting cman, the service fails at "Unfencing self", and the logs show fence_node: unfence failed
Jul 28 12:09:11 node1 fence_node[31482]: unfence node1.example.com failed
  • When attempting to start scsi_reserve on a cluster node, I see errors
Active clustered Logical Volumes: /dev/vg_ha/lvol1 /dev/vg_gfs/lvol1  /dev/vg_gfs/lvol2 /dev/vg_gfs/lvol3
persistent reservation in: pass through os error: Inappropriate ioctl for device
PR in: command failed
persistent reservation in: pass through os error: Inappropriate ioctl for device
PR in: command failed
No registered devices found.
  • I get reservation errors in /var/log/messages when starting a cluster
May  3 13:19:37 node1 scsi_reserve: [info] registered with device /dev/sdc (key=0x1a740001)
May  3 13:19:37 node1 scsi_reserve: [info] registered with device /dev/sdd (key=0x1a740001)
May  3 13:19:37 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-10 (key=0x1a740001)
May  3 13:19:37 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-12 (key=0x1a740001)
May  3 13:19:37 node1 scsi_reserve: [info] 2 errors during registration
May  3 13:19:37 node1 scsi_reserve: [info] leaving the fence domain
May  3 13:21:25 node1 scsi_reserve: [info] unable to remove registration on /dev/sdc (key=0x1a740001)
May  3 13:21:25 node1 scsi_reserve: [info] unable to remove registration on /dev/sdd (key=0x1a740001)
May  3 13:21:31 node1 scsi_reserve: [info] registered with device /dev/sdc (key=0x1a740001)
May  3 13:21:31 node1 scsi_reserve: [info] registered with device /dev/sdd (key=0x1a740001)
May  3 13:21:31 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-10 (key=0x1a740001)
May  3 13:21:31 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-12 (key=0x1a740001)
May  3 13:21:31 node1 scsi_reserve: [info] 2 errors during registration
May  3 13:21:31 node1 scsi_reserve: [info] leaving the fence domain
May  3 13:38:32 node1 scsi_reserve: [info] removed registration on /dev/sdc (key=0x1a740001)
  • Why was my cluster able to successfully use fence_scsi with partitioned devices in RHEL 5.6 and earlier, but no longer can in later releases?
  • When the secondary node must fence the primary, the fence operation returns success, but it does not actually work, and no real fencing or failover takes place.
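
When triaging log excerpts like the ones above, it can help to pull out just the devices that scsi_reserve reported errors for. The following is a minimal Python sketch, not part of any Red Hat tooling. It assumes the syslog line format shown in this article, and the function name failed_devices is hypothetical.

```python
import re

# Matches scsi_reserve syslog lines in the format shown in this article, e.g.:
#   May  3 13:19:37 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-10 (key=0x1a740001)
# Assumption: other releases may format these messages differently.
LINE_RE = re.compile(
    r"scsi_reserve: \[(?P<level>\w+)\] .*?(?P<device>/dev/\S+) \(key="
)


def failed_devices(log_lines):
    """Return a sorted list of devices named in scsi_reserve [error] lines."""
    failed = set()
    for line in log_lines:
        match = LINE_RE.search(line)
        if match and match.group("level") == "error":
            failed.add(match.group("device"))
    return sorted(failed)


if __name__ == "__main__":
    sample = [
        "May  3 13:19:37 node1 scsi_reserve: [info] registered with device /dev/sdc (key=0x1a740001)",
        "May  3 13:19:37 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-10 (key=0x1a740001)",
        "May  3 13:19:37 node1 scsi_reserve: [error] unable to create reservation on /dev/dm-12 (key=0x1a740001)",
    ]
    for device in failed_devices(sample):
        print(device)
```

Only lines logged at the [error] level are counted, so the [info] "unable to remove registration" messages in the excerpt above are ignored by this sketch.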

Environment

  • Red Hat Enterprise Linux (RHEL) 5, 6, or 7 with the High Availability Add-On
  • Cluster configured with SCSI Persistent Reservation Fencing (fence_scsi)
  • Shared devices with partitions
    • The fence_scsi fence/stonith device either:
      • Is configured with a devices attribute that includes partitions in the list, or
      • Has no devices attribute configured, and the cluster contains shared, clustered volume groups with one or more PVs that reside on a partition instead of an entire device
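
To check whether a given environment matches the conditions above, one approach is to test each shared device for the sysfs "partition" attribute, which the Linux kernel exposes for partitions but not for whole disks. The Python sketch below is illustrative only. It assumes the /sys/class/block layout found on newer kernels, which may not be present on older releases such as RHEL 5, and the function names and the example device list are hypothetical. It does not detect device-mapper devices (such as /dev/dm-10) that are layered on top of partitions.

```python
import os


def is_partition(device, sysfs_root="/sys/class/block"):
    """Return True if sysfs marks the block device as a partition.

    Assumption: partitions have a 'partition' attribute file under
    <sysfs_root>/<name>/, while whole disks do not.
    """
    # Resolve symlinks such as /dev/disk/by-id/... to the kernel device name.
    name = os.path.basename(os.path.realpath(device))
    return os.path.exists(os.path.join(sysfs_root, name, "partition"))


def partitions_in(devices, sysfs_root="/sys/class/block"):
    """Return the subset of devices that sysfs identifies as partitions."""
    return [dev for dev in devices if is_partition(dev, sysfs_root)]


if __name__ == "__main__":
    # Hypothetical device list; substitute the devices used by fence_scsi
    # or the PVs of the clustered volume groups.
    for dev in partitions_in(["/dev/sdc", "/dev/sdc1", "/dev/sdd"]):
        print(dev, "is a partition")
```

The sysfs_root parameter exists only so the check can be exercised against a test directory; on a real node the default path would normally be used.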
