Pacemaker disk exclusive control: SFEX or fence_scsi on VMware
Hello there,
I would like to hear your thoughts or experiences on which disk exclusive control method to use: SFEX or fence_scsi (SPC-3 persistent reservations).
We are planning to install RHEL 7.4 HA Add-On clusters with Pacemaker on several VMware vSphere 6.5 hosts. Each HA cluster consists of two VM nodes, one active and one standby, running on two different hosts with shared LVM.
For disk exclusive control and fencing with Pacemaker, our IT vendor is proposing SFEX (Shared Disk File EXclusiveness) together with fence_vmware_soap (to reset the failing node via vCenter).
However, I am concerned about the case where an ESX host hangs for over a minute, for example due to intermittent hardware failures, so that fence_vmware_soap does not work. Suppose the standby node is then forced to take over the disk resources with SFEX. If the hung node eventually comes back, the I/Os that were queued before the ESX host hung would be flushed to the disk and could corrupt the resources taken over via SFEX, because there is no SCSI persistent reservation to reject them.
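To make the concern concrete, here is a toy Python model of the scenario. It is only a sketch of my reasoning, not how SFEX or fence_scsi are actually implemented: the SharedDisk class, the simulate function and the node keys are all hypothetical names, and I am assuming that a cooperative on-disk lock changes nothing on the storage side, while a persistent reservation lets the storage reject writes from a node whose key has been removed.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDisk:
    """Toy shared LUN (hypothetical, for illustration only)."""
    enforce_reservation: bool  # True models storage-side reservation enforcement
    registered_keys: set = field(default_factory=set)
    blocks: dict = field(default_factory=dict)

    def write(self, node_key: str, block: int, data: str) -> bool:
        # With enforcement on, writes from unregistered keys are rejected.
        if self.enforce_reservation and node_key not in self.registered_keys:
            return False
        self.blocks[block] = data
        return True


def simulate(enforce_reservation: bool) -> dict:
    disk = SharedDisk(enforce_reservation=enforce_reservation)
    disk.registered_keys = {"node1", "node2"}

    # node1 queues a write, then its ESX host hangs before it reaches the disk.
    queued_by_node1 = [(0, "node1-stale")]

    # node2 takes over. In the reservation model, fencing removes node1's key;
    # in the cooperative-lock model, nothing changes on the storage side.
    if enforce_reservation:
        disk.registered_keys.discard("node1")
    disk.write("node2", 0, "node2-fresh")

    # The host recovers and the queued write is flushed.
    for block, data in queued_by_node1:
        disk.write("node1", block, data)
    return disk.blocks


print("cooperative lock only:", simulate(False))
print("storage-enforced reservation:", simulate(True))
```

In this model the stale write overwrites node2's data when only a cooperative lock is used, and is rejected when the storage enforces the reservation. Whether real hardware behaves this way is exactly what I would like to confirm.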
So, I think we should use fence_scsi if the disk storage is SPC-3 compliant and persistent reservation capable. Am I right?
Or is SFEX still a valid choice these days?
Is fence_scsi proven enough in production environments on RHEL 7.x?
Thank you for any responses, Satoshi
