fence_sbd using multiple poison pill devices fails on all nodes when one device is removed
Issue
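- sbd detects that /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179 cannot be opened and logs that it will try other configured devices, but the fence_sbd monitor reports the device as not initialized and the sbd resource fails to start: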
Dec 6 15:41:05 node42 pacemaker-schedulerd[986347]: notice: * Start sbd ( node42 )
[....]
Dec 6 15:41:07 node42 sbd[986453]: warning: open_device: Opening device /dev/disk/by-id/../../sdb failed.
Dec 6 15:41:07 node42 /fence_sbd[986448]: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized
Dec 6 15:41:07 node42 pacemaker-fenced[986344]: notice: fence_sbd_monitor_1[986448] error output [ 2021-12-06 15:41:07,765 ERROR: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized ]
Dec 6 15:41:07 node42 pacemaker-fenced[986344]: warning: fence_sbd[986448] stderr: [ 2021-12-06 15:41:07,765 ERROR: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized ]
Dec 6 15:41:08 node42 sbd[986467]: warning: open_device: Opening device /dev/disk/by-id/../../sdb failed.
Dec 6 15:41:08 node42 /fence_sbd[986461]: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized
Dec 6 15:41:08 node42 pacemaker-fenced[986344]: notice: fence_sbd_monitor_2[986461] error output [ 2021-12-06 15:41:08,866 ERROR: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized ]
Dec 6 15:41:08 node42 pacemaker-fenced[986344]: warning: fence_sbd[986461] stderr: [ 2021-12-06 15:41:08,866 ERROR: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized ]
Dec 6 15:41:08 node42 pacemaker-fenced[986344]: notice: Operation 'monitor' [986461] for device 'sbd' returned: -201 (Generic Pacemaker error)
Dec 6 15:41:08 node42 pacemaker-controld[986348]: notice: Result of start operation for sbd on node42: error
Dec 6 15:41:08 node42 pacemaker-controld[986348]: notice: Transition 0 aborted by operation sbd_start_0 'modify' on node42: Event failed
Dec 6 15:41:08 node42 pacemaker-controld[986348]: notice: Transition 0 action 18 (sbd_start_0 on node42): expected 'ok' but got 'error'
Dec 6 15:41:08 node42 pacemaker-attrd[986346]: notice: Setting fail-count-sbd#start_0[node42]: (unset) -> INFINITY
Dec 6 15:41:08 node42 pacemaker-attrd[986346]: notice: Setting last-failure-sbd#start_0[node42]: (unset) -> 1638801668
Dec 6 15:41:08 node42 pacemaker-controld[986348]: notice: Transition 0 aborted by transient_attributes.1 'create': Transient attribute change
Dec 6 15:41:09 node42 sbd[986474]: /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179: warning: open_device: Opening device /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179 failed.
Dec 6 15:41:09 node42 sbd[986317]: warning: cleanup_servant_by_pid: Servant for /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179 (pid: 986474) has terminated
Dec 6 15:41:09 node42 pacemaker-controld[986348]: notice: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=1, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-100.bz2): Stopped
Dec 6 15:41:09 node42 pacemaker-schedulerd[986347]: warning: Unexpected result (error) was recorded for start of sbd on node42 at Dec 6 15:41:07 2021
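When several poison pill devices are configured, it can help to confirm which device the fence_sbd errors above actually name. The following is a minimal sketch, not part of any Red Hat tooling: the helper name uninitialized_devices is hypothetical, and it assumes the message format matches the log excerpt above ("<device>" is not initialized).

```python
import re

# Matches the fence_sbd error text seen in the log excerpt, for example:
#   ERROR: "/dev/disk/by-id/scsi-..." is not initialized
# Assumption: the device path is double-quoted and starts with /dev/.
NOT_INITIALIZED = re.compile(r'"(?P<device>/dev/[^"]+)" is not initialized')


def uninitialized_devices(log_lines):
    """Return device paths reported as not initialized, in first-seen order.

    Hypothetical helper for illustration; it only parses log text and
    does not query sbd or the devices themselves.
    """
    seen = []
    for line in log_lines:
        match = NOT_INITIALIZED.search(line)
        if match and match.group("device") not in seen:
            seen.append(match.group("device"))
    return seen
```

The function accepts any iterable of lines, so an open log file (for example /var/log/messages, if that is where syslog writes on the node) can be passed directly. Each affected device is listed once, even though fence_sbd and pacemaker-fenced both repeat the same error.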
Environment
- Red Hat Enterprise Linux Server 7 and 8 (with the High Availability Add-On)