fence_sbd using multiple poison pill devices fails on all nodes when one device is removed

Solution In Progress

Issue

  • sbd detects that /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179 cannot be opened and logs that it will try other configured devices:
Dec  6 15:41:05 node42 pacemaker-schedulerd[986347]: notice:  * Start      sbd               ( node42 )
[....]
Dec  6 15:41:07 node42 sbd[986453]: warning: open_device: Opening device /dev/disk/by-id/../../sdb failed.
Dec  6 15:41:07 node42 /fence_sbd[986448]: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized
Dec  6 15:41:07 node42 pacemaker-fenced[986344]: notice: fence_sbd_monitor_1[986448] error output [ 2021-12-06 15:41:07,765 ERROR: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized ]
Dec  6 15:41:07 node42 pacemaker-fenced[986344]: warning: fence_sbd[986448] stderr: [ 2021-12-06 15:41:07,765 ERROR: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized ]
Dec  6 15:41:08 node42 sbd[986467]: warning: open_device: Opening device /dev/disk/by-id/../../sdb failed.
Dec  6 15:41:08 node42 /fence_sbd[986461]: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized
Dec  6 15:41:08 node42 pacemaker-fenced[986344]: notice: fence_sbd_monitor_2[986461] error output [ 2021-12-06 15:41:08,866 ERROR: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized ]
Dec  6 15:41:08 node42 pacemaker-fenced[986344]: warning: fence_sbd[986461] stderr: [ 2021-12-06 15:41:08,866 ERROR: "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" is not initialized ]
Dec  6 15:41:08 node42 pacemaker-fenced[986344]: notice: Operation 'monitor' [986461] for device 'sbd' returned: -201 (Generic Pacemaker error)
Dec  6 15:41:08 node42 pacemaker-controld[986348]: notice: Result of start operation for sbd on node42: error
Dec  6 15:41:08 node42 pacemaker-controld[986348]: notice: Transition 0 aborted by operation sbd_start_0 'modify' on node42: Event failed
Dec  6 15:41:08 node42 pacemaker-controld[986348]: notice: Transition 0 action 18 (sbd_start_0 on node42): expected 'ok' but got 'error'
Dec  6 15:41:08 node42 pacemaker-attrd[986346]: notice: Setting fail-count-sbd#start_0[node42]: (unset) -> INFINITY
Dec  6 15:41:08 node42 pacemaker-attrd[986346]: notice: Setting last-failure-sbd#start_0[node42]: (unset) -> 1638801668
Dec  6 15:41:08 node42 pacemaker-controld[986348]: notice: Transition 0 aborted by transient_attributes.1 'create': Transient attribute change
Dec  6 15:41:09 node42 sbd[986474]: /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179:  warning: open_device: Opening device /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179 failed.
Dec  6 15:41:09 node42 sbd[986317]: warning: cleanup_servant_by_pid: Servant for /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179 (pid: 986474) has terminated
Dec  6 15:41:09 node42 pacemaker-controld[986348]: notice: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=1, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-100.bz2): Stopped
Dec  6 15:41:09 node42 pacemaker-schedulerd[986347]: warning: Unexpected result (error) was recorded for start of sbd on node42 at Dec  6 15:41:07 2021
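
When triaging a log like the one above, it can help to list which device paths fence_sbd reports as uninitialized. The following is a minimal Python sketch, not part of sbd or fence_sbd: the function name `uninitialized_devices` and the regular expression are assumptions made for illustration, written against the message format shown in this excerpt only.

```python
import re

# Assumed pattern: matches the quoted device path in messages of the form
#   "<device>" is not initialized
# as they appear in the fence_sbd lines of the excerpt above.
NOT_INITIALIZED = re.compile(r'"(?P<device>/dev/[^"]+)" is not initialized')


def uninitialized_devices(log_lines):
    """Return distinct device paths reported as not initialized, in first-seen order."""
    devices = []
    for line in log_lines:
        match = NOT_INITIALIZED.search(line)
        if match and match.group("device") not in devices:
            devices.append(match.group("device"))
    return devices


# Two lines copied from the excerpt above, used as sample input.
sample = [
    'Dec  6 15:41:07 node42 /fence_sbd[986448]: '
    '"/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" '
    'is not initialized',
    'Dec  6 15:41:07 node42 pacemaker-fenced[986344]: warning: fence_sbd[986448] stderr: '
    '[ 2021-12-06 15:41:07,765 ERROR: '
    '"/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_71345cdd-ae1d-4c92-b164-dfc9a1f93179" '
    'is not initialized ]',
]

for device in uninitialized_devices(sample):
    print(device)
```

In this excerpt every matching message names the same by-id path, so the sketch reports a single device. To scan a real log, pass the lines of the file (for example, an open file object) instead of `sample`.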

Environment

  • Red Hat Enterprise Linux Server 7 and 8 (with the High Availability Add-On)
