OCS/ODF OSD pod fails to start with _read_bdev_label failed to read from - (5) Input/output error because /dev/loop was used instead of the correct disk /dev/sd<x>

Solution Verified

Issue

  • During an OCP upgrade, after an OCS node restarts, the OSD pod running on that node fails to start

  • The OSD pod is not in a Ready state and is continuously restarting, as in the following example (a listing sketch follows the output):

rook-ceph-osd-0-xxxxxxxxx-xxxx                                 1/2    Running  4         2m48s 
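A quick way to spot the affected OSD is to list the OSD pods and look for a high restart count or a CrashLoopBackOff status. A minimal sketch, assuming the default openshift-storage namespace (seen in the pod status below) and the standard app=rook-ceph-osd label:

# List OSD pods; the failing one shows 1/2 Ready and a growing restart count
oc get pods -n openshift-storage -l app=rook-ceph-osd -o wide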
  • The OSD container fails to start with a failed to activate osd: exit status 1 error message, similar to the following example (an inspection sketch follows the status output):
  - containerID: cri-o://0bc5c5fa76528d1d1e6ec19a72492da3a35e817e616d9ebda8886e21672f99e5
    image: registry.redhat.io/rhceph/rhceph-5-rhel8@sha256:fc25524ccb0ea78526257778ab54bfb1a25772b75fcc97df98eb06a0e67e1bf6
    imageID: registry.redhat.io/rhceph/rhceph-5-rhel8@sha256:06255c43a5ccaec516969637a39d500a0354da26127779b5ee53dbe9c444339c
    lastState:
      terminated:
        containerID: cri-o://0bc5c5fa76528d1d1e6ec19a72492da3a35e817e616d9ebda8886e21672f99e5
        exitCode: 1
        finishedAt: '2023-05-09T07:09:09Z'
        message: 'failed to activate osd: exit status 1'   <-------
        reason: Error
        startedAt: '2023-05-09T07:09:08Z'
    name: osd
    ready: false
    restartCount: 4
    started: false
    state:
      waiting:
        message: back-off 1m20s restarting failed container=osd pod=rook-ceph-osd-0-xxxxxxxx-xxxx_openshift-storage(25f83a15-f2b1-4cd3-a59e-1e705ab13f7b)
        reason: CrashLoopBackOff
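The container status shown above can be pulled straight from the pod object; a minimal sketch, using the placeholder pod name from the example:

# Describe the pod to see the osd container's last termination message and
# the CrashLoopBackOff reason
oc describe pod rook-ceph-osd-0-xxxxxxxxx-xxxx -n openshift-storage

# Or dump the full pod object, including the containerStatuses section, as YAML
oc get pod rook-ceph-osd-0-xxxxxxxxx-xxxx -n openshift-storage -o yaml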
  • The OSD pod's logs show the error message failed to read label for /dev/ceph-bca9a2a2-4cd8-4cea-b815-d687465a511e/osd-block-c76680aa-012f-431f-bc79-7d0f1a525f38: (5) Input/output error, similar to the following example (a diagnostic sketch follows the log excerpt):
2023-05-09T07:09:08.650521788Z 2023-05-09 07:09:08.650379 I | rookcmd: starting Rook v4.10.12-0.abc959dfc624825fe30ac1bc627c216f27d70203 with arguments '/rook/rook ceph osd start -- --foreground --id 0 --fsid 7b9a88e6-efed-4069-816e-cc5c4178c298 --cluster ceph --setuser ceph --setgroup ceph --crush-location=root=default host=ocs-deviceset-0-0-hg66z rack=rack0 --log-to-stderr=true --err-to-stderr=true --mon-cluster-log-to-stderr=true --log-stderr-prefix=debug --default-log-to-file=false --default-mon-cluster-log-to-file=false --ms-learn-addr-from-peer=false'
2023-05-09T07:09:08.650521788Z 2023-05-09 07:09:08.650471 I | rookcmd: flag values: --block-path=/dev/ceph-bca9a2a2-4cd8-4cea-b815-d687465a511e/osd-block-c76680aa-012f-431f-bc79-7d0f1a525f38, --help=false, --log-level=INFO, --lv-backed-pv=true, --operator-image=, --osd-id=0, --osd-store-type=, --osd-uuid=c76680aa-012f-431f-bc79-7d0f1a525f38, --pvc-backed-osd=true, --service-account=
2023-05-09T07:09:08.650521788Z 2023-05-09 07:09:08.650475 I | op-mon: parsing mon endpoints: b=172.30.x.1:6789,c=172.30.x.2:6789,a=172.30.x.3:6789
2023-05-09T07:09:08.655060124Z 2023-05-09 07:09:08.655014 I | cephosd: Successfully updated lvm config file "/etc/lvm/lvm.conf"
2023-05-09T07:09:09.017702952Z 2023-05-09 07:09:09.017641 I | exec: Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
2023-05-09T07:09:09.017702952Z 2023-05-09 07:09:09.017669 I | exec: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
2023-05-09T07:09:09.017702952Z 2023-05-09 07:09:09.017675 I | exec: Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-bca9a2a2-4cd8-4cea-b815-d687465a511e/osd-block-c76680aa-012f-431f-bc79-7d0f1a525f38 --path /var/lib/ceph/osd/ceph-0 --no-mon-config
2023-05-09T07:09:09.017702952Z 2023-05-09 07:09:09.017680 I | exec:  stderr: failed to read label for /dev/ceph-bca9a2a2-4cd8-4cea-b815-d687465a511e/osd-block-c76680aa-012f-431f-bc79-7d0f1a525f38: (5) Input/output error
2023-05-09T07:09:09.017702952Z 2023-05-09 07:09:09.017683 I | exec:  stderr: 2023-05-09T07:09:09.014+0000 7fb46515e3c0 -1 bluestore(/dev/ceph-bca9a2a2-4cd8-4cea-b815-d687465a511e/osd-block-c76680aa-012f-431f-bc79-7d0f1a525f38) _read_bdev_label failed to read from /dev/ceph-bca9a2a2-4cd8-4cea-b815-d687465a511e/osd-block-c76680aa-012f-431f-bc79-7d0f1a525f38: (5) Input/output error
2023-05-09T07:09:09.018374703Z 2023-05-09 07:09:09.018351 I | exec: Traceback (most recent call last):
2023-05-09T07:09:09.018374703Z 2023-05-09 07:09:09.018366 I | exec:   File "/usr/sbin/ceph-volume", line 11, in <module>
2023-05-09T07:09:09.018374703Z 2023-05-09 07:09:09.018368 I | exec:     load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
2023-05-09T07:09:09.018374703Z 2023-05-09 07:09:09.018371 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 40, in __init__
2023-05-09T07:09:09.018393497Z 2023-05-09 07:09:09.018373 I | exec:     self.main(self.argv)
2023-05-09T07:09:09.018393497Z 2023-05-09 07:09:09.018375 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
2023-05-09T07:09:09.018393497Z 2023-05-09 07:09:09.018377 I | exec:     return f(*a, **kw)
2023-05-09T07:09:09.018393497Z 2023-05-09 07:09:09.018379 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 152, in main
2023-05-09T07:09:09.018393497Z 2023-05-09 07:09:09.018383 I | exec:     terminal.dispatch(self.mapper, subcommand_args)
2023-05-09T07:09:09.018393497Z 2023-05-09 07:09:09.018385 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
2023-05-09T07:09:09.018393497Z 2023-05-09 07:09:09.018387 I | exec:     instance.main()
2023-05-09T07:09:09.018429981Z 2023-05-09 07:09:09.018413 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 46, in main
2023-05-09T07:09:09.018429981Z 2023-05-09 07:09:09.018420 I | exec:     terminal.dispatch(self.mapper, self.argv)
2023-05-09T07:09:09.018429981Z 2023-05-09 07:09:09.018425 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
2023-05-09T07:09:09.018448608Z 2023-05-09 07:09:09.018428 I | exec:     instance.main()
2023-05-09T07:09:09.018448608Z 2023-05-09 07:09:09.018431 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 377, in main
2023-05-09T07:09:09.018448608Z 2023-05-09 07:09:09.018433 I | exec:     self.activate(args)
2023-05-09T07:09:09.018448608Z 2023-05-09 07:09:09.018435 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
2023-05-09T07:09:09.018448608Z 2023-05-09 07:09:09.018438 I | exec:     return func(*a, **kw)
2023-05-09T07:09:09.018448608Z 2023-05-09 07:09:09.018441 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 301, in activate
2023-05-09T07:09:09.018448608Z 2023-05-09 07:09:09.018443 I | exec:     activate_bluestore(lvs, args.no_systemd)
2023-05-09T07:09:09.018448608Z 2023-05-09 07:09:09.018445 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 196, in activate_bluestore
2023-05-09T07:09:09.018454947Z 2023-05-09 07:09:09.018447 I | exec:     process.run(prime_command)
2023-05-09T07:09:09.018454947Z 2023-05-09 07:09:09.018449 I | exec:   File "/usr/lib/python3.6/site-packages/ceph_volume/process.py", line 162, in run
2023-05-09T07:09:09.018454947Z 2023-05-09 07:09:09.018451 I | exec:     raise RuntimeError(msg)
2023-05-09T07:09:09.018459824Z 2023-05-09 07:09:09.018454 I | exec: RuntimeError: command returned non-zero exit status: 1
2023-05-09T07:09:09.049960399Z 2023-05-09 07:09:09.049812 C | rookcmd: failed to activate osd: exit status 1
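The Input/output error occurs while reading the BlueStore label from the backing block device, which suggests the problem is below Ceph, at the device layer. To collect the same log output and to check which device the OSD's logical volume is actually mapped to on the affected node (i.e. whether it resolved to a /dev/loop device instead of the expected /dev/sd<x> disk), a sketch along these lines can be used; the pod name and osd container name are taken from the examples above, and the node name is a placeholder:

# Pull the osd container's previous (crashed) log
oc logs rook-ceph-osd-0-xxxxxxxxx-xxxx -c osd -n openshift-storage --previous

# Open a debug shell on the node hosting the OSD (node name is a placeholder)
oc debug node/<node-name>
chroot /host

# Show the block device tree; the OSD logical volume should sit on the
# expected /dev/sd<x> disk, not on a /dev/loop device
lsblk -o NAME,TYPE,SIZE,MOUNTPOINT

# Cross-check the LVM mapping of the ceph-* volume group
pvs
lvs -o lv_name,vg_name,devices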

Environment

  • OpenShift Container Platform (OCP) 4.9 or higher
  • OpenShift Data Foundation (ODF) 4.9 or higher
