How to manually replace a failing Ceph OSD disk that is being used as a block.db device on a Red Hat OpenStack cluster
Issue
- In an environment running a deployed RHOSP 16.1 overcloud with containerized Ceph 4.2, where Red Hat Satellite provides all of the cluster's repositories and container images, how can a failing device that serves as a BlueStore `block.db` device, together with all of the OSDs that depend on it, be replaced manually?
- The official RHOSP documentation on replacing a failed disk in a containerized Ceph cluster deployed through Director does not explain how to do this when one or more dedicated devices (for example, NVMe devices listed as `dedicated_devices` for a set of OSD data disks in the `node-spec-overrides.yml` file) are used as `block.db` devices. An illustrative layout of this kind is sketched below.
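For reference, a non-collocated layout of this kind is typically expressed in the node-specific overrides file through the ceph-ansible `devices` and `dedicated_devices` lists, keyed by the node's system UUID under `NodeDataLookup`. The snippet below is a minimal sketch only; the UUID and device paths are placeholders and must be adjusted to match the actual nodes.

```yaml
# node-spec-overrides.yml -- illustrative sketch; UUID and device paths are placeholders
parameter_defaults:
  NodeDataLookup:
    # System UUID of the Ceph Storage node (e.g. from `dmidecode -s system-uuid`)
    00000000-0000-0000-0000-000000000000:
      devices:              # OSD data disks (BlueStore block)
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
      dedicated_devices:    # device hosting block.db, one entry per OSD data disk above
        - /dev/nvme0n1
        - /dev/nvme0n1
        - /dev/nvme0n1
```

In this layout a single NVMe device carries the `block.db` partitions for several OSDs, which is why its failure affects all of those OSDs at once rather than a single one.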
Environment
- Red Hat OpenStack Platform (RHOSP) 16.1
- Red Hat Ceph Storage (RHCS) 4.2
- Red Hat Satellite 6.8 (onwards)