How to manually replace a failing Ceph OSD disk that is being used as a block.db device on a Red Hat OpenStack cluster
Issue
- In an environment running a deployed RHOSP 16.1 overcloud with containerized Ceph 4.2, where Red Hat Satellite provides all of the cluster's repositories and container images, how can a failing device that serves as a BlueStore `block.db` device, together with all of the OSDs that depend on it, be replaced manually?
- The official RHOSP documentation on replacing a failed disk in a containerized Ceph cluster deployed through Director does not explain how to do this when one or more dedicated devices (for example, NVMe devices listed as `dedicated_devices` for a set of OSD data disks in the `node-spec-overrides.yml` file) are used as `block.db` devices. An illustrative layout of this kind is sketched below.
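For reference, a non-collocated layout of this kind is typically expressed in the node-specific overrides file through the ceph-ansible `devices` and `dedicated_devices` lists, keyed by the node's system UUID under `NodeDataLookup`. The snippet below is a minimal sketch only; the UUID and device paths are placeholders and must be adjusted to match the actual nodes.

```yaml
# node-spec-overrides.yml -- illustrative sketch; UUID and device paths are placeholders
parameter_defaults:
  NodeDataLookup:
    # System UUID of the Ceph Storage node (e.g. from `dmidecode -s system-uuid`)
    00000000-0000-0000-0000-000000000000:
      devices:              # OSD data disks (BlueStore block)
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
      dedicated_devices:    # device hosting block.db, one entry per OSD data disk above
        - /dev/nvme0n1
        - /dev/nvme0n1
        - /dev/nvme0n1
```

In this layout a single NVMe device carries the `block.db` partitions for several OSDs, which is why its failure affects all of those OSDs at once rather than a single one.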
Environment
- Red Hat OpenStack Platform (RHOSP) 16.1
- Red Hat Ceph Storage (RHCS) 4.2
- Red Hat Satellite 6.8 (onwards)