OCS / ODF OSD removal fails with error: unknown parameter name "FORCE_OSD_REMOVAL"
Issue
- This is ODF 4.10.8, and we are trying to remove an OSD by following the steps in "Steps to replace failed OSD in Red Hat OpenShift Container Storage 4.X" or in the ODF 4.10 documentation, "Replacing devices".
- The following command creates a job and a removal pod, but the removal pod ocs-osd-removal-job keeps running and does not complete:

    oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=3 | oc create -n openshift-storage -f -

- Find out why by looking into its logs:

    oc logs ocs-osd-removal-job
    2022-12-07 13:31:16.627495 D | exec: Running command: ceph osd dump --connect-timeout=15 --cluster=openshift-storage --conf=/var/lib/rook/openshift-storage/openshift-storage.config --name=client.admin --keyring=/var/lib/rook/openshift-storage/client.admin.keyring --format json
    2022-12-07 13:31:19.427571 I | cephosd: validating status of osd.3
    2022-12-07 13:31:19.427625 I | cephosd: osd.3 is marked 'DOWN'. Removing it
    2022-12-07 13:31:19.427732 D | exec: Running command: ceph osd find 3 --connect-timeout=15 --cluster=openshift-storage --conf=/var/lib/rook/openshift-storage/openshift-storage.config --name=client.admin --keyring=/var/lib/rook/openshift-storage/client.admin.keyring --format json
    2022-12-07 13:31:20.029956 D | exec: Running command: ceph osd out osd.3 --connect-timeout=15 --cluster=openshift-storage --conf=/var/lib/rook/openshift-storage/openshift-storage.config --name=client.admin --keyring=/var/lib/rook/openshift-storage/client.admin.keyring --format json
    2022-12-07 13:31:20.658128 I | cephosd: removing the OSD deployment "rook-ceph-osd-3"
    2022-12-07 13:31:20.658174 D | op-k8sutil: removing rook-ceph-osd-3 deployment if it exists
    2022-12-07 13:31:20.658186 I | op-k8sutil: removing deployment rook-ceph-osd-3 if it exists
    2022-12-07 13:31:20.686597 I | op-k8sutil: Removed deployment rook-ceph-osd-3
    2022-12-07 13:31:20.693035 I | op-k8sutil: "rook-ceph-osd-3" still found. waiting...
    2022-12-07 13:31:22.733871 I | op-k8sutil: confirmed rook-ceph-osd-3 does not exist
    2022-12-07 13:31:22.746526 I | cephosd: removing the osd prepare job "rook-ceph-osd-prepare-6f8f4e58014c4c06de3d8d181ee62d11"
    2022-12-07 13:31:22.762293 I | cephosd: removing the OSD PVC "ocs-deviceset-volume01-0-data-10l8bdn"
    2022-12-07 13:31:22.774728 D | exec: Running command: ceph osd purge osd.3 --force --yes-i-really-mean-it --connect-timeout=15 --cluster=openshift-storage --conf=/var/lib/rook/openshift-storage/openshift-storage.config --name=client.admin --keyring=/var/lib/rook/openshift-storage/client.admin.keyring --format json

  Notice that the pod is waiting for the "ceph osd purge osd.3" command to finish.
- Delete the job with oc delete job job_name, which also deletes the removal pod, and try again with the FORCE_OSD_REMOVAL option. This may fail with the following error:

    # oc process -n openshift-storage ocs-osd-removal -p FORCE_OSD_REMOVAL=true -p FAILED_OSD_IDS=3 | oc create -f -
    error: unknown parameter name "FORCE_OSD_REMOVAL"
    error: no objects passed to create
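
The two symptoms above can be checked with a pair of small shell helpers. This is a minimal sketch and not part of the documented procedure. The function names last_exec_command and has_param are hypothetical. The sketch assumes that the removal pod logs each started command with the "exec: Running command:" prefix, as in the log above, and that `oc process --parameters` prints one parameter per line with its name in the first column.

```shell
#!/bin/sh

# Print the last command the removal job started, reading log text on stdin.
# If that command never completes, it is the likely place where the job hangs.
last_exec_command() {
  grep 'exec: Running command: ' | tail -n 1 | sed 's/.*exec: Running command: //'
}

# Succeed (exit 0) if parameter NAME appears in the first column of a
# parameter listing read on stdin; fail (exit 1) otherwise.
has_param() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example usage against a live cluster (the pod name is a placeholder):
#   oc logs -n openshift-storage <removal-pod-name> | last_exec_command
#   oc process --parameters -n openshift-storage ocs-osd-removal | has_param FORCE_OSD_REMOVAL \
#     || echo "template does not define FORCE_OSD_REMOVAL"
```

If has_param fails for FORCE_OSD_REMOVAL, the installed ocs-osd-removal template does not define that parameter, which would be consistent with the "unknown parameter name" error shown above.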
Environment
- OCS 4.8
- ODF 4.9 and higher