Ceph cluster upgrade fails with error "failed to schedule mon "d". failed to schedule canary pod(s)" when upgrading internal ODF
Issue
- Upgrading internal ODF fails and the StorageCluster is stuck in "phase: Error" with the condition below:
  - lastHeartbeatTime: "2025-09-04T06:02:06Z"
    lastTransitionTime: "2025-09-04T06:01:37Z"
    message: 'CephCluster error: failed to create cluster: failed to start ceph
      monitors: failed to assign pods to mons: failed to schedule mons'
    reason: ClusterStateError
    status: "True"
    type: Degraded
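The degraded condition can be read directly from the StorageCluster and CephCluster resources. A minimal diagnostic sketch, assuming the default internal-mode resource name ocs-storagecluster and the openshift-storage namespace:
# Show the StorageCluster status conditions (resource name assumed to be the default)
$ oc get storagecluster ocs-storagecluster -n openshift-storage -o yaml
# Check the CephCluster phase and message reported by the rook-ceph operator
$ oc get cephcluster -n openshift-storage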
- The pod rook-ceph-mon-d-canary is Pending with the scheduling error below, while the pod rook-ceph-mon-d is running normally:
$ oc describe pod rook-ceph-mon-d-canary-76475b4f55-qkrb6 -n openshift-storage
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m20s default-scheduler 0/8 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 3 node(s) didn't match pod anti-affinity rules, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/8 nodes are available: 3 node(s) didn't match pod anti-affinity rules, 5 Preemption is not helpful for scheduling.
Warning FailedScheduling 2m16s (x2 over 2m18s) default-scheduler 0/8 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 3 node(s) didn't match pod anti-affinity rules, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/8 nodes are available: 3 node(s) didn't match pod anti-affinity rules, 5 Preemption is not helpful for scheduling.
$ oc get po rook-ceph-mon-d-7bbb948b79-n7vzk
NAME READY STATUS RESTARTS AGE
rook-ceph-mon-d-7bbb948b79-n7vzk 2/2 Running 0 23h
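The FailedScheduling events point at node affinity/selector, pod anti-affinity, and master taints. The sketch below compares the canary deployment's placement rules with the labels and taints on the storage nodes; the deployment name is taken from the pending pod above, <node-name> is a placeholder, and the cluster.ocs.openshift.io/openshift-storage label is assumed to be the label ODF uses to select storage nodes:
# Inspect the placement (affinity/anti-affinity and tolerations) on the canary deployment
$ oc get deployment rook-ceph-mon-d-canary -n openshift-storage -o yaml | grep -A 20 affinity
# List the nodes labeled for ODF storage, with their labels
$ oc get nodes -l cluster.ocs.openshift.io/openshift-storage --show-labels
# Check the taints on a candidate node
$ oc describe node <node-name> | grep -i taint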
Environment
- Red Hat OpenShift Data Foundation 4.x
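To confirm the exact ODF operator versions involved in the upgrade, the installed CSVs can be listed (a sketch assuming the default openshift-storage namespace):
$ oc get csv -n openshift-storage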