Why does the Ceph peering process stall when an OSD is down, leaving the cluster unable to recover to a healthy state?
Issue
- During the peering process, Ceph may require information from an OSD that is currently down or has been removed from the cluster.
- When this happens, Ceph waits for the OSD to return and peering stalls, leaving the affected placement groups in an inactive state.
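To confirm that a placement group is stalled for this reason, it can help to look at the peering section of `ceph pg <pgid> query`. The following is a minimal sketch, assuming the JSON output contains a `recovery_state` list whose entries may carry `down_osds_we_would_probe` and `peering_blocked_by` fields, as described in the upstream Ceph placement group troubleshooting documentation. The helper name `blocking_osds` and the `SAMPLE` data are hypothetical and for illustration only; the sample is not output from a real cluster, and field names may differ between Ceph releases.

```python
import json


def blocking_osds(pg_query_json: str) -> list:
    """Return the sorted ids of OSDs that a PG reports as blocking peering.

    Expects the JSON text of `ceph pg <pgid> query` (assumed layout).
    """
    query = json.loads(pg_query_json)
    osds = set()
    for state in query.get("recovery_state", []):
        # OSDs that peering wants to probe but that are currently down.
        osds.update(state.get("down_osds_we_would_probe", []))
        # Explicit list of OSDs that peering is blocked by, if present.
        for blocker in state.get("peering_blocked_by", []):
            osds.add(blocker["osd"])
    return sorted(osds)


# Hypothetical, trimmed-down sample shaped like the assumed query output.
SAMPLE = json.dumps({
    "state": "down+peering",
    "recovery_state": [
        {
            "name": "Started/Primary/Peering",
            "blocked": "peering is blocked due to down osds",
            "down_osds_we_would_probe": [1],
            "peering_blocked_by": [{"osd": 1, "current_lost_at": 0}],
        },
        {"name": "Started"},
    ],
})

if __name__ == "__main__":
    print(blocking_osds(SAMPLE))
```

On a live cluster, the same function could be fed the captured output of `ceph pg <pgid> query` for each placement group that `ceph health detail` reports as inactive.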