Why does the Ceph peering process stall when an OSD is down, preventing the cluster from returning to a healthy state?
Issue
- During the peering process, Ceph may require information from an OSD that is currently down or has been removed from the cluster.
- When this happens, Ceph waits for the OSD to return and the peering process stalls, leaving the affected placement groups in an inactive state.
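
To see which OSD a stalled placement group is waiting on, the JSON printed by `ceph pg <pgid> query` can be inspected. The following is a minimal sketch, not a supported tool: the helper name `blocking_osds` is hypothetical, and the embedded sample is a hand-written, trimmed illustration of the `recovery_state` fields (`down_osds_we_would_probe`, `peering_blocked_by`) that upstream Ceph troubleshooting documentation shows for a blocked PG. Field names may differ between Ceph releases.

```python
import json

# Illustrative sample only (assumption): a trimmed fragment shaped like the
# JSON that `ceph pg <pgid> query` prints for a PG whose peering is blocked.
SAMPLE_PG_QUERY = json.loads("""
{
  "state": "down+peering",
  "recovery_state": [
    {
      "name": "Started/Primary/Peering",
      "blocked": "peering is blocked due to down osds",
      "down_osds_we_would_probe": [1],
      "peering_blocked_by": [
        {"osd": 1,
         "current_lost_at": 0,
         "comment": "starting or marking this osd lost may let us proceed"}
      ]
    }
  ]
}
""")


def blocking_osds(pg_query):
    """Return the sorted OSD ids that the PG reports as blocking peering.

    Hypothetical helper: it only reads the two fields named above and
    returns an empty list when neither is present.
    """
    osds = set()
    for state in pg_query.get("recovery_state", []):
        # OSDs that are down but that peering would need to probe.
        osds.update(state.get("down_osds_we_would_probe", []))
        # OSDs explicitly listed as blocking the peering process.
        osds.update(entry["osd"] for entry in state.get("peering_blocked_by", []))
    return sorted(osds)


print(blocking_osds(SAMPLE_PG_QUERY))
```

On a live cluster the same function could be applied to the parsed output of the query command for an inactive PG. An empty result suggests the PG is inactive for some other reason than a down OSD.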
