Why does the Ceph peering process stall when an OSD is down, leaving the cluster unable to recover to a healthy state?

Solution In Progress - Updated -

Issue

  • During the peering process, Ceph may require information from an OSD that is currently down or has been removed from the cluster.

  • When this happens, Ceph waits for the OSD to return and the peering process stalls, leaving the affected placement groups in an inactive state.
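
One way to see which OSD a stalled placement group is waiting for is to inspect the output of `ceph pg <pgid> query`. The following is a minimal sketch, not a definitive tool: it assumes the query output is JSON containing a `recovery_state` list whose peering entries may carry `down_osds_we_would_probe` and `peering_blocked_by` fields, as shown in the upstream Ceph troubleshooting documentation. The helper names `blocking_osds` and `query_pg` are hypothetical and chosen only for this illustration.

```python
import json
import subprocess


def blocking_osds(pg_query):
    """Return the sorted OSD ids that a PG's peering state reports as blocking.

    Assumes `pg_query` is the parsed JSON from `ceph pg <pgid> query`.
    """
    blocked = set()
    for state in pg_query.get("recovery_state", []):
        # Only peering states are expected to carry the fields of interest.
        if "Peering" not in state.get("name", ""):
            continue
        # Down OSDs that Ceph would like to probe for this PG.
        blocked.update(state.get("down_osds_we_would_probe", []))
        # OSDs explicitly listed as blocking peering.
        for entry in state.get("peering_blocked_by", []):
            blocked.add(entry["osd"])
    return sorted(blocked)


def query_pg(pgid):
    """Run `ceph pg <pgid> query` and parse its JSON output (hypothetical helper)."""
    result = subprocess.run(
        ["ceph", "pg", pgid, "query"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

On a cluster node with admin credentials, `blocking_osds(query_pg("<pgid>"))` would list the OSD ids that the placement group reports as preventing peering. Placement groups that are stuck can be listed first with `ceph pg dump_stuck inactive`.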
