Ceph/ODF: MGR is unresponsive due to commands being blocked from the "volumes plugin" and "Ceph-MGR finisher thread".
Issue
The Ceph-MGR is unresponsive due to commands being blocked from the volumes plugin and the Ceph-MGR finisher thread.
The Ceph-MGR is not servicing requests, resulting in these symptoms:
- Running ceph osd df tree hangs and never completes.
- The output from ceph status does not match reality.
- The PG state in ceph status does not match the PG state seen in ceph pg query.
- Ceph is slow because the mgr is not responding.
- The output of ceph daemon DAEMON_NAME perf dump shows more than 1000 entries in the get_or_fail_fail counter:
"throttle-mgr_mon_messsages": {
    "val": 128,
    "max": 128,
    "get_started": 0,
    "get": 139,
    "get_sum": 139,
    "get_or_fail_fail": 10941044,    <-- Here
    "get_or_fail_success": 139,
    "take": 0,
    "take_sum": 0,
    "put": 11,
    "put_sum": 11,
    "wait": {
        "avgcount": 0,
        "sum": 0.000000000,
        "avgtime": 0.000000000
    }
},
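The perf dump check above can be automated. The following is a minimal sketch, not an official tool: it assumes the perf dump output has been captured as JSON and that it is a single object keyed by section name, as in the excerpt above. The function name find_saturated_throttles is hypothetical, the 1000 threshold is taken from the symptom described above, and scanning every section whose name starts with "throttle-" is an illustrative choice rather than documented guidance.

```python
import json


def find_saturated_throttles(perf_dump, threshold=1000):
    """Return {section: get_or_fail_fail} for throttle sections above threshold.

    perf_dump is the parsed JSON object from a perf dump (assumed to be a
    dict keyed by section name). The helper name and the "throttle-" prefix
    filter are illustrative assumptions, not part of any Ceph API.
    """
    hits = {}
    for section, counters in perf_dump.items():
        # Only look at throttle sections that are JSON objects.
        if not section.startswith("throttle-") or not isinstance(counters, dict):
            continue
        failures = counters.get("get_or_fail_fail", 0)
        if failures > threshold:
            hits[section] = failures
    return hits


# Trimmed sample based on the excerpt shown in this article.
sample = json.loads("""
{
    "throttle-mgr_mon_messsages": {
        "val": 128,
        "max": 128,
        "get": 139,
        "get_or_fail_fail": 10941044,
        "get_or_fail_success": 139
    }
}
""")

for name, count in find_saturated_throttles(sample).items():
    print(f"{name}: get_or_fail_fail={count}")
```

With the sample values, the only section reported is throttle-mgr_mon_messsages, because its get_or_fail_fail value is far above 1000. To use the sketch against a live daemon, save the perf dump output to a file and load it with json.load instead of the embedded sample.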
Environment
Red Hat OpenShift Container Platform (OCP) 4.x
Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x