ODF: Rook-Ceph-MGR pod in CLBO after ODF upgrade - thread_name: (MonClient::_add_conns()+0x242)

Solution Verified - Updated -

Issue

The rook-ceph-mgr pod goes into CrashLoopBackOff (CLBO) after an ODF upgrade. The crash occurs in thread mgr-fin, and the backtrace includes the frame MonClient::_add_conns()+0x242.

After an ODF upgrade from 4.12.6 to 4.13.2, the ceph-mgr pod is in a constant CLBO state:

$ oc get pods | grep mgr
rook-ceph-mgr-a-6b9f447d68-sgw4n                                  1/2     CrashLoopBackOff   8 (3m22s ago)   21m
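
The check above can be narrowed to pods that are actually crash-looping. The following is a minimal sketch, not part of the original solution: the helper name `list_clbo_pods` is hypothetical, and it assumes the default `oc get pods` table layout (NAME, READY, STATUS, ...). The namespace `openshift-storage` and the container name `mgr` in the usage comments are also assumptions and may differ in a given cluster.

```shell
# Hypothetical helper: print the names of pods whose STATUS column is
# CrashLoopBackOff. Reads default `oc get pods` table output on stdin.
list_clbo_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Usage (namespace and container name are assumptions):
#   oc get pods -n openshift-storage | list_clbo_pods
#   oc logs -n openshift-storage <pod-name> -c mgr --previous
```

The `--previous` flag of `oc logs` returns the log of the last terminated container instance, which is typically where the crash backtrace ends up for a pod that keeps restarting.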

Crash Backtrace:

/usr/include/c++/11/bits/random.tcc:2667: void std::discrete_distribution<_IntType>::param_type::_M_initialize() [with _IntType = int]: Assertion '__sum > 0' failed.
*** Caught signal (Aborted) **
 in thread 7f7ce49d6640 thread_name:mgr-fin
 ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable)
 1: /lib64/libc.so.6(+0x54df0) [0x7f7d00636df0]
 2: /lib64/libc.so.6(+0xa154c) [0x7f7d0068354c]
 3: raise()
 4: abort()
 5: /usr/lib64/ceph/libceph-common.so.2(+0x1c2f08) [0x7f7d00cccf08]
 6: /usr/lib64/ceph/libceph-common.so.2(+0x444935) [0x7f7d00f4e935]
 7: /usr/lib64/ceph/libceph-common.so.2(+0x4447f0) [0x7f7d00f4e7f0]
 8: (MonClient::_add_conns()+0x242) [0x7f7d00f48f32]
 9: (MonClient::_reopen_session(int)+0x428) [0x7f7d00f49a08]
 10: (Mgr::init()+0x384) [0x55b33e92aad4]
 11: ceph-mgr(+0x1ae911) [0x55b33e932911]
 12: ceph-mgr(+0x11376d) [0x55b33e89776d]
 13: (Finisher::finisher_thread_entry()+0x175) [0x7f7d00cfa575]
 14: /lib64/libc.so.6(+0x9f802) [0x7f7d00681802]
 15: /lib64/libc.so.6(+0x3f450) [0x7f7d00621450]
debug 2023-08-29T12:09:54.420+0000 7f7ce49d6640 -1 *** Caught signal (Aborted) **
 in thread 7f7ce49d6640 thread_name:mgr-fin

 ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable)
 1: /lib64/libc.so.6(+0x54df0) [0x7f7d00636df0]
 2: /lib64/libc.so.6(+0xa154c) [0x7f7d0068354c]
 3: raise()
 4: abort()
 5: /usr/lib64/ceph/libceph-common.so.2(+0x1c2f08) [0x7f7d00cccf08]
 6: /usr/lib64/ceph/libceph-common.so.2(+0x444935) [0x7f7d00f4e935]
 7: /usr/lib64/ceph/libceph-common.so.2(+0x4447f0) [0x7f7d00f4e7f0]
 8: (MonClient::_add_conns()+0x242) [0x7f7d00f48f32]
 9: (MonClient::_reopen_session(int)+0x428) [0x7f7d00f49a08]
 10: (Mgr::init()+0x384) [0x55b33e92aad4]
 11: ceph-mgr(+0x1ae911) [0x55b33e932911]
 12: ceph-mgr(+0x11376d) [0x55b33e89776d]
 13: (Finisher::finisher_thread_entry()+0x175) [0x7f7d00cfa575]
 14: /lib64/libc.so.6(+0x9f802) [0x7f7d00681802]
 15: /lib64/libc.so.6(+0x3f450) [0x7f7d00621450]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Because there is no active Ceph MGR, many Ceph commands fail and ODF is not operational.
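
To compare a saved mgr log against the signature shown above, a small filter can look for both the `__sum > 0` assertion and the `MonClient::_add_conns()` frame. This is an illustrative sketch only: the function name `matches_mgr_crash_signature` is hypothetical, and matching these two strings suggests, but does not prove, that a crash is the same issue.

```shell
# Hypothetical helper: succeed (exit 0) only if the log on stdin contains
# both the libstdc++ assertion and the MonClient::_add_conns() frame
# seen in the backtrace above.
matches_mgr_crash_signature() {
  awk '
    /Assertion .__sum > 0. failed/ { assert = 1 }   # "." matches the quote characters
    /MonClient::_add_conns\(\)/    { frame = 1 }
    END { exit !(assert && frame) }
  '
}

# Usage (namespace and container name are assumptions):
#   oc logs -n openshift-storage <pod-name> -c mgr --previous \
#     | matches_mgr_crash_signature && echo "signature matches"
```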

Environment

Red Hat OpenShift Container Platform (OCP) 4.x
Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x
Red Hat Ceph Storage (RHCS) 7.x
