ODF: Rook-Ceph-MGR pod in CLBO after ODF upgrade - thread_name: (MonClient::_add_conns()+0x242)
Issue
Rook-Ceph-MGR pod in CLBO after ODF upgrade - thread_name: (MonClient::_add_conns()+0x242)
After upgrading ODF from 4.12.6 to 4.13.2, the ceph-mgr pod is in a constant CrashLoopBackOff (CLBO) state:
$ oc get pods | grep mgr
rook-ceph-mgr-a-6b9f447d68-sgw4n 1/2 CrashLoopBackOff 8 (3m22s ago) 21m
Crash Backtrace:
/usr/include/c++/11/bits/random.tcc:2667: void std::discrete_distribution<_IntType>::param_type::_M_initialize() [with _IntType = int]: Assertion '__sum > 0' failed.
*** Caught signal (Aborted) **
in thread 7f7ce49d6640 thread_name:mgr-fin
ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable)
1: /lib64/libc.so.6(+0x54df0) [0x7f7d00636df0]
2: /lib64/libc.so.6(+0xa154c) [0x7f7d0068354c]
3: raise()
4: abort()
5: /usr/lib64/ceph/libceph-common.so.2(+0x1c2f08) [0x7f7d00cccf08]
6: /usr/lib64/ceph/libceph-common.so.2(+0x444935) [0x7f7d00f4e935]
7: /usr/lib64/ceph/libceph-common.so.2(+0x4447f0) [0x7f7d00f4e7f0]
8: (MonClient::_add_conns()+0x242) [0x7f7d00f48f32]
9: (MonClient::_reopen_session(int)+0x428) [0x7f7d00f49a08]
10: (Mgr::init()+0x384) [0x55b33e92aad4]
11: ceph-mgr(+0x1ae911) [0x55b33e932911]
12: ceph-mgr(+0x11376d) [0x55b33e89776d]
13: (Finisher::finisher_thread_entry()+0x175) [0x7f7d00cfa575]
14: /lib64/libc.so.6(+0x9f802) [0x7f7d00681802]
15: /lib64/libc.so.6(+0x3f450) [0x7f7d00621450]
debug 2023-08-29T12:09:54.420+0000 7f7ce49d6640 -1 *** Caught signal (Aborted) **
in thread 7f7ce49d6640 thread_name:mgr-fin
ceph version 17.2.6-100.el9cp (ea4e3ef8df2cf26540aae06479df031dcfc80343) quincy (stable)
1: /lib64/libc.so.6(+0x54df0) [0x7f7d00636df0]
2: /lib64/libc.so.6(+0xa154c) [0x7f7d0068354c]
3: raise()
4: abort()
5: /usr/lib64/ceph/libceph-common.so.2(+0x1c2f08) [0x7f7d00cccf08]
6: /usr/lib64/ceph/libceph-common.so.2(+0x444935) [0x7f7d00f4e935]
7: /usr/lib64/ceph/libceph-common.so.2(+0x4447f0) [0x7f7d00f4e7f0]
8: (MonClient::_add_conns()+0x242) [0x7f7d00f48f32]
9: (MonClient::_reopen_session(int)+0x428) [0x7f7d00f49a08]
10: (Mgr::init()+0x384) [0x55b33e92aad4]
11: ceph-mgr(+0x1ae911) [0x55b33e932911]
12: ceph-mgr(+0x11376d) [0x55b33e89776d]
13: (Finisher::finisher_thread_entry()+0x175) [0x7f7d00cfa575]
14: /lib64/libc.so.6(+0x9f802) [0x7f7d00681802]
15: /lib64/libc.so.6(+0x3f450) [0x7f7d00621450]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Because there is no active Ceph MGR, many Ceph commands fail and ODF is not operational.
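To check whether a cluster shows the same symptoms, the pod listing and backtrace above can be gathered with something like the following minimal sketch. The helper name `clbo_pods` is hypothetical, and the `openshift-storage` namespace and `mgr` container name are assumptions about a typical ODF deployment; adjust them to your environment.

```shell
# Print the names of pods whose STATUS column is CrashLoopBackOff.
# Reads `oc get pods` output on stdin, so it can be tried without a cluster.
clbo_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Hypothetical usage against a live cluster (namespace and container name
# are assumptions, not taken from this article):
#   oc get pods -n openshift-storage | clbo_pods
#   oc logs -n openshift-storage <mgr-pod-name> -c mgr --previous | grep -A 20 'Caught signal'
```

The `--previous` flag asks for the log of the last terminated container instance, which is normally where the crash output of a pod in CLBO ends up.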
Environment
Red Hat OpenShift Container Platform (OCP) 4.x
Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x
Red Hat Ceph Storage (RHCS) 7.x