ODF: MDS pods in CrashLoopBackOff (CLBO) with "EMetaBlob.replay" and "sessionmap" in the traceback


Issue

MDS pods are in CrashLoopBackOff (CLBO), and "EMetaBlob.replay" and "sessionmap" appear in the traceback.

The Ceph MDS service is in CrashLoopBackOff (CLBO). To continue with this article, the crash signature must be similar to the example below.

$ oc get pods | grep rook-ceph-mds-ocs
NAME                                                             READY  STATUS   RESTARTS  AGE
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-b9bd569fbdkk5  1/2    Running  384       1d
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-5bbccc9d88zs5  1/2    Running  383       1d

$ oc get events
Pod openshift-storage/rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-5bbccc9d88zs5 (mds) is in waiting state (reason: "CrashLoopBackOff")
Pod openshift-storage/rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-b9bd569fbdkk5 (mds) is in waiting state (reason: "CrashLoopBackOff")
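The full traceback can usually be retrieved from the logs of the crashing mds container. A minimal example, assuming the pod names shown above and that the Ceph daemon container inside the MDS pod is named mds (verify the container name in the pod spec if it differs):

$ oc logs -n openshift-storage rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-b9bd569fbdkk5 -c mds
$ oc logs -n openshift-storage rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-b9bd569fbdkk5 -c mds --previous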

Crash signature:
Note the EMetaBlob.replay sessionmap assertion in the traceback:

$ ceph crash ls
$ ceph crash info <crash-id>

debug 2023-01-13 01:22:37.619 7fddbca74700  1 mds.0.175006  waiting for osdmap 47157 (which blacklists prior instance)
debug 2023-01-13 01:22:37.638 7fddb6267700  0 mds.0.cache creating system inode with ino:0x100
debug 2023-01-13 01:22:37.638 7fddb6267700  0 mds.0.cache creating system inode with ino:0x1
/builddir/build/BUILD/ceph-14.2.11/src/mds/journal.cc: In function 'void EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread 7fddb4a64700 time 2023-01-13 01:22:37.721236
/builddir/build/BUILD/ceph-14.2.11/src/mds/journal.cc: 1551: FAILED ceph_assert(g_conf()->mds_wipe_sessions)
debug 2023-01-13 01:22:37.719 7fddb4a64700 -1 log_channel(cluster) log [ERR] : EMetaBlob.replay sessionmap v 145397255 - 1 > table 0
 ceph version 14.2.11-208.el8cp (6738ba96f296a41c24357c12e8d594fbde457abc) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x156) [0x7fddc6847308]
 2: (()+0x275522) [0x7fddc6847522]
 3: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x6b54) [0x55b8999231b4]
 4: (EUpdate::replay(MDSRank*)+0x40) [0x55b899925740]
 5: (MDLog::_replay_thread()+0xbee) [0x55b8998c49ae]
 6: (MDLog::ReplayThread::entry()+0x11) [0x55b8996299c1]
 7: (()+0x817a) [0x7fddc462717a]
 8: (clone()+0x43) [0x7fddc313edc3]
*** Caught signal (Aborted) **
 in thread 7fddb4a64700 thread_name:md_log_replay
debug 2023-01-13 01:22:37.720 7fddb4a64700 -1 /builddir/build/BUILD/ceph-14.2.11/src/mds/journal.cc: In function 'void EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread 7fddb4a64700 time 2023-01-13 01:22:37.721236
/builddir/build/BUILD/ceph-14.2.11/src/mds/journal.cc: 1551: FAILED ceph_assert(g_conf()->mds_wipe_sessions)
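The ceph crash commands above must be run from a pod with access to the Ceph admin keyring, typically the rook-ceph toolbox. A minimal sketch, assuming the toolbox has been enabled in the openshift-storage namespace and carries the app=rook-ceph-tools label:

$ TOOLS_POD=$(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)
$ oc rsh -n openshift-storage $TOOLS_POD
sh-4.4$ ceph crash ls
sh-4.4$ ceph crash info <crash-id>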

Environment

Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Container Platform (OCP) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Red Hat Ceph Storage (RHCS) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x
