Ceph: MDS crash with "ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())"
Issue
The CephFS Metadata Server (MDS) daemon aborts with the following failed assertion in SimpleLock::put_xlock():
ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())
Stack trace (crash signature) from the MDS log:
2023-11-08T19:14:07.958+0000 7f0dde30b700 1 mds.ocs-storagecluster-cephfilesystem-a asok_command: status {prefix=status} (starting...)
2023-11-08T19:14:14.083+0000 7f0dd62fb700 -1 /builddir/build/BUILD/ceph-16.2.10/src/mds/SimpleLock.h: In function 'void SimpleLock::put_xlock()' thread 7f0dd62fb700 time 2023-11-08T19:14:14.082204+0000
/builddir/build/BUILD/ceph-16.2.10/src/mds/SimpleLock.h: 420: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())
ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f0de4f55534]
2: /usr/lib64/ceph/libceph-common.so.2(+0x27974e) [0x7f0de4f5574e]
3: (SimpleLock::put_xlock()+0x10e) [0x5569f77a356e]
4: (Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x94) [0x5569f7790f04]
5: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x20a) [0x5569f77915aa]
6: (Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5e) [0x5569f7791b9e]
7: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x2c2) [0x5569f75ff742]
8: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x238) [0x5569f76004a8]
9: (Server::_unlink_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*, unsigned long)+0x3ba) [0x5569f761d61a]
10: (MDSContext::complete(int)+0x203) [0x5569f78b7f73]
11: (MDSIOContextBase::complete(int)+0x6ac) [0x5569f78b871c]
12: (MDSLogContextBase::complete(int)+0x44) [0x5569f78b89a4]
13: (Finisher::finisher_thread_entry()+0x1a5) [0x7f0de4ff7a15]
14: /lib64/libpthread.so.0(+0x81ca) [0x7f0de3f341ca]
15: clone()
2023-11-08T19:14:14.085+0000 7f0dd62fb700 -1 *** Caught signal (Aborted) **
in thread 7f0dd62fb700 thread_name:MR_Finisher
ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)
1: /lib64/libpthread.so.0(+0x12cf0) [0x7f0de3f3ecf0]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f0de4f55585]
5: /usr/lib64/ceph/libceph-common.so.2(+0x27974e) [0x7f0de4f5574e]
6: (SimpleLock::put_xlock()+0x10e) [0x5569f77a356e]
7: (Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x94) [0x5569f7790f04]
8: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x20a) [0x5569f77915aa]
9: (Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5e) [0x5569f7791b9e]
10: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x2c2) [0x5569f75ff742]
11: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x238) [0x5569f76004a8]
12: (Server::_unlink_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*, unsigned long)+0x3ba) [0x5569f761d61a]
13: (MDSContext::complete(int)+0x203) [0x5569f78b7f73]
14: (MDSIOContextBase::complete(int)+0x6ac) [0x5569f78b871c]
15: (MDSLogContextBase::complete(int)+0x44) [0x5569f78b89a4]
16: (Finisher::finisher_thread_entry()+0x1a5) [0x7f0de4ff7a15]
17: /lib64/libpthread.so.0(+0x81ca) [0x7f0de3f341ca]
18: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
-9999> 2023-11-08T19:14:08.073+0000 7f0dd52f9700 5 mds.0.log _submit_thread 270645933789557~2176 : EUpdate cap update [metablob 0x100, 2 dirs]
-9998> 2023-11-08T19:14:08.073+0000 7f0dd52f9700 5 mds.0.log _submit_thread 270645933791753~2176 : EUpdate cap update [metablob 0x100, 2 dirs]
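To check whether another MDS log shows the same crash signature, it can help to reduce the backtrace to the failed assertion and the list of symbolized function names. The following is a minimal sketch, assuming the log uses the same frame layout as the excerpt above; `crash_signature` is a hypothetical helper written for illustration and is not part of any Ceph tooling.

```python
import re

# Symbolized frame, e.g. " 3: (SimpleLock::put_xlock()+0x10e) [0x5569f77a356e]".
# Frames without a symbol (library path only, or "clone()") are skipped on purpose.
FRAME_RE = re.compile(r"^\s*\d+: \((.+)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]\s*$")
# Assertion line, e.g. "SimpleLock.h: 420: FAILED ceph_assert(<expression>)".
ASSERT_RE = re.compile(r"FAILED ceph_assert\((.+)\)\s*$")


def crash_signature(log_text):
    """Return (assertion_expression, [function names]) for the first backtrace.

    Assumption: the assertion backtrace comes first and is followed by a
    "Caught signal" line that starts the signal handler's own backtrace.
    """
    assertion = None
    frames = []
    for line in log_text.splitlines():
        if "Caught signal" in line and frames:
            break  # the signal handler repeats the same frames; stop here
        match = ASSERT_RE.search(line)
        if match and assertion is None:
            assertion = match.group(1)
            continue
        match = FRAME_RE.match(line)
        if match:
            # Drop the argument list so only the qualified name remains.
            frames.append(match.group(1).split("(", 1)[0])
    return assertion, frames
```

Addresses and offsets differ between builds, so comparing only the function names (for this article: `SimpleLock::put_xlock`, `Locker::xlock_finish`, `Locker::_drop_locks`, and so on down to `Server::_unlink_local_finish`) is one reasonable way to match a log against this signature.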
Environment
Red Hat OpenShift Container Platform (OCP) 4.x
Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x
Red Hat Ceph Storage (RHCS) 7.0.x
Ceph File System (CephFS)