Ceph: MDS crash with "ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())"

Solution Verified

Issue

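The Ceph Metadata Server (MDS) crashes with the following assertion failure in SimpleLock::put_xlock() (src/mds/SimpleLock.h, line 420). In the trace below, the assertion is hit on the MR_Finisher thread while the MDS drops locks after completing an unlink request (Server::_unlink_local_finish):

ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())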

Stack trace (crash signature) from the MDS log:

2023-11-08T19:14:07.958+0000 7f0dde30b700  1 mds.ocs-storagecluster-cephfilesystem-a asok_command: status {prefix=status} (starting...)
2023-11-08T19:14:14.083+0000 7f0dd62fb700 -1 /builddir/build/BUILD/ceph-16.2.10/src/mds/SimpleLock.h: In function 'void SimpleLock::put_xlock()' thread 7f0dd62fb700 time 2023-11-08T19:14:14.082204+0000
/builddir/build/BUILD/ceph-16.2.10/src/mds/SimpleLock.h: 420: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())

 ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f0de4f55534]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x27974e) [0x7f0de4f5574e]
 3: (SimpleLock::put_xlock()+0x10e) [0x5569f77a356e]
 4: (Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x94) [0x5569f7790f04]
 5: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x20a) [0x5569f77915aa]
 6: (Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5e) [0x5569f7791b9e]
 7: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x2c2) [0x5569f75ff742]
 8: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x238) [0x5569f76004a8]
 9: (Server::_unlink_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*, unsigned long)+0x3ba) [0x5569f761d61a]
 10: (MDSContext::complete(int)+0x203) [0x5569f78b7f73]
 11: (MDSIOContextBase::complete(int)+0x6ac) [0x5569f78b871c]
 12: (MDSLogContextBase::complete(int)+0x44) [0x5569f78b89a4]
 13: (Finisher::finisher_thread_entry()+0x1a5) [0x7f0de4ff7a15]
 14: /lib64/libpthread.so.0(+0x81ca) [0x7f0de3f341ca]
 15: clone()

2023-11-08T19:14:14.085+0000 7f0dd62fb700 -1 *** Caught signal (Aborted) **
 in thread 7f0dd62fb700 thread_name:MR_Finisher

 ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12cf0) [0x7f0de3f3ecf0]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f0de4f55585]
 5: /usr/lib64/ceph/libceph-common.so.2(+0x27974e) [0x7f0de4f5574e]
 6: (SimpleLock::put_xlock()+0x10e) [0x5569f77a356e]
 7: (Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x94) [0x5569f7790f04]
 8: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x20a) [0x5569f77915aa]
 9: (Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5e) [0x5569f7791b9e]
 10: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x2c2) [0x5569f75ff742]
 11: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x238) [0x5569f76004a8]
 12: (Server::_unlink_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*, unsigned long)+0x3ba) [0x5569f761d61a]
 13: (MDSContext::complete(int)+0x203) [0x5569f78b7f73]
 14: (MDSIOContextBase::complete(int)+0x6ac) [0x5569f78b871c]
 15: (MDSLogContextBase::complete(int)+0x44) [0x5569f78b89a4]
 16: (Finisher::finisher_thread_entry()+0x1a5) [0x7f0de4ff7a15]
 17: /lib64/libpthread.so.0(+0x81ca) [0x7f0de3f341ca]
 18: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
 -9999> 2023-11-08T19:14:08.073+0000 7f0dd52f9700  5 mds.0.log _submit_thread 270645933789557~2176 : EUpdate cap update [metablob 0x100, 2 dirs]
 -9998> 2023-11-08T19:14:08.073+0000 7f0dd52f9700  5 mds.0.log _submit_thread 270645933791753~2176 : EUpdate cap update [metablob 0x100, 2 dirs]
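
To check whether a collected MDS log contains the crash signature shown above, a minimal sketch along the following lines may help. It is not a Red Hat or Ceph tool: the function names `find_assert_failures` and `is_put_xlock_signature` are hypothetical, and the regular expression assumes the "FAILED ceph_assert" line has the same layout as in the trace above (source path, line number, then the asserted condition).

```python
import re

# Assumed layout, taken from the trace above:
#   <source path>: <line number>: FAILED ceph_assert(<condition>)
ASSERT_RE = re.compile(
    r"^(?P<path>\S+): (?P<line>\d+): FAILED ceph_assert\((?P<cond>.*)\)\s*$"
)

# Two lines copied from the trace above, used as sample input.
SAMPLE = """\
/builddir/build/BUILD/ceph-16.2.10/src/mds/SimpleLock.h: 420: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())
 3: (SimpleLock::put_xlock()+0x10e) [0x5569f77a356e]
"""


def find_assert_failures(log_text):
    """Return one dict per 'FAILED ceph_assert' line found in log_text."""
    failures = []
    for raw in log_text.splitlines():
        match = ASSERT_RE.match(raw.strip())
        if match:
            failures.append(
                {
                    "path": match.group("path"),
                    "line": int(match.group("line")),
                    "condition": match.group("cond"),
                }
            )
    return failures


def is_put_xlock_signature(log_text):
    """Heuristic (hypothetical helper): True if the log contains an assert
    failure from SimpleLock.h that checks the xlock states, together with a
    backtrace frame naming SimpleLock::put_xlock()."""
    has_frame = any(
        "SimpleLock::put_xlock()" in line for line in log_text.splitlines()
    )
    has_assert = any(
        f["path"].endswith("src/mds/SimpleLock.h")
        and "LOCK_XLOCKDONE" in f["condition"]
        for f in find_assert_failures(log_text)
    )
    return has_frame and has_assert
```

Calling `is_put_xlock_signature()` on the text of an MDS log returns True only when both the assertion line and the put_xlock frame are present. The source line number (420 here) is deliberately not part of the match, since it may differ between Ceph builds.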

Environment

Red Hat OpenShift Container Platform (OCP) 4.x
Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x
Red Hat Ceph Storage (RHCS) 7.0.x
Ceph File System (CephFS)
