Ceph - Librbd-backed QEMU instances spontaneously crash with: FAILED assert(m_ictx->owner_lock.is_locked())
Issue
- Since upgrading from upstream Hammer 0.94.2 to 0.94.4, librbd-backed QEMU instances spontaneously crash with the following failure: FAILED assert(m_ictx->owner_lock.is_locked()).
- Logs from the QEMU instance report the following:
librbd/LibrbdWriteback.cc: In function 'virtual ceph_tid_t librbd::LibrbdWriteback::write(const object_t&, const object_locator_t&, uint64_t, uint64_t, const SnapContext&, const bufferlist&, utime_t, uint64_t, __u32, Context*)' thread 7f28edffb700 time 2015-10-20 11:49:08.120786
librbd/LibrbdWriteback.cc: 160: FAILED assert(m_ictx->owner_lock.is_locked())
ceph version 0.94.4 (95292699291242794510b39ffde3f4df67898d3a)
1: (()+0x17258b) [0x7f291798858b]
2: (()+0xa9573) [0x7f29178bf573]
3: (()+0x3a90ca) [0x7f2917bbf0ca]
4: (()+0x3b583d) [0x7f2917bcb83d]
5: (()+0x7212c) [0x7f291788812c]
6: (()+0x9590f) [0x7f29178ab90f]
7: (()+0x969a3) [0x7f29178ac9a3]
8: (()+0x4782a) [0x7f291785d82a]
9: (()+0x56599) [0x7f291786c599]
10: (()+0x7284e) [0x7f291788884e]
11: (()+0x162b7e) [0x7f2917978b7e]
12: (()+0x163c10) [0x7f2917979c10]
13: (()+0x8182) [0x7f2910e66182]
14: (clone()+0x6d) [0x7f2910b9347d]
- A core dump from the crashed QEMU process contains the following backtrace (a minimal sketch of the locking precondition being violated follows the backtrace):
#0 0x00007fb7f95cbcc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fb7f95cf0d8 in __GI_abort () at abort.c:89
#2 0x00007fb7f7d12535 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007fb7f7d106d6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007fb7f7d10703 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fb7f7d10922 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fb800484778 in ceph::__ceph_assert_fail (assertion=<optimized out>, file=<optimized out>, line=160,
func=0x7fb80072ae80 <librbd::LibrbdWriteback::write(object_t const&, object_locator_t const&, unsigned long, unsigned long, SnapContext const&, ceph::buffer::list const&, utime_t, unsigned long, unsigned int, Context*)::__PRETTY_FUNCTION__> "virtual ceph_tid_t librbd::LibrbdWriteback::write(const object_t&, const object_locator_t&, uint64_t, uint64_t, const SnapContext&, const bufferlist&, utime_t, uint64_t, __u32, Context*)") at common/assert.cc:77
#7 0x00007fb8003bb573 in librbd::LibrbdWriteback::write (this=0x7fb805415600, oid=..., oloc=..., off=off@entry=1048576, len=len@entry=4096, snapc=..., bl=..., mtime=...,
trunc_size=trunc_size@entry=0, trunc_seq=trunc_seq@entry=0, oncommit=oncommit@entry=0x7fb7d00aecc0) at librbd/LibrbdWriteback.cc:160
#8 0x00007fb8006bb0ca in ObjectCacher::bh_write (this=this@entry=0x7fb805415ff0, bh=bh@entry=0x7fb7d00312c0) at osdc/ObjectCacher.cc:847
#9 0x00007fb8006c783d in ObjectCacher::_readx (this=0x7fb805415ff0, rd=0x7fb7d0077480, oset=0x7fb805416a80, onfinish=0x7fb7d006db20, external_call=true)
at osdc/ObjectCacher.cc:1108
#10 0x00007fb80038412c in librbd::ImageCtx::aio_read_from_cache (this=this@entry=0x7fb805414740, o=..., object_no=object_no@entry=0, bl=bl@entry=0x7fb7d00013d0,
len=len@entry=4096, off=off@entry=1671168, onfinish=onfinish@entry=0x7fb7d006db20, fadvise_flags=fadvise_flags@entry=0) at librbd/ImageCtx.cc:614
#11 0x00007fb8003a790f in librbd::aio_read (ictx=ictx@entry=0x7fb805414740, image_extents=..., buf=buf@entry=0x7fb80547a600 "\350= ", pbl=pbl@entry=0x0,
c=c@entry=0x7fb80558d690, op_flags=op_flags@entry=0) at librbd/internal.cc:3627
#12 0x00007fb8003a89a3 in librbd::aio_read (ictx=0x7fb805414740, off=1671168, len=4096, buf=0x7fb80547a600 "\350= ", bl=0x0, c=0x7fb80558d690, op_flags=0)
at librbd/internal.cc:3491
#13 0x00007fb80035982a in (anonymous namespace)::C_AioReadWQ::finish (this=<optimized out>, r=<optimized out>) at librbd/librbd.cc:67
#14 0x00007fb800368599 in Context::complete (this=0x7fb8054e0e70, r=<optimized out>) at ./include/Context.h:65
#15 0x00007fb80038484e in ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_process (this=0x7fb805417220, handle=...)
at ./common/WorkQueue.h:191
#16 0x00007fb800474b7e in ThreadPool::worker (this=0x7fb805416c80, wt=0x7fb805416f90) at common/WorkQueue.cc:128
#17 0x00007fb800475c10 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:318
#18 0x00007fb7f9962182 in start_thread (arg=0x7fb7d6ffd700) at pthread_create.c:312
#19 0x00007fb7f968f47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
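The assertion that aborts QEMU guards a locking precondition: LibrbdWriteback::write() expects its caller to already hold the image's owner_lock. In the backtrace above the writeback is instead triggered from the cached read path (ObjectCacher::_readx flushing a dirty buffer via bh_write), which reaches LibrbdWriteback::write() without that lock held. The snippet below is a minimal illustrative sketch of that precondition, not Ceph source: ImageCtxStub, WritebackStub, and the owner_lock_holders counter are hypothetical stand-ins for librbd's ImageCtx and its RWLock, whose is_locked() has no direct std::shared_mutex equivalent.

// Minimal illustrative sketch (not Ceph source code); compile with: g++ -std=c++17
#include <cassert>
#include <cstdio>
#include <mutex>
#include <shared_mutex>

struct ImageCtxStub {
  std::shared_mutex owner_lock;   // stand-in for librbd's ImageCtx::owner_lock (RWLock)
  int owner_lock_holders = 0;     // emulates RWLock::is_locked(), which std::shared_mutex lacks
  bool owner_lock_is_locked() const { return owner_lock_holders > 0; }
};

struct WritebackStub {
  ImageCtxStub *m_ictx;
  // Mirrors the contract asserted at librbd/LibrbdWriteback.cc:160:
  // the caller must already hold m_ictx->owner_lock before issuing a writeback.
  void write() {
    assert(m_ictx->owner_lock_is_locked());   // the "FAILED assert" seen in the QEMU log
    std::printf("writeback issued under owner_lock\n");
  }
};

int main() {
  ImageCtxStub ictx;
  WritebackStub wb{&ictx};

  // Well-behaved caller: takes owner_lock (shared) before triggering writeback.
  {
    std::shared_lock<std::shared_mutex> l(ictx.owner_lock);
    ++ictx.owner_lock_holders;
    wb.write();                               // succeeds
    --ictx.owner_lock_holders;
  }

  // Faulty caller: a cache read that flushes a dirty buffer without taking the
  // lock first (analogous to frames #7-#10 above) trips the assertion.
  wb.write();                                 // aborts with SIGABRT
  return 0;
}

Built with assertions enabled, the second call aborts the process with SIGABRT, which corresponds to frames #0-#6 (raise, abort, and ceph::__ceph_assert_fail) of the core dump above.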
Environment
- Ceph Hammer v0.94.4
- Ceph v9.x (Infernalis)
- librbd clients