Ceph - QEMU crash when Ceph pool pg_num is changed
Issue
When pg_num is changed on a Ceph pool, QEMU VMs whose RBD images use a PG affected by the pg_num change crash with the following assertion failure:
osd/osd_types.cc: In function 'bool pg_t::is_split(unsigned int, unsigned int,
std::set<pg_t>*) const' thread 7fd5e7fff700 time 2015-08-20 15:38:42.272380
osd/osd_types.cc: 459: FAILED assert(m_seed < old_pg_num)
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: (()+0x15376b) [0x7fd5fd94576b]
2: (()+0x222f11) [0x7fd5fda14f11]
3: (()+0x222fed) [0x7fd5fda14fed]
4: (()+0xc5379) [0x7fd5fd8b7379]
5: (()+0xdc4bc) [0x7fd5fd8ce4bc]
6: (()+0xdcd0a) [0x7fd5fd8ced0a]
7: (()+0xde272) [0x7fd5fd8d0272]
8: (()+0xe3fef) [0x7fd5fd8d5fef]
9: (()+0x2c3ba9) [0x7fd5fdab5ba9]
10: (()+0x2f15cd) [0x7fd5fdae35cd]
11: (()+0x8182) [0x7fd5f946c182]
12: (clone()+0x6d) [0x7fd5f919947d]
I found a Ceph bug (http://tracker.ceph.com/issues/10399) that describes this exact issue, and it looks like a fix was recently merged to master. This ticket is to express our interest in having that fix quickly backported to Hammer.
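For context, the assertion that fails lives in pg_t::is_split in osd/osd_types.cc: it requires that the PG's seed be smaller than the old pg_num passed in, which a client working from a stale osdmap can violate during a pg_num change. The following is a minimal Python sketch of that precondition, not the real Ceph C++ code; the child-seed computation is a hypothetical simplification for illustration only:

```python
# Simplified model of the check that fails in osd/osd_types.cc
# (pg_t::is_split). Only the violated precondition is faithful to the
# backtrace above; the split computation itself is a rough sketch.

def is_split(seed, old_pg_num, new_pg_num):
    """Return True if the PG with this seed would split when pg_num
    grows from old_pg_num to new_pg_num."""
    # This is the precondition whose failure aborts QEMU:
    assert seed < old_pg_num, "FAILED assert(m_seed < old_pg_num)"
    # Hypothetical simplification: the PG splits if it gains at least
    # one child seed in [old_pg_num, new_pg_num).
    return any(child % old_pg_num == seed
               for child in range(old_pg_num, new_pg_num))

# Normal case: pg_num raised 8 -> 16; seed 3 gains child 11 and splits.
assert is_split(3, 8, 16)

# Crash case: calling with an old_pg_num that is not larger than the
# seed (e.g. via a stale map during the change) trips the assert,
# mirroring the abort in the backtrace above.
try:
    is_split(9, 8, 16)
except AssertionError as e:
    print(e)
```

Running the last call prints the same assertion message seen in the crash, illustrating why the abort fires only while a pg_num change is in flight.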
Environment
- Red Hat Ceph Storage 1.3
