When pg_num is changed on a Ceph pool, QEMU VMs that are using a PG affected by the pg_num change crash with the following assertion failure:
osd/osd_types.cc: In function 'bool pg_t::is_split(unsigned int, unsigned int, std::set<pg_t>*) const' thread 7fd5e7fff700 time 2015-08-20 15:38:42.272380
osd/osd_types.cc: 459: FAILED assert(m_seed < old_pg_num)
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: (()+0x15376b) [0x7fd5fd94576b]
2: (()+0x222f11) [0x7fd5fda14f11]
3: (()+0x222fed) [0x7fd5fda14fed]
4: (()+0xc5379) [0x7fd5fd8b7379]
5: (()+0xdc4bc) [0x7fd5fd8ce4bc]
6: (()+0xdcd0a) [0x7fd5fd8ced0a]
7: (()+0xde272) [0x7fd5fd8d0272]
8: (()+0xe3fef) [0x7fd5fd8d5fef]
9: (()+0x2c3ba9) [0x7fd5fdab5ba9]
10: (()+0x2f15cd) [0x7fd5fdae35cd]
11: (()+0x8182) [0x7fd5f946c182]
12: (clone()+0x6d) [0x7fd5f919947d]
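To make the failed check in the trace concrete, here is a minimal Python sketch of the asserted condition only, m_seed < old_pg_num. It is not the Ceph implementation of pg_t::is_split, and it does not model how the client ends up holding such a seed. The function name check_split_precondition and the pool sizes 64 and 128 are hypothetical values chosen for illustration.

```python
def check_split_precondition(m_seed: int, old_pg_num: int) -> bool:
    """Return True when the condition asserted in the trace holds.

    Mirrors only the expression `m_seed < old_pg_num`; the real
    pg_t::is_split does more work after this check.
    """
    return m_seed < old_pg_num


# Hypothetical example: a pool grown from 64 to 128 placement groups.
old_pg_num, new_pg_num = 64, 128

# Every seed valid under the new pg_num but not under the old one
# would violate the asserted condition if checked against old_pg_num.
violating = [
    seed
    for seed in range(new_pg_num)
    if not check_split_precondition(seed, old_pg_num)
]

if __name__ == "__main__":
    print(f"{len(violating)} seeds violate the check, "
          f"from {violating[0]} to {violating[-1]}")
```

Under these assumed numbers, any PG seed that only exists after the increase fails the check when it is compared against the old pg_num, which is the shape of the failure reported above.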
I found a Ceph bug (http://tracker.ceph.com/issues/10399) that describes this exact issue, and it looks like a fix was just merged to master. This ticket is to express our interest in seeing that fix backported to Hammer quickly.
- Red Hat Ceph Storage 1.3