OSDs keep coring

Solution In Progress - Updated -

Issue

  • OSDs on the machine will not stay up. Sometimes they crash immediately, and sometimes they crash after tens of minutes of runtime. They crash hard, leaving behind multi-gigabyte core files.

  • Multiple rounds of OSD restarts and another soft reboot of the machine have been ineffective. Starting the OSDs one at a time with long waits between them does not help either; they just keep falling over again. Currently 9 out of 36 OSDs on the machine are up.

  • The stack trace looks like this:

0> 2015-03-09 15:05:00.482720 7f21a4658700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f21a4658700 time 2015-03-09 15:05:00.481864
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.80.8-3-g399e8dc (399e8dcef0142b46b83f702342ca63420118d5b7)
 1: (Thread::create(unsigned long)+0x8a) [0xa673da]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6c) [0xa5ddfc]
 3: (Accepter::entry()+0x218) [0xb16ac8]
 4: (()+0x7e9a) [0x7f21baee1e9a]
 5: (clone()+0x6d) [0x7f21b98994bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

   0> 2015-03-09 15:05:00.620454 7f21a4658700 -1 *** Caught signal (Aborted) **
 in thread 7f21a4658700

 ceph version 0.80.8-3-g399e8dc (399e8dcef0142b46b83f702342ca63420118d5b7)
 1: /usr/bin/ceph-osd() [0x99d83a]
 2: (()+0xfcb0) [0x7f21baee9cb0]
 3: (gsignal()+0x35) [0x7f21b97dd445]
 4: (abort()+0x17b) [0x7f21b97e0bab]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f21ba12b69d]
 6: (()+0xb5846) [0x7f21ba129846]
 7: (()+0xb5873) [0x7f21ba129873]
 8: (()+0xb596e) [0x7f21ba12996e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0xa7fbff]
 10: (Thread::create(unsigned long)+0x8a) [0xa673da]
 11: (SimpleMessenger::add_accept_pipe(int)+0x6c) [0xa5ddfc]
 12: (Accepter::entry()+0x218) [0xb16ac8]
 13: (()+0x7e9a) [0x7f21baee1e9a]
 14: (clone()+0x6d) [0x7f21b98994bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
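
The assertion in the trace fires inside Thread::create, reached from SimpleMessenger::add_accept_pipe, which means the OSD failed to start a new thread for an incoming connection. This article does not state the root cause. One possibility worth ruling out on a host with 36 OSDs is that the machine is running out of thread capacity, since on Linux every thread consumes a PID. The following is a minimal diagnostic sketch under that assumption, not a confirmed fix; the helper names (read_int, count_threads, headroom) are hypothetical and only the /proc paths are standard Linux.

```python
import os
from pathlib import Path


def read_int(path):
    """Return the first integer stored in a /proc file, or None if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def count_threads(proc_root="/proc"):
    """Count threads system-wide by summing the entries in each /proc/<pid>/task."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        # No /proc available (non-Linux host or restricted container).
        return 0
    total = 0
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            total += len(os.listdir(os.path.join(proc_root, entry, "task")))
        except OSError:
            # The process exited or is not readable; skip it.
            continue
    return total


def headroom(used, limit):
    """Return the remaining capacity, or None when the limit is unknown."""
    if limit is None:
        return None
    return limit - used


if __name__ == "__main__":
    used = count_threads()
    print(f"threads in use: {used}")
    # Assumption: these two sysctls are the limits of interest here.
    for name in ("pid_max", "threads-max"):
        limit = read_int(f"/proc/sys/kernel/{name}")
        print(f"kernel.{name}: {limit} (headroom: {headroom(used, limit)})")
```

If the thread count sits close to either limit while the OSDs are starting, thread exhaustion becomes a plausible explanation for the failed assert. If there is ample headroom, the cause lies elsewhere, for example in per-user or per-process limits that this sketch does not inspect.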

Environment

  • Red Hat Ceph Storage
