Why do ceph commands take a long time to complete when one or more monitors are down?

Solution Unverified - Updated -

Issue

  • On busy clusters, when one or more monitors go down or become unreachable, all ceph commands on the cluster take noticeably longer to complete.

  • Timing 'ceph health' shows the following:

# time ceph health 
2014-09-19 11:09:19.280505 7fdd5c7c6700 0 -- :/1014736 >> AA.BB.CC.DD:6789/0 pipe(0x7fdd58022120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fdd58022390).fault
2014-09-19 11:09:25.280470 7fdd5c5c4700 0 -- AA.BB.CC.DD:0/1014736 >> EE.FF.GG.HH:6789/0 pipe(0x7fdd4c001d20 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fdd4c001f90).fault
HEALTH_WARN 10 pgs backfilling; 451 pgs peering; 465 pgs stuck inactive; 473 pgs stuck unclean; 41 requests are blocked > 32 sec; recovery 11272/7436241 objects degraded (0.152%); 1 mons down, quorum 0,1,3,4 boxen1,boxen2,boxen3,boxen4
real    0m9.380s
user    0m0.252s
sys     0m0.060s
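
The two .fault lines in the output above are timestamped about six seconds apart, which accounts for most of the 9.380s of real time, while user and sys time together stay well under one second. This suggests the command spends most of its time waiting on connection attempts rather than doing work. A minimal sketch of that arithmetic follows; the helper name ts_gap is hypothetical and not part of ceph, and it assumes both timestamps fall on the same day.

```shell
# Hypothetical helper (not a ceph command): print the number of seconds
# between two HH:MM:SS[.ffffff] timestamps taken on the same day.
ts_gap() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, ":"); split(b, y, ":")
        # Convert each timestamp to seconds since midnight, then subtract.
        printf "%.3f\n", (y[1]*3600 + y[2]*60 + y[3]) - (x[1]*3600 + x[2]*60 + x[3])
    }'
}

# Timestamps copied from the two .fault lines in the output above.
ts_gap 11:09:19.280505 11:09:25.280470   # prints 6.000
```

Comparing this gap with the real time reported by time gives a rough idea of how much of the delay comes from waiting between the logged faults.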

Environment

  • Inktank Ceph Enterprise 1.2

  • Red Hat Ceph Enterprise 1.2.3
