Why do ceph commands take time to complete when one or more monitors are down?

Solution Unverified - Updated -

Issue

  • On busy clusters, when one or more monitors go down or become unreachable, all ceph commands on the cluster take noticeably longer to complete.

  • Timing 'ceph health' shows output similar to the following:

# time ceph health 
2014-09-19 11:09:19.280505 7fdd5c7c6700 0 -- :/1014736 >> AA.BB.CC.DD:6789/0 pipe(0x7fdd58022120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fdd58022390).fault
2014-09-19 11:09:25.280470 7fdd5c5c4700 0 -- AA.BB.CC.DD:0/1014736 >> EE.FF.GG.HH:6789/0 pipe(0x7fdd4c001d20 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fdd4c001f90).fault
HEALTH_WARN 10 pgs backfilling; 451 pgs peering; 465 pgs stuck inactive; 473 pgs stuck unclean; 41 requests are blocked > 32 sec; recovery 11272/7436241 objects degraded (0.152%); 1 mons down, quorum 0,1,3,4 boxen1,boxen2,boxen3,boxen4
real    0m9.380s
user    0m0.252s
sys     0m0.060s
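
As a rough aid for reading output like the above, the following sketch extracts the peer address from each `.fault` line. These appear to be the monitor endpoints (port 6789) that the client tried and could not reach. The function name `fault_peers` is hypothetical, and the pattern assumes the log line format shown in this article; other Ceph releases may format these lines differently.

```shell
#!/bin/sh
# fault_peers (hypothetical helper): read captured ceph client output on stdin
# and print each distinct peer address that appears in a ".fault" line.
# Assumes the line format shown in this article: "... >> ADDR pipe(...).fault"
fault_peers() {
    sed -n 's/.* >> \([^ ]*\) pipe(.*\.fault.*/\1/p' | sort -u
}

# Demo using the two fault lines and the start of the health line from the output above.
fault_peers <<'EOF'
2014-09-19 11:09:19.280505 7fdd5c7c6700 0 -- :/1014736 >> AA.BB.CC.DD:6789/0 pipe(0x7fdd58022120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fdd58022390).fault
2014-09-19 11:09:25.280470 7fdd5c5c4700 0 -- AA.BB.CC.DD:0/1014736 >> EE.FF.GG.HH:6789/0 pipe(0x7fdd4c001d20 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fdd4c001f90).fault
HEALTH_WARN 10 pgs backfilling; 451 pgs peering
EOF
```

On a live system the same function could be fed with `ceph health 2>&1 | fault_peers`; redirecting stderr is a precaution, since the stream these messages are written to is not stated here.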

Environment

  • Inktank Ceph Enterprise 1.2

  • Red Hat Ceph Enterprise 1.2.3
