Why do ceph commands take time to complete when one or more monitors are down?
Issue
- On busy clusters, when one or more monitors go down or are otherwise unreachable, all ceph commands on the cluster take noticeably longer to complete.
- Timing 'ceph health' shows output like the following:
# time ceph health
2014-09-19 11:09:19.280505 7fdd5c7c6700 0 -- :/1014736 >> AA.BB.CC.DD:6789/0 pipe(0x7fdd58022120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fdd58022390).fault
2014-09-19 11:09:25.280470 7fdd5c5c4700 0 -- AA.BB.CC.DD:0/1014736 >> EE.FF.GG.HH:6789/0 pipe(0x7fdd4c001d20 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fdd4c001f90).fault
HEALTH_WARN 10 pgs backfilling; 451 pgs peering; 465 pgs stuck inactive; 473 pgs stuck unclean; 41 requests are blocked > 32 sec; recovery 11272/7436241 objects degraded (0.152%); 1 mons down, quorum 0,1,3,4 boxen1,boxen2,boxen3,boxen4
real 0m9.380s
user 0m0.252s
sys 0m0.060s
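To see at a glance which monitor addresses the client failed to reach, the '.fault' lines in output like the above can be filtered. The following is a minimal sketch, not part of the original solution: the function name faulted_mons is hypothetical, and it assumes fault lines keep the format shown above, with the peer address following '>>' and the line ending in '.fault'.

```shell
# Hypothetical helper: print the unique peer addresses found on ".fault" lines.
# Reads ceph client output on stdin. Assumes the line format shown in the
# sample above: "... >> ADDR:PORT/NONCE pipe(...).fault"
faulted_mons() {
    awk '/\.fault[[:space:]]*$/ {
        for (i = 1; i < NF; i++) {
            if ($i == ">>") {
                split($(i + 1), addr, "/")   # drop the trailing "/NONCE"
                print addr[1]
            }
        }
    }' | sort -u
}
```

It could be used as 'ceph health 2>&1 | faulted_mons'. The stderr redirect is there in case the fault messages are written to stderr rather than stdout.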
Environment
- Inktank Ceph Enterprise 1.2
- Red Hat Ceph Enterprise 1.2.3
