A degraded Ceph cluster (on Firefly) stops recovering and gets stuck with degraded PGs after an OSD goes down. Why?
Issue
-
A degraded Ceph cluster (on Firefly) stops recovering and gets stuck with degraded PGs after an OSD goes down. Why?
-
After removing a failed OSD from a three-node Ceph cluster, data rebalancing started across the remaining OSDs but then stalled, leaving the cluster stuck with degraded PGs.
-
The 'osd_pool_default_size' is set to 3 and 'osd_pool_default_min_size' to 2.
-
A 'ceph -s' shows the following:
# ceph -s
    cluster 16ce9ce1-aa5f-445f-b994-5699730f364a
     health HEALTH_WARN 326 pgs degraded; 366 pgs stuck unclean; recovery 975/83301 objects degraded (1.170%)
     monmap e1: 3 mons at {mon-01=172.28.225.72:6789/0,mon-02=172.28.225.73:6789/0,mon-03=172.28.225.74:6789/0}, election epoch 18, quorum 0,1,2 mon-01,mon-02,mon-03
     osdmap e540: 29 osds: 29 up, 29 in
      pgmap v2727568: 9408 pgs, 19 pools, 135 GB data, 27767 objects
            403 GB used, 80437 GB / 80840 GB avail
            975/83301 objects degraded (1.170%)
                9042 active+clean
                 326 active+degraded
                  40 active+remapped
  client io 20363 B/s wr, 1 op/s
-
The output above reflects the current state of the cluster; no further recovery is occurring.
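The figures in the 'ceph -s' output can be cross-checked against each other. The following is a minimal Python sketch, not part of the original report, with the numbers copied by hand from the output above. It assumes every pool uses the default replica count of 3 (the 'osd_pool_default_size' value); the variable names are illustrative only.

```python
# Figures copied from the 'ceph -s' output above.
total_pgs = 9408
pg_states = {"active+clean": 9042, "active+degraded": 326, "active+remapped": 40}
objects = 27767
degraded_copies, total_copies = 975, 83301
pool_size = 3  # osd_pool_default_size; assumes all 19 pools use the default

# Every PG is accounted for by one of the three reported states.
states_sum = sum(pg_states.values())

# "stuck unclean" covers every PG that is not active+clean.
unclean = total_pgs - pg_states["active+clean"]

# The denominator of the degraded ratio counts object copies, not objects.
expected_copies = objects * pool_size

degraded_pct = f"{100 * degraded_copies / total_copies:.3f}"

print(states_sum, unclean, expected_copies, degraded_pct)
```

Under that assumption the numbers agree: the 366 stuck unclean PGs are the 326 degraded plus the 40 remapped PGs, and 83301 is 27767 objects times 3 copies.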
-
A 'ceph osd tree' shows:
# ceph osd tree
# id    weight  type name               up/down reweight
-1      81.6    root default
-2      27.2        host node-c01
0       2.72            osd.0           DNE
1       2.72            osd.1           up      1
2       2.72            osd.2           up      1
3       2.72            osd.3           up      1
4       2.72            osd.4           up      1
5       2.72            osd.5           up      1
6       2.72            osd.6           up      1
7       2.72            osd.7           up      1
8       2.72            osd.8           up      1
9       2.72            osd.9           up      1
-3      27.2        host node-02
10      2.72            osd.10          up      1
11      2.72            osd.11          up      1
12      2.72            osd.12          up      1
13      2.72            osd.13          up      1
14      2.72            osd.14          up      1
15      2.72            osd.15          up      1
16      2.72            osd.16          up      1
17      2.72            osd.17          up      1
18      2.72            osd.18          up      1
19      2.72            osd.19          up      1
-4      27.2        host node3-03
20      2.72            osd.20          up      1
21      2.72            osd.21          up      1
22      2.72            osd.22          up      1
23      2.72            osd.23          up      1
24      2.72            osd.24          up      1
25      2.72            osd.25          up      1
26      2.72            osd.26          up      1
27      2.72            osd.27          up      1
28      2.72            osd.28          up      1
29      2.72            osd.29          up      1
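The weights in the 'ceph osd tree' listing can be tallied per host. The following is a minimal Python sketch, not part of the original report, with the values transcribed by hand from the listing above. It compares each host's listed weight with the weight carried by its OSDs that are actually up; the data structure and names are illustrative only.

```python
# Values transcribed from the 'ceph osd tree' output above.
OSD_WEIGHT = 2.72
hosts = {
    "node-c01": {"weight": 27.2, "osds": range(0, 10)},
    "node-02": {"weight": 27.2, "osds": range(10, 20)},
    "node3-03": {"weight": 27.2, "osds": range(20, 30)},
}
dne = {0}  # osd.0 is listed as DNE (does not exist)

results = {}
up_count = 0
for name, host in hosts.items():
    # Weight of all OSD entries listed under the host, including DNE ones.
    listed = round(len(host["osds"]) * OSD_WEIGHT, 2)
    # Weight of only the OSDs that are up.
    up_ids = [i for i in host["osds"] if i not in dne]
    up = round(len(up_ids) * OSD_WEIGHT, 2)
    up_count += len(up_ids)
    results[name] = (listed, up)
    print(name, host["weight"], listed, up)

root_weight = round(sum(h["weight"] for h in hosts.values()), 2)
print("root", root_weight, "up OSDs", up_count)
```

In this listing the 29 up OSDs match the 'osdmap' line, while the weight of 27.2 shown for host node-c01 still includes the 2.72 assigned to osd.0; its nine up OSDs account for 24.48.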
Environment
-
Red Hat Ceph Enterprise 1.2.3
-
Inktank Ceph Enterprise 1.2