Ceph - After adding new OSDs to a Ceph cluster, it fails to reach a HEALTH_OK state
Issue
- New OSDs were added to an existing Ceph cluster, and several placement groups (PGs) failed to rebalance and recover. As a result, the cluster reports HEALTH_WARN with four PGs stuck unclean: two in active+undersized+degraded and two in active+remapped.
cluster xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
health HEALTH_WARN
2 pgs degraded
2 pgs stuck degraded
4 pgs stuck unclean
2 pgs stuck undersized
2 pgs undersized
recovery 35/472424 objects degraded (0.007%)
recovery 13/472424 objects misplaced (0.003%)
monmap e3: 3 mons at {mon1=10.1.1.1:6789/0,mon2=10.1.1.2:6789/0,mon3=10.1.1.3:6789/0}
election epoch 26, quorum 0,1,2 mon1,mon2,mon3
osdmap e6577: 214 osds: 214 up, 214 in; 2 remapped pgs
pgmap v8141005: 27712 pgs, 17 pools, 707 GB data, 155 kobjects
2252 GB used, 177 TB / 179 TB avail
35/472424 objects degraded (0.007%)
13/472424 objects misplaced (0.003%)
27708 active+clean
2 active+undersized+degraded
2 active+remapped
client io 6025 B/s rd, 396 kB/s wr, 114 op/s
HEALTH_WARN 2 pgs degraded; 2 pgs stuck degraded; 4 pgs stuck unclean; 2 pgs stuck undersized; 2 pgs undersized; recovery 35/472424 objects degraded (0.007%); recovery 13/472424 objects misplaced (0.003%)
pg 3.2fb is stuck unclean for 1450.234185, current state active+undersized+degraded, last acting [209,40]
pg 1.1a40 is stuck unclean for 9917.354884, current state active+remapped, last acting [152,9,35]
pg 1.18de is stuck unclean for 1454.534147, current state active+remapped, last acting [124,184,52]
pg 2.150 is stuck unclean for 1453.461673, current state active+undersized+degraded, last acting [183,127]
pg 3.2fb is stuck undersized for 667.477688, current state active+undersized+degraded, last acting [209,40]
pg 2.150 is stuck undersized for 1453.436227, current state active+undersized+degraded, last acting [183,127]
pg 3.2fb is stuck degraded for 667.478426, current state active+undersized+degraded, last acting [209,40]
pg 2.150 is stuck degraded for 1453.436964, current state active+undersized+degraded, last acting [183,127]
pg 3.2fb is active+undersized+degraded, acting [209,40]
pg 2.150 is active+undersized+degraded, acting [183,127]
recovery 35/472424 objects degraded (0.007%)
recovery 13/472424 objects misplaced (0.003%)
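The "stuck" lines in the health detail output above repeat each PG once per stuck condition, which makes it hard to see at a glance which PGs are affected. The following is a minimal sketch, not part of the original solution, that groups those lines by PG. The function name `summarize_stuck_pgs` and the `SAMPLE` constant are hypothetical, and the regular expression assumes the exact line format shown in this article; other Ceph releases may word these lines differently.

```python
import re
from collections import defaultdict

# Assumed line format (taken from the health detail output in this article):
# pg 3.2fb is stuck unclean for 1450.234185, current state active+undersized+degraded, last acting [209,40]
STUCK_RE = re.compile(
    r"^pg (?P<pgid>\S+) is stuck (?P<kind>\w+) for (?P<secs>[\d.]+), "
    r"current state (?P<state>[^,\s]+), last acting \[(?P<acting>[\d,]*)\]$"
)


def summarize_stuck_pgs(text):
    """Group 'pg ... is stuck ...' lines by PG id (hypothetical helper)."""
    pgs = defaultdict(lambda: {"state": None, "acting": [], "stuck": {}})
    for line in text.splitlines():
        match = STUCK_RE.match(line.strip())
        if match is None:
            continue  # skip summary lines and anything else that is not a stuck-PG line
        entry = pgs[match["pgid"]]
        entry["state"] = match["state"]
        entry["acting"] = [int(osd) for osd in match["acting"].split(",") if osd]
        # Record how long (in seconds) the PG has been stuck in each condition.
        entry["stuck"][match["kind"]] = float(match["secs"])
    return dict(pgs)


# The eight stuck-PG lines copied from the output above.
SAMPLE = """\
pg 3.2fb is stuck unclean for 1450.234185, current state active+undersized+degraded, last acting [209,40]
pg 1.1a40 is stuck unclean for 9917.354884, current state active+remapped, last acting [152,9,35]
pg 1.18de is stuck unclean for 1454.534147, current state active+remapped, last acting [124,184,52]
pg 2.150 is stuck unclean for 1453.461673, current state active+undersized+degraded, last acting [183,127]
pg 3.2fb is stuck undersized for 667.477688, current state active+undersized+degraded, last acting [209,40]
pg 2.150 is stuck undersized for 1453.436227, current state active+undersized+degraded, last acting [183,127]
pg 3.2fb is stuck degraded for 667.478426, current state active+undersized+degraded, last acting [209,40]
pg 2.150 is stuck degraded for 1453.436964, current state active+undersized+degraded, last acting [183,127]
"""

if __name__ == "__main__":
    for pgid, info in sorted(summarize_stuck_pgs(SAMPLE).items()):
        conditions = ", ".join(sorted(info["stuck"]))
        print(f"{pgid}: {info['state']} acting={info['acting']} stuck={conditions}")
```

Grouped this way, the two undersized PGs (3.2fb and 2.150) each list only two OSDs in their acting set, while the two remapped PGs list three. That pattern suggests, but does not confirm, a replica size of 3 for these pools.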
Environment
- Red Hat Enterprise Linux 7
- Red Hat Ceph Storage 1.3.x
