How to handle Inconsistent Placement Groups in Ceph


Environment

  • Red Hat Ceph Storage (RHCS)

Issue

  • ceph status or ceph -s reports inconsistent placement groups (PGs)

Resolution

NOTE: Ceph offers the ability to repair inconsistent PGs with the ceph pg repair command. Before running it, it is important to understand exactly why the PGs are inconsistent, because some cases are dangerous to repair with this tool. The following are examples of errors that should not be repaired with ceph pg repair:

<pg.id> shard <osd>: soid <object> digest <digest> != known digest <digest>
<pg.id> shard <osd>: soid <object> omap_digest <digest> != known omap_digest <digest>

By contrast, examples of errors that are safe to repair include:

<pg.id> shard <osd>: soid <object> missing attr _, missing attr <attr type>
<pg.id> shard <osd>: soid <object> digest 0 != known digest <digest>, size 0 != known size <size>
<pg.id> shard <osd>: soid <object> size 0 != known size <size>
<pg.id> deep-scrub stat mismatch, got <mismatch>
<pg.id> shard <osd>: soid <object> candidate had a read error, digest 0 != known digest <digest>
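
If the deployed Ceph release supports it (the rados list-inconsistent-* subcommands were added in later releases), the exact errors recorded by the last deep scrub can be listed per object, which helps decide whether a repair is safe. A minimal sketch, using the same <pg.id> placeholder as above and <pool> as a placeholder pool name:

# List the PGs in a pool that currently have inconsistencies
$ rados list-inconsistent-pg <pool>
# Show the per-object errors recorded by the most recent deep-scrub of a PG
$ rados list-inconsistent-obj <pg.id> --format=json-pretty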

To make sure the disks themselves are healthy, SMART (Self-Monitoring, Analysis, and Reporting Technology) scans of the devices will normally be required to check whether the disks are reporting any bad sectors.
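
As a sketch, assuming the smartmontools package is installed and that /dev/sdX stands in for the device backing the affected OSD:

# Quick overall health verdict for the drive
$ smartctl -H /dev/sdX
# Full SMART attributes, including reallocated and pending sector counts
$ smartctl -a /dev/sdX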

Steps to repair inconsistent PGs:

1. Trigger a deep-scrub on the placement group:

$ ceph pg deep-scrub <pg.id>
instructing pg 11.eeef on osd.106 to deep-scrub

2. Watch the ceph log for the result of the scrub:

$ ceph -w | grep <pg.id>
2015-02-26 01:35:36.778215 osd.106 [ERR] 11.eeef deep-scrub stat mismatch, got 636/635 objects, 0/0 clones, 0/0 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, 1855455/1854371 bytes.
2015-02-26 01:35:36.788334 osd.106 [ERR] 11.eeef deep-scrub 1 errors

3. If the errors reported are among those that are safe to repair, issue the repair and watch the log to confirm the fix:

$ ceph pg repair 11.eeef
instructing pg 11.eeef on osd.106 to repair
$ ceph -w | grep 11.eeef
2015-02-26 01:49:28.164677 osd.106 [ERR] 11.eeef repair stat mismatch, got 636/635 objects, 0/0 clones, 0/0 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, 1855455/1854371 bytes.
2015-02-26 01:49:28.164957 osd.106 [ERR] 11.eeef repair 1 errors, 1 fixed
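
Once the repair reports its errors as fixed, it is worth confirming that the PG has returned to active+clean before moving on. A minimal check, reusing the example PG ID from above:

$ ceph health detail | grep 11.eeef
$ ceph pg 11.eeef query | grep '"state"'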

Root Cause

The causes of inconsistent PGs vary widely. For a more detailed analysis of what caused a specific occurrence, please open a support case with the Ceph team.

Diagnostic Steps

Run ceph status and/or ceph health detail and look for the PGs reporting an inconsistent state:

$ ceph health detail
...
pg 11.eeef is active+clean+inconsistent, acting [106,427,854]
pg 5.ee92 is active+clean+inconsistent, acting [247,183,125]
...
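
When several PGs are affected, the inconsistent PG IDs can be extracted from the health output for scripting. This one-liner is only a sketch based on the output format shown above; with the example output it would print 11.eeef and 5.ee92:

$ ceph health detail | grep 'active+clean+inconsistent' | awk '{print $2}'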
