Ceph: HEALTH_WARN: "BlueFS spillover detected on xxx OSD(s)" after upgrade to RHCS 4

Issue

After upgrading to RHCS 4.x, the Ceph cluster reports a HEALTH_WARN similar to the following:

osd.xxx spilled over 105 MiB metadata from 'db' device (24 GiB used of 250 GiB) to slow device
osd.yyy spilled over 15  GiB metadata from 'db' device (18 GiB used of 250 GiB) to slow device

Example:

# ceph -s
  cluster:
    id:     de495xxx-Redacted-Cluster-ID-xxx0580fe081
    health: HEALTH_WARN
            BlueFS spillover detected on 314 OSD(s)     <---- This
            1946 large omap objects
            2 pools have too many placement groups

  services:
    mon: 3 daemons, quorum mon04,mon05,mon06 (age 4h)
    mgr: mon06(active, since 4h), standbys: mon05, mon04
    osd: 319 osds: 319 up (since 56m), 319 in
    rgw: 3 daemons active (az1, az2, az3)

  data:
    pools:   14 pools, 2656 pgs
    objects: 730.47M objects, 384 TiB
    usage:   865 TiB used, 954 TiB / 1.8 PiB avail
    pgs:     2656 active+clean


# ceph health detail 
HEALTH_WARN BlueFS spillover detected on 314 OSD(s); 1946 large omap objects; 2 pools have too many placement groups
BLUEFS_SPILLOVER BlueFS spillover detected on 314 OSD(s)     <---- This
     osd.0 spilled over 105 MiB metadata from 'db' device (24 GiB used of 250 GiB) to slow device
     osd.1 spilled over 15 GiB metadata from 'db' device (18 GiB used of 250 GiB) to slow device
     osd.2 spilled over 13 GiB metadata from 'db' device (23 GiB used of 250 GiB) to slow device
...
{Not all shown; there are 314 of these entries in this example}
...
     osd.317 spilled over 6.5 GiB metadata from 'db' device (19 GiB used of 250 GiB) to slow device
     osd.318 spilled over 7.3 GiB metadata from 'db' device (23 GiB used of 250 GiB) to slow device
     osd.319 spilled over 8.8 GiB metadata from 'db' device (23 GiB used of 250 GiB) to slow device

LARGE_OMAP_OBJECTS 1946 large omap objects
    3 large objects found in pool '.usage'
    1943 large objects found in pool '.rgw.buckets.index'
    Search the cluster log for 'Large omap object found' for more details.

POOL_TOO_MANY_PGS 2 pools have too many placement groups
    Pool .rgw.buckets.index has 128 placement groups, should have 32
    Pool .rgw.buckets.extra has 128 placement groups, should have 32
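
The spillover is reported even though the 'db' devices are far from full (for example, 24 GiB used of 250 GiB). To see how much BlueFS metadata a single OSD from the list has placed on its 'db' and slow devices, the OSD's perf counters can be queried through its admin socket. A minimal sketch, assuming osd.0 from the output above and that the command is run on the node hosting that OSD (if the filtered form is not accepted, the full "perf dump" output contains the same "bluefs" section):

# ceph daemon osd.0 perf dump bluefs

In the resulting JSON, compare db_used_bytes with db_total_bytes and check slow_used_bytes (counter names can vary slightly between releases); a non-zero slow_used_bytes corresponds to the "spilled over ... to slow device" message shown by ceph health detail.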

Environment

Red Hat Ceph Storage (RHCS) 3.2
Red Hat Ceph Storage (RHCS) 3.3
Red Hat Ceph Storage (RHCS) 4.x
