Ceph: Global Recovery Event running despite Health OK

Solution Verified

Issue

  • Global Recovery Event running despite Health OK
  • Global Recovery Event never completes
  • Global Recovery Event running endlessly
  • In RHCS 4.x, the Progress Module can negatively affect the Ceph MON services and, in some cases, impact production.

Example: the cluster reports HEALTH_OK and all PGs are active+clean, yet a Global Recovery Event is still shown as running:

$ ceph -s
  cluster:
    id:     bxxxyyy2-bxxd-4yy3-8xxb-fyyyxxxyyye6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,c,d (age 2w)
    mgr: a(active, since 2w)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 2h), 3 in (since 2w)
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   11 pools, 177 pgs
    objects: 822.36k objects, 360 GiB
    usage:   1.1 TiB used, 4.9 TiB / 6 TiB avail
    pgs:     177 active+clean

  io:
    client:   1.7 KiB/s rd, 626 KiB/s wr, 2 op/s rd, 9 op/s wr

  progress:
    Global Recovery Event (18h)
      [=========================...] (remaining: 111m)
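The symptom above can be checked for mechanically. The following is a minimal sketch, assuming the plain-text layout of `ceph -s` shown in this example; the function name `has_stale_recovery_event` and the `SAMPLE` text are illustrative only and are not part of Ceph. It reports a match only when the health is HEALTH_OK, every PG is active+clean, and a Global Recovery Event still appears in the output.

```python
import re

# Abbreviated copy of the `ceph -s` output from this article (illustrative input).
SAMPLE = """\
  cluster:
    health: HEALTH_OK

  data:
    pools:   11 pools, 177 pgs
    pgs:     177 active+clean

  progress:
    Global Recovery Event (18h)
      [=========================...] (remaining: 111m)
"""


def has_stale_recovery_event(status_text: str) -> bool:
    """Return True if the text shows a healthy, fully clean cluster that
    still lists a Global Recovery Event (hypothetical helper)."""
    # The health line must read exactly HEALTH_OK.
    health_ok = re.search(
        r"^\s*health:\s*HEALTH_OK\s*$", status_text, re.MULTILINE
    ) is not None

    # Total PG count from the "pools:" line, clean PG count from the "pgs:" line.
    total = re.search(r"pools:\s+\d+ pools, (\d+) pgs", status_text)
    clean = re.search(r"pgs:\s+(\d+) active\+clean\s*$", status_text, re.MULTILINE)
    all_clean = bool(total and clean and total.group(1) == clean.group(1))

    # The progress section still advertises a recovery event.
    event_listed = "Global Recovery Event" in status_text

    return health_ok and all_clean and event_listed


if __name__ == "__main__":
    print(has_stale_recovery_event(SAMPLE))
```

In practice the text could be captured from a real cluster (for example with `subprocess.run(["ceph", "-s"], capture_output=True, text=True).stdout`) and passed to the same function. The text layout can differ between releases, so the patterns may need adjusting.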

Environment

Red Hat Ceph Storage (RHCS) 4.2.x
Red Hat Ceph Storage (RHCS) 4.3.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat OpenShift Container Storage (OCS) 4.x
Red Hat OpenShift Container Platform (OCP) 4.x
