OCS/ODF HEALTH_ERR 1 filesystem is degraded, 1 filesystem is offline, 1 mds daemon damaged - Monitors have assigned me to become a standby
Issue
OCP applications cannot access (read or write) any PV backed by CephFS.
Ceph status shows no active MDS daemon:
$ more ceph_status
  cluster:
    id:     xxxxe547-xxxx-4b75-976b-xxxxxxxxx
    health: HEALTH_ERR
            1 filesystem is degraded
            1 filesystem is offline
            1 mds daemon damaged

  services:
    mon: 3 daemons, quorum e,m,o (age 65m)
    mgr: a(active, since 83m)
    mds: 0/1 daemons up, 2 standby   <<-----
    osd: 6 osds: 6 up (since 63m), 6 in (since 2d)

  data:
    volumes: 0/1 healthy, 1 recovering; 1 damaged
    pools:   11 pools, 369 pgs
    objects: 27.75k objects, 18 GiB
    usage:   52 GiB used, 24 TiB / 24 TiB avail
    pgs:     369 active+clean
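When triaging a capture like the one above, it can help to filter the saved `ceph status` output down to the lines that indicate the failure. A minimal sketch, with the sample text inlined from the output above so the script is self-contained (the `/tmp` path is just an example):

```shell
# Save the relevant excerpt of the captured `ceph status` output
# (inlined here for illustration; normally this file already exists).
cat > /tmp/ceph_status.txt <<'EOF'
  health: HEALTH_ERR
          1 filesystem is degraded
          1 filesystem is offline
          1 mds daemon damaged
  mds: 0/1 daemons up, 2 standby
EOF

# Print only the lines relevant to the MDS/filesystem failure.
grep -E 'filesystem|mds' /tmp/ceph_status.txt
```

All four printed lines should appear; in a healthy cluster the same filter would instead show an active MDS and no filesystem errors.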
$ ceph health detail
HEALTH_ERR 1 filesystem is degraded; 1 filesystem is offline; 1 mds daemon damaged
[WRN] FS_DEGRADED: 1 filesystem is degraded
fs ocs-storagecluster-cephfilesystem is degraded
[ERR] MDS_ALL_DOWN: 1 filesystem is offline
fs ocs-storagecluster-cephfilesystem is offline because no MDS is active for it.
[ERR] MDS_DAMAGE: 1 mds daemon damaged
fs ocs-storagecluster-cephfilesystem mds.0 is damaged
$ ceph mds stat
ocs-storagecluster-cephfilesystem:0/1 2 up:standby, 1 damaged
# ceph fs dump
[mds.ocs-storagecluster-cephfilesystem-b{-1:70220731} state up:standby seq 1 addr [v2:10.131.56.12:6800/1431004327,v1:10.131.56.12:6801/1431004327]]
[mds.ocs-storagecluster-cephfilesystem-a{-1:70254582} state up:standby seq 2 addr [v2:10.131.54.12:6800/1914593486,v1:10.131.54.12:6801/1914593486]]
All OCS-relevant pods are up and running, including the MDS pods:
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-7f9d485fcpwbf 2/2 Running 0 1h25m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-589b69b4x6q6c 2/2 Running 0 1h8m
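To re-collect the same diagnostics on an ODF cluster, the usual route is the rook-ceph toolbox pod. A hedged sketch that only prints the `oc` invocations rather than executing them, so it is safe to run anywhere; the `openshift-storage` namespace and the `rook-ceph-tools` deployment name are the ODF defaults and may differ on your cluster:

```shell
# Build the oc commands used to gather the diagnostics shown above via
# the rook-ceph toolbox (assumes the toolbox is enabled and deployed as
# "rook-ceph-tools" in "openshift-storage" -- ODF defaults).
NS=openshift-storage
CMDS=$(for cmd in "ceph status" "ceph health detail" "ceph mds stat" "ceph fs dump"; do
  echo "oc -n $NS exec deploy/rook-ceph-tools -- $cmd"
done)
echo "$CMDS"

# The MDS pods themselves:
echo "oc -n $NS get pods -l app=rook-ceph-mds"
```

Run the printed commands directly (or pipe through `sh`) once you have verified the namespace and toolbox name match your deployment.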
Environment
OCS/ODF 4.x