Chapter 4. Troubleshooting Ceph Monitors

This chapter contains information on how to fix the most common errors related to the Ceph Monitors.

4.1. Prerequisites

  • Verify the network connection.

4.2. Most common Ceph Monitor errors

The following tables list the most common error messages that are returned by the ceph health detail command, or included in the Ceph logs. The tables provide links to corresponding sections that explain the errors and point to specific procedures to fix the problems.

4.2.1. Prerequisites

  • A running Red Hat Ceph Storage cluster.

4.2.2. Ceph Monitor error messages

A table of common Ceph Monitor error messages, and a potential fix.

Error message                      See

HEALTH_WARN

mon.X is down (out of quorum)      Ceph Monitor is out of quorum
clock skew                         Clock skew
store is getting too big!          The Ceph Monitor store is getting too big

4.2.3. Common Ceph Monitor error messages in the Ceph logs

A table of common Ceph Monitor error messages found in the Ceph logs, and a link to a potential fix.

Error message                           Log file           See

clock skew                              Main cluster log   Clock skew
clocks not synchronized                 Main cluster log   Clock skew
Corruption: error in middle of record   Monitor log        Ceph Monitor is out of quorum; Recovering the Ceph Monitor store
Corruption: 1 missing files             Monitor log        Ceph Monitor is out of quorum; Recovering the Ceph Monitor store
Caught signal (Bus error)               Monitor log        Ceph Monitor is out of quorum

4.2.4. Ceph Monitor is out of quorum

One or more Ceph Monitors are marked as down but the other Ceph Monitors are still able to form a quorum. In addition, the ceph health detail command returns an error message similar to the following one:

HEALTH_WARN 1 mons down, quorum 1,2 mon.b,mon.c
mon.a (rank 0) addr 127.0.0.1:6789/0 is down (out of quorum)

What This Means

Ceph can mark a Ceph Monitor as down for several reasons.

If the ceph-mon daemon is not running, it might have a corrupted store, or some other error might be preventing the daemon from starting. The /var/ partition might also be full. In that case, ceph-mon cannot perform any operations on the store, located by default at /var/lib/ceph/mon-SHORT_HOST_NAME/store.db, and terminates.

If the ceph-mon daemon is running but the Ceph Monitor is out of quorum and marked as down, the cause of the problem depends on the Ceph Monitor state:

  • If the Ceph Monitor is in the probing state longer than expected, it cannot find the other Ceph Monitors. This problem can be caused by networking issues, or the Ceph Monitor might have an outdated Ceph Monitor map (monmap) and be trying to reach the other Ceph Monitors on incorrect IP addresses. Alternatively, if the monmap is up-to-date, the Ceph Monitor's clock might not be synchronized.
  • If the Ceph Monitor is in the electing state longer than expected, the Ceph Monitor's clock might not be synchronized.
  • If the Ceph Monitor changes its state from synchronizing to electing and back, the cluster state is advancing. This means that the cluster is generating new maps faster than the synchronization process can handle.
  • If the Ceph Monitor marks itself as the leader or a peon, it believes that it is in a quorum, while the rest of the cluster is sure that it is not. This problem can be caused by failed clock synchronization.

To Troubleshoot This Problem

  1. Verify that the ceph-mon daemon is running. If not, start it:

    Syntax

    systemctl status ceph-FSID@DAEMON_NAME
    systemctl start ceph-FSID@DAEMON_NAME

    Example

    [root@mon ~]# systemctl status ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service
    [root@mon ~]# systemctl start ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service

  2. If you are not able to start ceph-mon, follow the steps in The ceph-mon daemon cannot start.
  3. If you are able to start the ceph-mon daemon but it is marked as down, follow the steps in The ceph-mon daemon is running, but still marked as down.
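
The Syntax and Example lines above use systemd unit names of the form ceph-FSID@DAEMON_NAME.service. As a minimal sketch, the following shows how such a name can be assembled from its two parts. The mon_unit_name function is a hypothetical helper and not part of Ceph, and the block only prints the systemctl commands rather than running them.

```shell
#!/usr/bin/env bash
# mon_unit_name is a hypothetical helper, not part of Ceph.
# It assembles the systemd unit name from the cluster FSID and the daemon name.
mon_unit_name() {
    local fsid="$1" daemon="$2"
    printf 'ceph-%s@%s.service\n' "$fsid" "$daemon"
}

# Print (rather than run) the status and start commands for one Monitor.
unit="$(mon_unit_name b404c440-9e4c-11ec-a28a-001a4a0001df mon.host01)"
echo "systemctl status $unit"
echo "systemctl start $unit"
```

Printing the commands first makes it easier to confirm the FSID and daemon name before running anything on the Monitor host.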

The ceph-mon Daemon Cannot Start

  1. Check the corresponding Ceph Monitor log located at /var/log/ceph/CLUSTER_FSID/ceph-mon.HOST_NAME.log by default.

    Note

    By default, the monitor logs are not present in the log folder. You need to enable logging to files for the logs to appear in the folder. To enable logging to files, see Ceph daemon logs.

  2. If the log contains error messages similar to the following ones, the Ceph Monitor might have a corrupted store.

    Corruption: error in middle of record
    Corruption: 1 missing files; example: /var/lib/ceph/mon/mon.0/store.db/1234567.ldb

    To fix this problem, replace the Ceph Monitor. See Replacing a failed monitor.

  3. If the log contains an error message similar to the following one, the /var/ partition might be full. Delete any unnecessary data from /var/.

    Caught signal (Bus error)

    Important

    Do not delete any data from the Monitor directory manually. Instead, use the ceph-monstore-tool to compact it. See Compacting the Ceph Monitor store for details.

  4. If you see any other error messages, open a support ticket. See Contacting Red Hat Support for service for details.

The ceph-mon Daemon Is Running, but Still Marked as down

  1. From the Ceph Monitor host that is out of the quorum, use the mon_status command to check its state:

    Syntax

    ceph daemon ID mon_status

    Replace ID with the ID of the Ceph Monitor.

    Example

    [ceph: root@host01 /]# ceph daemon mon.host01 mon_status

  2. If the status is probing, verify the locations of the other Ceph Monitors in the mon_status output.

    1. If the addresses are incorrect, the Ceph Monitor has an incorrect Ceph Monitor map (monmap). To fix this problem, see Injecting a Ceph Monitor map.
    2. If the addresses are correct, verify that the Ceph Monitor clocks are synchronized. See Clock skew for details. In addition, troubleshoot any networking issues. See Troubleshooting networking issues for details.
  3. If the status is electing, verify that the Ceph Monitor clocks are synchronized. See Clock skew for details.
  4. If the status changes back and forth between synchronizing and electing, open a support ticket. See Contacting Red Hat Support for service for details.
  5. If the Ceph Monitor is the leader or a peon, verify that the Ceph Monitor clocks are synchronized. See Clock skew for details. Open a support ticket if synchronizing the clocks does not solve the problem. See Contacting Red Hat Support for service for details.
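
The decision steps above can be sketched as a small lookup from the state reported by mon_status to the next action. The next_step_for_state function is a hypothetical helper written for illustration only. It does not query the cluster, and the action strings merely paraphrase the steps in this section.

```shell
#!/usr/bin/env bash
# next_step_for_state is a hypothetical helper, not a Ceph command.
# It maps a Monitor state (as reported by mon_status) to the
# troubleshooting step described in this section.
next_step_for_state() {
    case "$1" in
        probing)
            echo "check monmap addresses, then clocks and network" ;;
        electing)
            echo "check clock synchronization" ;;
        synchronizing)
            echo "open a support ticket if the state keeps returning to electing" ;;
        leader|peon)
            echo "check clock synchronization, then open a support ticket" ;;
        *)
            echo "unknown state: $1"
            return 1 ;;
    esac
}

next_step_for_state probing
```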

4.2.5. Clock skew

A Ceph Monitor is out of quorum, and the ceph health detail command output contains error messages similar to these:

mon.a (rank 0) addr 127.0.0.1:6789/0 is down (out of quorum)
mon.a addr 127.0.0.1:6789/0 clock skew 0.08235s > max 0.05s (latency 0.0045s)

In addition, Ceph logs contain error messages similar to these:

2022-05-04 07:28:32.035795 7f806062e700 0 log [WRN] : mon.a 127.0.0.1:6789/0 clock skew 0.14s > max 0.05s
2022-05-04 04:31:25.773235 7f4997663700 0 log [WRN] : message from mon.1 was stamped 0.186257s in the future, clocks not synchronized

What This Means

The clock skew error message indicates that Ceph Monitors' clocks are not synchronized. Clock synchronization is important because Ceph Monitors depend on time precision and behave unpredictably if their clocks are not synchronized.

The mon_clock_drift_allowed parameter determines what disparity between the clocks is tolerated. By default, this parameter is set to 0.05 seconds.
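
As a rough illustration of the comparison behind the clock skew message, the following sketch checks a measured skew against the allowed drift. The skew_exceeds function is a hypothetical helper, not part of Ceph. It assumes both values are given in seconds and falls back to the default of 0.05 when no limit is passed.

```shell
#!/usr/bin/env bash
# skew_exceeds is a hypothetical helper, not part of Ceph.
# It succeeds (exit status 0) when the skew in seconds is larger than
# the allowed drift, which defaults to 0.05 seconds.
skew_exceeds() {
    local skew="$1" allowed="${2:-0.05}"
    # awk performs the floating-point comparison that the shell cannot.
    awk -v s="$skew" -v a="$allowed" 'BEGIN { exit !(s + 0 > a + 0) }'
}

if skew_exceeds 0.08235; then
    echo "skew 0.08235s is above the allowed drift"
fi
```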

Important

Do not change the default value of mon_clock_drift_allowed without previous testing. Changing this value might affect the stability of the Ceph Monitors and the Ceph Storage Cluster in general.

Possible causes of the clock skew error include network problems or, if chrony is configured, problems with chrony Network Time Protocol (NTP) synchronization. In addition, time synchronization might not work properly on Ceph Monitors deployed on virtual machines.

To Troubleshoot This Problem

  1. Verify that your network works correctly. For details, see Troubleshooting networking issues. If you use chrony for NTP, see the Basic chrony NTP troubleshooting section for more information.
  2. If you use a remote NTP server, consider deploying your own chrony NTP server on your network. For details, see the Using the Chrony Suite to Configure NTP chapter in Configuring basic system settings for Red Hat Enterprise Linux 8.

Note

Ceph evaluates time synchronization only every five minutes, so there is a delay between fixing the problem and the clock skew messages clearing.

4.2.6. The Ceph Monitor store is getting too big

The ceph health command returns an error message similar to the following one:

mon.ceph1 store is getting too big! 48031 MB >= 15360 MB -- 62% avail

What This Means

The Ceph Monitor store is a RocksDB database that stores entries as key-value pairs. The database includes a cluster map and is located by default at /var/lib/ceph/CLUSTER_FSID/mon.HOST_NAME/store.db.

Querying a large Monitor store can take time. As a consequence, the Ceph Monitor can be delayed in responding to client queries.

In addition, if the /var/ partition is full, the Ceph Monitor cannot perform any write operations to the store and terminates. See Ceph Monitor is out of quorum for details on troubleshooting this issue.
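
The warning shown above carries both the current store size and the warning threshold. As a minimal sketch, the two numbers can be pulled out of the message text for use in a script. The parse_store_warning function is a hypothetical helper and assumes the message has exactly the wording shown in this section.

```shell
#!/usr/bin/env bash
# parse_store_warning is a hypothetical helper, not part of Ceph.
# It reads health output on stdin and prints "CURRENT_MB THRESHOLD_MB"
# for every "store is getting too big!" line it finds.
parse_store_warning() {
    sed -n 's/.*store is getting too big! \([0-9]*\) MB >= \([0-9]*\) MB.*/\1 \2/p'
}

echo 'mon.ceph1 store is getting too big! 48031 MB >= 15360 MB -- 62% avail' |
    parse_store_warning
```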

To Troubleshoot This Problem

  1. Check the size of the database:

    Syntax

    du -sch /var/lib/ceph/CLUSTER_FSID/mon.HOST_NAME/store.db/

    Replace CLUSTER_FSID with the FSID of the cluster and HOST_NAME with the short host name of the host where the ceph-mon daemon is running.

    Example

    [root@mon ~]# du -sch /var/lib/ceph/b341e254-b165-11ed-a564-ac1f6bb26e8c/mon.host01/
    109M	/var/lib/ceph/b341e254-b165-11ed-a564-ac1f6bb26e8c/mon.host01/
    109M	total

  2. Compact the Ceph Monitor store. For details, see Compacting the Ceph Monitor Store.

4.2.7. Understanding Ceph Monitor status

The mon_status command returns information about a Ceph Monitor, such as:

  • State
  • Rank
  • Elections epoch
  • Monitor map (monmap)

If Ceph Monitors are able to form a quorum, use mon_status with the ceph command-line utility.

If Ceph Monitors are not able to form a quorum, but the ceph-mon daemon is running, use the administration socket to execute mon_status.

An example output of mon_status

{
    "name": "mon.3",
    "rank": 2,
    "state": "peon",
    "election_epoch": 96,
    "quorum": [
        1,
        2
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 1,
        "fsid": "d5552d32-9d1d-436c-8db1-ab5fc2c63cd0",
        "modified": "0.000000",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "mon.1",
                "addr": "172.25.1.10:6789\/0"
            },
            {
                "rank": 1,
                "name": "mon.2",
                "addr": "172.25.1.12:6789\/0"
            },
            {
                "rank": 2,
                "name": "mon.3",
                "addr": "172.25.1.13:6789\/0"
            }
        ]
    }
}

Ceph Monitor States

Leader
During the electing phase, Ceph Monitors elect a leader. The leader is the Ceph Monitor in the quorum with the highest rank, that is, the rank with the lowest value. In the example above, the quorum contains ranks 1 and 2, so the leader is mon.2, which has rank 1.
Peon
Peons are the Ceph Monitors in the quorum that are not leaders. If the leader fails, the peon with the highest rank becomes a new leader.
Probing
A Ceph Monitor is in the probing state if it is looking for other Ceph Monitors. For example, after you start the Ceph Monitors, they are probing until they find enough Ceph Monitors specified in the Ceph Monitor map (monmap) to form a quorum.
Electing
A Ceph Monitor is in the electing state if it is in the process of electing the leader. Usually, this status changes quickly.
Synchronizing
A Ceph Monitor is in the synchronizing state if it is synchronizing with the other Ceph Monitors to join the quorum. The smaller the Ceph Monitor store is, the faster the synchronization process. Therefore, if you have a large store, synchronization takes longer.
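
When only the state or rank is needed, a single field can be read from the mon_status output with standard tools. The sketch below is not a general JSON parser. The mon_status_field function is a hypothetical helper that assumes the pretty-printed layout shown in the example output, with one key per line, and returns the first matching key.

```shell
#!/usr/bin/env bash
# mon_status_field is a hypothetical helper, not part of Ceph.
# It reads pretty-printed mon_status output on stdin and prints the
# value of the first line whose key matches $1 (for example: state, rank).
mon_status_field() {
    awk -v key="\"$1\":" '
        $1 == key {
            v = $2
            gsub(/[",]/, "", v)   # strip quotes and the trailing comma
            print v
            exit
        }'
}

# A shortened copy of the example output, used here as test input.
sample='{
    "name": "mon.3",
    "rank": 2,
    "state": "peon",
    "election_epoch": 96
}'
printf '%s\n' "$sample" | mon_status_field state
```

On a Monitor host, the same function could be fed from the mon_status command instead of the sample text.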

4.2.8. Additional Resources

4.3. Injecting a monmap

If a Ceph Monitor has an outdated or corrupted Ceph Monitor map (monmap), it cannot join a quorum because it is trying to reach the other Ceph Monitors on incorrect IP addresses.

The safest way to fix this problem is to obtain and inject the actual Ceph Monitor map from other Ceph Monitors.

Note

This action overwrites the existing Ceph Monitor map kept by the Ceph Monitor.

This procedure shows how to inject the Ceph Monitor map when the other Ceph Monitors are able to form a quorum, or when at least one Ceph Monitor has a correct Ceph Monitor map. If all Ceph Monitors have a corrupted store, and therefore also a corrupted Ceph Monitor map, see Recovering the Ceph Monitor store.

Prerequisites

  • Access to the Ceph Monitor Map.
  • Root-level access to the Ceph Monitor node.

Procedure

  1. If the remaining Ceph Monitors are able to form a quorum, get the Ceph Monitor map by using the ceph mon getmap command:

    Example

    [ceph: root@host01 /]# ceph mon getmap -o /tmp/monmap

  2. If the remaining Ceph Monitors are not able to form the quorum and you have at least one Ceph Monitor with a correct Ceph Monitor map, copy it from that Ceph Monitor:

    1. Stop the Ceph Monitor which you want to copy the Ceph Monitor map from:

      Syntax

      systemctl stop ceph-FSID@DAEMON_NAME

      Example

      [root@mon ~]# systemctl stop ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service

    2. Copy the Ceph Monitor map:

      Syntax

      ceph-mon -i ID --extract-monmap /tmp/monmap

      Replace ID with the ID of the Ceph Monitor which you want to copy the Ceph Monitor map from.

      Example

      [ceph: root@host01 /]# ceph-mon -i mon.a --extract-monmap /tmp/monmap

  3. Stop the Ceph Monitor with the corrupted or outdated Ceph Monitor map:

    Syntax

    systemctl stop ceph-FSID@DAEMON_NAME

    Example

    [root@mon ~]# systemctl stop ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service

  4. Inject the Ceph Monitor map:

    Syntax

    ceph-mon -i ID --inject-monmap /tmp/monmap

    Replace ID with the ID of the Ceph Monitor with the corrupted or outdated Ceph Monitor map:

    Example

    [root@mon ~]# ceph-mon -i mon.host01 --inject-monmap /tmp/monmap

  5. Start the Ceph Monitor:

    Syntax

    systemctl start ceph-FSID@DAEMON_NAME

    Example

    [root@mon ~]# systemctl start ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service

    If you copied the Ceph Monitor map from another Ceph Monitor, start that Ceph Monitor, too:

    Syntax

    systemctl start ceph-FSID@DAEMON_NAME

    Example

    [root@mon ~]# systemctl start ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service
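
The order of operations matters in this procedure: the Monitor must be stopped before the map is injected and started afterwards. As a minimal sketch for the case where the remaining Monitors still form a quorum, the following dry-run helper prints the commands in sequence without running them. The print_inject_plan function is hypothetical, and the map path defaults to /tmp/monmap as in the examples above.

```shell
#!/usr/bin/env bash
# print_inject_plan is a hypothetical dry-run helper, not part of Ceph.
# It only prints the commands from this procedure in the required order.
#   $1 = cluster FSID, $2 = Monitor ID (for example mon.host01),
#   $3 = path of the monmap file (optional, defaults to /tmp/monmap)
print_inject_plan() {
    local fsid="$1" mon_id="$2" map="${3:-/tmp/monmap}"
    local unit="ceph-${fsid}@${mon_id}.service"
    cat <<EOF
ceph mon getmap -o ${map}
systemctl stop ${unit}
ceph-mon -i ${mon_id} --inject-monmap ${map}
systemctl start ${unit}
EOF
}

print_inject_plan b404c440-9e4c-11ec-a28a-001a4a0001df mon.host01
```

Reviewing the printed plan before running each command by hand helps catch a wrong FSID or Monitor ID before the existing map is overwritten.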

4.4. Replacing a failed Monitor

When a Ceph Monitor has a corrupted store, you can replace the monitor in the storage cluster.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • The remaining Ceph Monitors are able to form a quorum.
  • Root-level access to Ceph Monitor node.

Procedure

  1. From the Monitor host, remove the Monitor store, located by default at /var/lib/ceph/mon/CLUSTER_NAME-SHORT_HOST_NAME:

    rm -rf /var/lib/ceph/mon/CLUSTER_NAME-SHORT_HOST_NAME

    Specify the short host name of the Monitor host and the cluster name. For example, to remove the Monitor store of a Monitor running on host1 from a cluster called remote:

    [root@mon ~]# rm -rf /var/lib/ceph/mon/remote-host1

  2. Remove the Monitor from the Monitor map (monmap):

    ceph mon remove SHORT_HOST_NAME --cluster CLUSTER_NAME

    Specify the short host name of the Monitor host and the cluster name. For example, to remove the Monitor running on host1 from a cluster called remote:

    [ceph: root@host01 /]# ceph mon remove host1 --cluster remote

  3. Troubleshoot and fix any problems related to the underlying file system or hardware of the Monitor host.

4.5. Compacting the monitor store

When the Monitor store has grown large, you can compact it:

  • Dynamically by using the ceph tell command.
  • Upon the start of the ceph-mon daemon.
  • By using the ceph-monstore-tool when the ceph-mon daemon is not running. Use this method when the previously mentioned methods fail to compact the Monitor store or when the Monitor is out of quorum and its log contains the Caught signal (Bus error) error message.

Important

Monitor store size changes when the cluster is not in the active+clean state or during the rebalancing process. For this reason, compact the Monitor store when rebalancing is completed. Also, ensure that the placement groups are in the active+clean state.
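
As a rough sketch of the check described above, the following helper decides whether every placement group is reported as active+clean. The all_pgs_active_clean function is hypothetical, and it assumes a one-line summary of the form "N pgs: M active+clean; ..." on standard input, such as the summary that ceph pg stat is expected to print. Verify the format on your cluster before relying on it.

```shell
#!/usr/bin/env bash
# all_pgs_active_clean is a hypothetical helper, not part of Ceph.
# Assumed input (one line on stdin): "N pgs: M active+clean; ..."
# It succeeds only when the total PG count equals the active+clean count.
all_pgs_active_clean() {
    awk '{
        total = $1; clean = 0
        # Each state is preceded by its count, so look one field ahead.
        for (i = 3; i < NF; i += 1)
            if ($(i + 1) ~ /^active\+clean[;,]?$/) clean = $i
        if (total > 0 && total == clean) ok = 1
    }
    END { exit !ok }'
}

if echo '33 pgs: 33 active+clean; 0 B data' | all_pgs_active_clean; then
    echo "all placement groups are active+clean"
fi
```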

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the Ceph Monitor node.

Procedure

  1. To compact the Monitor store dynamically while the ceph-mon daemon is running, use the ceph tell command:

    Syntax

    ceph tell mon.HOST_NAME compact

    Replace HOST_NAME with the short host name of the host where the ceph-mon daemon is running. Use the hostname -s command when unsure.

    Example

    [ceph: root@host01 /]# ceph tell mon.host01 compact

  2. To compact the Monitor store when the ceph-mon daemon starts:

    1. Add the following parameter to the Ceph configuration under the [mon] section:

      [mon]
      mon_compact_on_start = true

    2. Restart the ceph-mon daemon:

      Syntax

      systemctl restart ceph-FSID@DAEMON_NAME

      Example

      [root@mon ~]# systemctl restart ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service

    3. Ensure that the Monitors have formed a quorum:

      [ceph: root@host01 /]# ceph mon stat

    4. Repeat these steps on other Monitors if needed.

  3. To compact the Monitor store with the ceph-monstore-tool when the ceph-mon daemon is not running:

    Note

    Before you start, ensure that you have the ceph-test package installed.

    1. Verify that the ceph-mon daemon with the large store is not running. Stop the daemon if needed.

      Syntax

      systemctl status ceph-FSID@DAEMON_NAME
      systemctl stop ceph-FSID@DAEMON_NAME

      Example

      [root@mon ~]# systemctl status ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service
      [root@mon ~]# systemctl stop ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service

    2. Compact the Monitor store:

      Syntax

      ceph-monstore-tool /var/lib/ceph/CLUSTER_FSID/mon.HOST_NAME compact

      Replace HOST_NAME with the short host name of the Monitor host.

      Example

      [ceph: root@host01 /]# ceph-monstore-tool /var/lib/ceph/b404c440-9e4c-11ec-a28a-001a4a0001df/mon.host01 compact

    3. Start ceph-mon again:

      Syntax

      systemctl start ceph-FSID@DAEMON_NAME

      Example

      [root@mon ~]# systemctl start ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service

4.6. Opening port for Ceph manager

The ceph-mgr daemons receive placement group (PG) information from OSDs on the same range of ports as the ceph-osd daemons. If these ports are not open, the cluster degrades from HEALTH_OK to HEALTH_WARN and reports the percentage of PGs that are in the unknown state.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to Ceph Manager.

Procedure

  1. To resolve this situation, for each host running ceph-mgr daemons, open ports 6800-7300.

    Example

    [root@ceph-mgr] # firewall-cmd --add-port 6800-7300/tcp
    [root@ceph-mgr] # firewall-cmd --add-port 6800-7300/tcp --permanent

  2. Restart the ceph-mgr daemons.
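
Before restarting the daemons, it can help to confirm that the range is actually listed by the firewall. The sketch below checks a space-separated port list, such as the one printed by firewall-cmd --list-ports, for the 6800-7300/tcp entry. The port_range_open function is a hypothetical helper, and it only matches the exact entry rather than overlapping ranges.

```shell
#!/usr/bin/env bash
# port_range_open is a hypothetical helper, not part of Ceph or firewalld.
#   $1 = space-separated port list (for example from firewall-cmd --list-ports)
#   $2 = entry to look for (optional, defaults to 6800-7300/tcp)
# It succeeds when the list contains exactly that entry.
port_range_open() {
    local ports="$1" wanted="${2:-6800-7300/tcp}" p
    # Word splitting of $ports is intentional: one list entry per word.
    for p in $ports; do
        [ "$p" = "$wanted" ] && return 0
    done
    return 1
}

if port_range_open "3300/tcp 6789/tcp 6800-7300/tcp"; then
    echo "ports 6800-7300/tcp are open"
fi
```

On a Ceph Manager host, the first argument could be supplied as "$(firewall-cmd --list-ports)".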

4.7. Recovering the Ceph Monitor store

Ceph Monitors store the cluster map in a key-value store such as RocksDB. If the store is corrupted on a Monitor, the Monitor terminates unexpectedly and fails to start again. The Ceph logs might include the following errors:

Corruption: error in middle of record
Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/mon.0/store.db/1234567.ldb

Red Hat Ceph Storage clusters use at least three Ceph Monitors so that if one fails, it can be replaced with another one. However, under certain circumstances, all Ceph Monitors can have corrupted stores. For example, when the Ceph Monitor nodes have incorrectly configured disk or file system settings, a power outage can corrupt the underlying file system.

If there is corruption on all Ceph Monitors, you can recover it with information stored on the OSD nodes by using utilities called ceph-monstore-tool and ceph-objectstore-tool.

Important

These procedures cannot recover the following information:

  • Metadata Server (MDS) keyrings and maps
  • Placement Group settings:

    • full ratio set by using the ceph pg set_full_ratio command
    • nearfull ratio set by using the ceph pg set_nearfull_ratio command

Important

Never restore the Ceph Monitor store from an old backup. Rebuild the Ceph Monitor store from the current cluster state using the following steps and restore from that.

4.7.1. Recovering the Ceph Monitor store when using BlueStore

Follow this procedure if the Ceph Monitor store is corrupted on all Ceph Monitors and you use the BlueStore back end.

In containerized environments, this method requires attaching Ceph repositories and restoring to a non-containerized Ceph Monitor first.

Warning

This procedure can cause data loss. If you are unsure about any step in this procedure, contact Red Hat Technical Support for assistance with the recovery process.

Prerequisites

  • All OSD containers are stopped.
  • Ceph repositories are enabled on the Ceph nodes based on their roles.
  • The ceph-test and rsync packages are installed on the OSD and Monitor nodes.
  • The ceph-mon package is installed on the Monitor nodes.
  • The ceph-osd package is installed on the OSD nodes.

Procedure

  1. Mount all disks with Ceph data to a temporary location. Repeat this step for all OSD nodes.

    1. List the data partitions using the ceph-volume command:

      Example

      [ceph: root@host01 /]# ceph-volume lvm list

    2. Mount the data partitions to a temporary location:

      Syntax

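      for i in {OSD_ID}; do mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-$i; done

      Replace OSD_ID with a numeric, space-separated list of Ceph OSD IDs on the OSD node.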

    3. Restore the SELinux context:

      Syntax

      for i in {OSD_ID}; do restorecon /var/lib/ceph/osd/ceph-$i; done

      Replace OSD_ID with a numeric, space-separated list of Ceph OSD IDs on the OSD node.

    4. Change the owner and group to ceph:ceph:

      Syntax

      for i in {OSD_ID}; do chown -R ceph:ceph /var/lib/ceph/osd/ceph-$i; done

      Replace OSD_ID with a numeric, space-separated list of Ceph OSD IDs on the OSD node.

      Important

      Due to a bug that causes the update-mon-db command to use additional db and db.slow directories for the Monitor database, you must also copy these directories. To do so:

      1. Prepare a temporary location outside the container to mount and access the OSD database and extract the OSD maps needed to restore the Ceph Monitor:

        Syntax

        ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev OSD-DATA --path /var/lib/ceph/osd/ceph-OSD-ID

        Replace OSD-DATA with the Volume Group (VG) or Logical Volume (LV) path to the OSD data and OSD-ID with the ID of the OSD.

      2. Create a symbolic link between the BlueStore database and block.db:

        Syntax

        ln -snf BLUESTORE-DATABASE /var/lib/ceph/osd/ceph-OSD-ID/block.db

        Replace BLUESTORE-DATABASE with the Volume Group (VG) or Logical Volume (LV) path to the BlueStore database and OSD-ID with the ID of the OSD.

  2. Use the following commands from the Ceph Monitor node with the corrupted store. Repeat them for all OSDs on all nodes.

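    1. Collect the cluster map from all OSD nodes. The example assumes that the osd_nodes variable holds a space-separated list of the OSD node host names: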

      Example

      [root@host01 ~]# cd /root/
      [root@host01 ~]# ms=/tmp/monstore/
      [root@host01 ~]# db=/root/db/
      [root@host01 ~]# db_slow=/root/db.slow/
      
      [root@host01 ~]# mkdir $ms
      [root@host01 ~]# for host in $osd_nodes; do
                      echo "$host"
                      rsync -avz $ms $host:$ms
                      rsync -avz $db $host:$db
                      rsync -avz $db_slow $host:$db_slow
      
                      rm -rf $ms
                      rm -rf $db
                      rm -rf $db_slow
      
                      ssh -t $host <<EOF
                        for osd in /var/lib/ceph/osd/ceph-*; do
                          ceph-objectstore-tool --type bluestore --data-path \$osd --op update-mon-db --mon-store-path $ms
      
                         done
                      EOF
      
                            rsync -avz $host:$ms $ms
                            rsync -avz $host:$db $db
                            rsync -avz $host:$db_slow $db_slow
                      done

    2. Set the appropriate capabilities:

      Example

      [ceph: root@host01 /]# ceph-authtool /etc/ceph/ceph.client.admin.keyring -n mon. --cap mon 'allow *' --gen-key
      [ceph: root@host01 /]# cat /etc/ceph/ceph.client.admin.keyring
        [mon.]
          key = AQCleqldWqm5IhAAgZQbEzoShkZV42RiQVffnA==
          caps mon = "allow *"
        [client.admin]
          key = AQCmAKld8J05KxAArOWeRAw63gAwwZO5o75ZNQ==
          auid = 0
          caps mds = "allow *"
          caps mgr = "allow *"
          caps mon = "allow *"
          caps osd = "allow *"

    3. Move all sst files from the db and db.slow directories to the temporary location:

      Example

      [ceph: root@host01 /]# mv /root/db/*.sst /root/db.slow/*.sst /tmp/monstore/store.db

    4. Rebuild the Monitor store from the collected map:

      Example

      [ceph: root@host01 /]# ceph-monstore-tool /tmp/monstore rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring

      Note

      After using this command, only keyrings extracted from the OSDs and the keyring specified on the ceph-monstore-tool command line are present in Ceph’s authentication database. You have to recreate or import all other keyrings, such as clients, Ceph Manager, Ceph Object Gateway, and others, so those clients can access the cluster.

    5. Back up the corrupted store. Repeat this step for all Ceph Monitor nodes:

      Syntax

      mv /var/lib/ceph/mon/ceph-HOSTNAME/store.db /var/lib/ceph/mon/ceph-HOSTNAME/store.db.corrupted

      Replace HOSTNAME with the host name of the Ceph Monitor node.

    6. Replace the corrupted store. Repeat this step for all Ceph Monitor nodes:

      Syntax

      scp -r /tmp/monstore/store.db HOSTNAME:/var/lib/ceph/mon/ceph-HOSTNAME/

      Replace HOSTNAME with the host name of the Monitor node.

    7. Change the owner of the new store. Repeat this step for all Ceph Monitor nodes:

      Syntax

      chown -R ceph:ceph /var/lib/ceph/mon/ceph-HOSTNAME/store.db

      Replace HOSTNAME with the host name of the Ceph Monitor node.

  3. Unmount all the temporary mounted OSDs on all nodes:

    Example

    [root@host01 ~]# umount /var/lib/ceph/osd/ceph-*

  4. Start all the Ceph Monitor daemons:

    Syntax

    systemctl start ceph-FSID@DAEMON_NAME

    Example

    [root@mon ~]# systemctl start ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@mon.host01.service

  5. Ensure that the Monitors are able to form a quorum:

    Syntax

    ceph -s

  6. Import the Ceph Manager keyring and start all Ceph Manager processes:

    Syntax

    ceph auth import -i /etc/ceph/ceph.mgr.HOSTNAME.keyring
    systemctl start ceph-FSID@DAEMON_NAME

    Replace HOSTNAME with the host name of the Ceph Manager node.

    Example

    [root@host01 ~]# systemctl start ceph-b341e254-b165-11ed-a564-ac1f6bb26e8c@mgr.extensa003.exrqql.service

  7. Start all OSD processes across all OSD nodes. Repeat for all OSDs on the cluster:

    Syntax

    systemctl start ceph-FSID@osd.OSD_ID

    Example

    [root@host01 ~]# systemctl start ceph-b404c440-9e4c-11ec-a28a-001a4a0001df@osd.0.service

  8. Ensure that the OSDs are returning to service:

    Example

    [ceph: root@host01 /]# ceph -s

4.8. Additional Resources