Chapter 2. Understanding Process Management for Ceph

As a storage administrator, you can start, stop, and restart the Ceph daemons as needed: all at once, by daemon type, or by daemon instance.

2.1. Prerequisites

  • A running Red Hat Ceph Storage cluster.

2.2. An Overview of Process Management for Ceph

In Red Hat Ceph Storage 3, all process management is done through systemd. Each time you start, stop, or restart Ceph daemons, you must specify the daemon type or the daemon instance.

2.3. Starting, Stopping, and Restarting All the Ceph Daemons

To start, stop, or restart all the running Ceph daemons on a node, follow these procedures.

Prerequisites

  • Having root access to the node.

Procedure

  • Starting all Ceph daemons:

    [root@admin ~]# systemctl start ceph.target
  • Stopping all Ceph daemons:

    [root@admin ~]# systemctl stop ceph.target
  • Restarting all Ceph daemons:

    [root@admin ~]# systemctl restart ceph.target

2.4. Starting, Stopping, and Restarting the Ceph Daemons by Type

To start, stop, or restart all Ceph daemons of a particular type, follow these procedures on the node running the Ceph daemons.

Prerequisites

  • Having root access to the node.

Procedure

  • On Ceph Monitor nodes:

    Starting

    [root@mon ~]# systemctl start ceph-mon.target

    Stopping

    [root@mon ~]# systemctl stop ceph-mon.target

    Restarting

    [root@mon ~]# systemctl restart ceph-mon.target

  • On Ceph Manager nodes:

    Starting

    [root@mgr ~]# systemctl start ceph-mgr.target

    Stopping

    [root@mgr ~]# systemctl stop ceph-mgr.target

    Restarting

    [root@mgr ~]# systemctl restart ceph-mgr.target

  • On Ceph OSD nodes:

    Starting

    [root@osd ~]# systemctl start ceph-osd.target

    Stopping

    [root@osd ~]# systemctl stop ceph-osd.target

    Restarting

    [root@osd ~]# systemctl restart ceph-osd.target

  • On Ceph Object Gateway nodes:

    Starting

    [root@rgw ~]# systemctl start ceph-radosgw.target

    Stopping

    [root@rgw ~]# systemctl stop ceph-radosgw.target

    Restarting

    [root@rgw ~]# systemctl restart ceph-radosgw.target
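
The per-type commands above follow one pattern, which the following minimal sketch captures. The function names and the DRY_RUN variable are hypothetical and are not part of Red Hat Ceph Storage. With the default DRY_RUN=1, the script only prints the systemctl command it would run.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Ceph: maps a daemon type to its
# systemd target and applies an action to it.
# DRY_RUN is an assumption of this sketch; 1 (the default) prints the command only.
DRY_RUN="${DRY_RUN:-1}"

ceph_type_target() {
    # $1 = mon|mgr|osd|radosgw
    case "$1" in
        mon|mgr|osd|radosgw) printf 'ceph-%s.target\n' "$1" ;;
        *) echo "unknown daemon type: $1" >&2; return 1 ;;
    esac
}

ceph_by_type() {
    # $1 = start|stop|restart, $2 = daemon type
    case "$1" in
        start|stop|restart) ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
    target=$(ceph_type_target "$2") || return 1
    if [ "$DRY_RUN" = "1" ]; then
        echo "systemctl $1 $target"
    else
        systemctl "$1" "$target"
    fi
}

# With the default DRY_RUN=1 this prints: systemctl restart ceph-osd.target
ceph_by_type restart osd
```

To run the command for real, set DRY_RUN=0 and run the script as root on the node that hosts the daemons.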

2.5. Starting, Stopping, and Restarting a Ceph Daemon by Instance

To start, stop, or restart a Ceph daemon by instance, follow these procedures on the node running the Ceph daemons.

Prerequisites

  • Having root access to the node.

Procedure

  • On a Ceph Monitor node:

    Starting

    [root@mon ~]# systemctl start ceph-mon@$MONITOR_HOST_NAME

    Stopping

    [root@mon ~]# systemctl stop ceph-mon@$MONITOR_HOST_NAME

    Restarting

    [root@mon ~]# systemctl restart ceph-mon@$MONITOR_HOST_NAME

    Replace

    • $MONITOR_HOST_NAME with the name of the Ceph Monitor node.
  • On a Ceph Manager node:

    Starting

    [root@mgr ~]# systemctl start ceph-mgr@$MANAGER_HOST_NAME

    Stopping

    [root@mgr ~]# systemctl stop ceph-mgr@$MANAGER_HOST_NAME

    Restarting

    [root@mgr ~]# systemctl restart ceph-mgr@$MANAGER_HOST_NAME

    Replace

    • $MANAGER_HOST_NAME with the name of the Ceph Manager node.
  • On a Ceph OSD node:

    Starting

    [root@osd ~]# systemctl start ceph-osd@$OSD_NUMBER

    Stopping

    [root@osd ~]# systemctl stop ceph-osd@$OSD_NUMBER

    Restarting

    [root@osd ~]# systemctl restart ceph-osd@$OSD_NUMBER

    Replace

    • $OSD_NUMBER with the ID number of the Ceph OSD.

      For example, when looking at the ceph osd tree command output, osd.0 has an ID of 0.

  • On a Ceph Object Gateway node:

    Starting

    [root@rgw ~]# systemctl start ceph-radosgw@rgw.$OBJ_GATEWAY_HOST_NAME

    Stopping

    [root@rgw ~]# systemctl stop ceph-radosgw@rgw.$OBJ_GATEWAY_HOST_NAME

    Restarting

    [root@rgw ~]# systemctl restart ceph-radosgw@rgw.$OBJ_GATEWAY_HOST_NAME

    Replace

    • $OBJ_GATEWAY_HOST_NAME with the name of the Ceph Object Gateway node.
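
The instance unit names above differ only in the daemon type and the instance identifier, with the Ceph Object Gateway adding an rgw. prefix. The following sketch builds the unit name for each case. The function name and the example names node1 and gw1 are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Ceph: builds the systemd unit name
# for a single daemon instance, following the patterns in this section.
ceph_instance_unit() {
    # $1 = mon|mgr|osd|radosgw
    # $2 = node name (mon, mgr, radosgw) or numeric OSD ID (osd)
    if [ -z "$2" ]; then
        echo "missing instance name or ID" >&2
        return 1
    fi
    case "$1" in
        mon|mgr)
            printf 'ceph-%s@%s\n' "$1" "$2" ;;
        osd)
            # An OSD instance is identified by its ID: 0 for osd.0, and so on.
            case "$2" in
                *[!0-9]*) echo "OSD ID must be a number: $2" >&2; return 1 ;;
            esac
            printf 'ceph-osd@%s\n' "$2" ;;
        radosgw)
            printf 'ceph-radosgw@rgw.%s\n' "$2" ;;
        *)
            echo "unknown daemon type: $1" >&2; return 1 ;;
    esac
}

ceph_instance_unit mon node1      # prints: ceph-mon@node1
ceph_instance_unit osd 0          # prints: ceph-osd@0
ceph_instance_unit radosgw gw1    # prints: ceph-radosgw@rgw.gw1
```

The result can be passed to systemctl as root, for example: systemctl restart "$(ceph_instance_unit osd 0)".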

2.6. Powering down and rebooting a Red Hat Ceph Storage cluster

Follow the procedure below to power down and reboot the Ceph cluster.

Prerequisites

  • Having root access.

Procedure

Powering down the Red Hat Ceph Storage cluster

  1. Stop clients from using the RBD images, NFS-Ganesha Gateway, and RADOS Gateway on this cluster, and stop any other clients of the cluster.

    • On the NFS-Ganesha Gateway node:

      # systemctl stop nfs-ganesha.service
    • On the RADOS Gateway node:

      # systemctl stop ceph-radosgw.target
  2. The cluster must be in a healthy state (HEALTH_OK and all PGs active+clean) before proceeding. Run ceph status on a node with the client keyrings, for example, a Ceph Monitor or OpenStack controller node, to confirm that the cluster is healthy.
  3. If you use the Ceph File System (CephFS), the CephFS cluster must be brought down. Taking a CephFS cluster down is done by reducing the number of ranks to 1, setting the cluster_down flag, and then failing the last rank. For example:

    # ceph fs set <fs_name> max_mds 1
    # ceph mds deactivate <fs_name>:1 # rank 2 of 2
    # ceph status # wait for rank 1 to finish stopping
    # ceph fs set <fs_name> cluster_down true
    # ceph mds fail <fs_name>:0

    Setting the cluster_down flag prevents standbys from taking over the failed rank.

  4. Set the noout, norecover, norebalance, nobackfill, nodown and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

    # ceph osd set noout
    # ceph osd set norecover
    # ceph osd set norebalance
    # ceph osd set nobackfill
    # ceph osd set nodown
    # ceph osd set pause
  5. Shut down the OSD nodes one by one:

    [root@osd ~]# systemctl stop ceph-osd.target
  6. Shut down the monitor nodes one by one:

    [root@mon ~]# systemctl stop ceph-mon.target
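
The six flags in step 4 can be applied in a loop rather than typed one by one. The following is a minimal sketch. The function name and the DRY_RUN and OSD_FLAGS variables are hypothetical. With the default DRY_RUN=1, the script only prints the ceph commands it would run.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Ceph: sets or unsets the OSD flags
# used around a full cluster shutdown.
# DRY_RUN is an assumption of this sketch; 1 (the default) prints the commands only.
DRY_RUN="${DRY_RUN:-1}"
OSD_FLAGS="noout norecover norebalance nobackfill nodown pause"

ceph_osd_flags() {
    # $1 = set|unset
    case "$1" in
        set|unset) ;;
        *) echo "usage: ceph_osd_flags set|unset" >&2; return 1 ;;
    esac
    for flag in $OSD_FLAGS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ceph osd $1 $flag"
        else
            # Stop at the first failure so that a partial result is noticed.
            ceph osd "$1" "$flag" || return 1
        fi
    done
}

# With the default DRY_RUN=1 this prints the six "ceph osd set ..." commands.
ceph_osd_flags set
```

Calling ceph_osd_flags unset produces the matching commands for the reboot procedure. To run the commands for real, set DRY_RUN=0 on a node with the client keyrings.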

Rebooting the Red Hat Ceph Storage cluster

  1. Power on the monitor nodes:

    [root@mon ~]# systemctl start ceph-mon.target
  2. Power on the OSD nodes:

    [root@osd ~]# systemctl start ceph-osd.target
  3. Wait for all the nodes to come up. Verify all the services are up and the connectivity is fine between the nodes.
  4. Unset the noout, norecover, norebalance, nobackfill, nodown and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

    # ceph osd unset noout
    # ceph osd unset norecover
    # ceph osd unset norebalance
    # ceph osd unset nobackfill
    # ceph osd unset nodown
    # ceph osd unset pause
  5. If you use the Ceph File System (CephFS), the CephFS cluster must be brought back up by setting the cluster_down flag to false:

    [root@admin ~]# ceph fs set <fs_name> cluster_down false
  6. Start the RADOS Gateway and NFS-Ganesha Gateway.

    • On the RADOS Gateway node:

      # systemctl start ceph-radosgw.target
    • On the NFS-Ganesha Gateway node:

      # systemctl start nfs-ganesha.service
  7. Verify that the cluster is in a healthy state (HEALTH_OK and all PGs active+clean). Run ceph status on a node with the client keyrings, for example, a Ceph Monitor or OpenStack controller node, to confirm that the cluster is healthy.
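
The final health check can take a while after a restart, so it may help to poll rather than check once. The following sketch polls the one-line summary printed by ceph health until it starts with HEALTH_OK. The function names and the HEALTH_CMD variable are hypothetical. The summary does not list PG states, so ceph status is still the place to confirm that all PGs are active+clean.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Ceph: polls the cluster health summary.
# HEALTH_CMD is an assumption of this sketch; it allows the health command
# to be replaced, for example when trying the script without a cluster.
HEALTH_CMD="${HEALTH_CMD:-ceph health}"

health_is_ok() {
    # $1 = health summary line, for example "HEALTH_OK" or "HEALTH_WARN ..."
    case "$1" in
        HEALTH_OK*) return 0 ;;
        *) return 1 ;;
    esac
}

wait_for_health_ok() {
    # $1 = maximum number of checks, $2 = seconds to sleep between checks
    remaining="$1"
    while [ "$remaining" -gt 0 ]; do
        if health_is_ok "$($HEALTH_CMD 2>/dev/null)"; then
            return 0
        fi
        remaining=$((remaining - 1))
        if [ "$remaining" -gt 0 ]; then
            sleep "$2"
        fi
    done
    return 1
}

# Example against a live cluster, on a node with the client keyrings:
#   wait_for_health_ok 30 10 && echo "cluster is healthy"
```

The function returns a non-zero status if the cluster does not report HEALTH_OK within the given number of checks, so it can gate later steps in a script.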

2.7. Additional Resources