Chapter 7. Changing an OSD Drive

Ceph is designed for fault tolerance, which means Ceph can operate in a degraded state without losing data. For example, Ceph can operate even if a data storage drive fails. When a drive fails, the cluster enters a degraded state, and the copies of the data stored on other OSDs backfill automatically to the remaining OSDs in the cluster. However, Ceph does not repair the failed OSD itself: you must replace the failed drive and recreate the OSD manually.

When a drive fails, the OSD is initially marked down but remains in the cluster. Ceph health warnings will indicate that an OSD is down. An OSD that is marked down has not necessarily lost its drive. For example, heartbeat or other networking issues can get an OSD marked down even if it is up.
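
As a quick way to see which OSDs are currently reported as down, the output of `ceph osd tree` can be filtered. The following is a minimal sketch and not part of the original procedure. The function name `list_down_osds` and the sample text are illustrative assumptions. Because the column layout of `ceph osd tree` differs between Ceph releases, the filter matches any row that contains both an `osd.N` field and a `down` field instead of relying on fixed column positions.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph osd tree`-style text on stdin and
# print the name of every OSD whose row contains the field "down".
list_down_osds() {
    awk '{
        name = ""; down = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^osd\.[0-9]+$/) name = $i   # remember the OSD name
            if ($i == "down") down = 1            # status field
        }
        if (name != "" && down) print name
    }'
}

# Illustrative sample only; on a real cluster pipe in: ceph osd tree
sample='ID WEIGHT  TYPE NAME       UP/DOWN REWEIGHT
-1 0.02998 root default
-2 0.00999     host node1
 0 0.00999         osd.0   up      1.00000
-3 0.00999     host node2
 1 0.00999         osd.1   down    1.00000'

printf '%s\n' "$sample" | list_down_osds
```

An OSD listed here is only reported down. As noted above, confirm whether the drive has actually failed before replacing it.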

Modern servers typically deploy with hot-swappable drives so you can pull a failed drive and replace it with a new one without bringing down the node. However, with Ceph Storage you will also have to address the software-defined part of the OSD. The general procedure for replacing an OSD involves removing the OSD from your Ceph cluster, replacing the drive, and then re-creating the OSD.

  1. Check cluster health.

    # ceph health
  2. If an OSD is down, identify its location in the CRUSH hierarchy.

    # ceph osd tree | grep -i down
  3. If an OSD is down and in, log in to the OSD node and try to restart it.

    # ssh {osd-node}
    # systemctl start ceph-osd@{osd-id}

    If the command indicates that the OSD is already running, it may be a heartbeat or networking issue. If you cannot restart the OSD, the drive may have failed.

    Note

    If the OSD is down, it will eventually get marked out. This is normal behavior for Ceph Storage. When the OSD gets marked out, other OSDs with copies of the failed OSD’s data will begin backfilling to ensure that the required number of copies exist within the cluster. While the cluster is backfilling, the cluster will be in a degraded state.

  4. Check the failed OSD’s mount point.

    If you cannot restart the OSD, you should check the mount point. If the mount point no longer appears, you can try to re-mount the OSD drive and restart the OSD. For example, if the server restarted, but lost the mount point in fstab, remount the drive.

    # df -h

    If you cannot restore the mount point, you may have a failed OSD drive. Use your drive utilities to determine if the drive is healthy. For example:

    # yum install smartmontools
    # smartctl -H /dev/{drive}

    If the drive has failed, you will need to replace it.

  5. Ensure the OSD is out of the cluster.

    # ceph osd out osd.<num>
  6. Ensure the OSD process is stopped.

    # systemctl stop ceph-osd@<osd-id>
  7. Ensure the data from the failed OSD is backfilling.

    # ceph -w
  8. Remove the OSD from the CRUSH Map.

    # ceph osd crush remove osd.<num>
  9. Remove the OSD’s authentication keys.

    # ceph auth del osd.<num>
  10. Remove the OSD from the Ceph Cluster.

    # ceph osd rm osd.<num>
  11. Unmount the failed drive path.

    # umount /var/lib/ceph/{daemon}/{cluster}-{daemon-id}
  12. Replace the physical drive. Refer to the documentation for your hardware node. If the drive is hot-swappable, replace the failed drive with a new drive. If the drive is not hot-swappable and the node contains multiple OSDs, you may need to bring the node down to replace the physical drive. If you need to bring the node down temporarily, you can set the noout flag on the cluster to prevent backfilling.

    # ceph osd set noout

    Once you replace the drive and you bring the node and its OSDs back online, remove the noout setting.

    # ceph osd unset noout

    Allow the new drive to appear under /dev and make a note of the drive path before proceeding further.

  13. Find the OSD drive and format the disk.
  14. Recreate the OSD. See Adding an OSD for details.
  15. Check your CRUSH hierarchy to ensure it is accurate.

    # ceph osd tree

    If you are not satisfied with the location of the OSD in your CRUSH hierarchy, you may move it with the move command.

    # ceph osd crush move <bucket-to-move> <bucket-type>=<parent-bucket>
  16. Ensure the OSD is online.
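
Steps 5 through 10 above run in the same order for any given OSD, so they can be collected into a small helper. The following is a sketch only, under stated assumptions: the function name `osd_removal_commands` is hypothetical, and it prints the commands from this chapter for review instead of executing them.

```shell
#!/bin/sh
# Hypothetical helper: print, in order, the commands from steps 5-10
# for one OSD. Nothing is executed; the output is meant to be reviewed.
osd_removal_commands() {
    id=${1:-}
    # Accept only a non-empty, purely numeric OSD id.
    case $id in
        ''|*[!0-9]*)
            echo "usage: osd_removal_commands <numeric-osd-id>" >&2
            return 1
            ;;
    esac
    cat <<EOF
ceph osd out osd.$id
systemctl stop ceph-osd@$id
ceph -w
ceph osd crush remove osd.$id
ceph auth del osd.$id
ceph osd rm osd.$id
EOF
}

# Example: show the sequence for osd.7
osd_removal_commands 7
```

Run the printed commands one at a time, not as a batch. `systemctl stop` has to be run on the OSD node itself, and `ceph -w` keeps running until you interrupt it, so wait for backfilling to finish before continuing with the removal commands.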