Chapter 3. Administering Ceph Clusters That Run in Containers

This chapter describes basic administration tasks to perform on Ceph clusters that run in containers.

3.1. Starting, Stopping, and Restarting Ceph Daemons That Run in Containers

Use the systemctl command to start, stop, or restart Ceph daemons that run in containers.

Procedure

  1. To start, stop, or restart a Ceph daemon running in a container, run a systemctl command as root composed in the following format:

    systemctl action ceph-daemon@ID

    Where:

    • action is the action to perform: start, stop, or restart
    • daemon is the daemon: osd, mon, mds, or rgw
    • ID is one of the following:

      • The short host name where the ceph-mon, ceph-mds, or ceph-rgw daemons are running
      • The ID of the ceph-osd daemon if it was deployed with the osd_scenario parameter set to lvm
      • The device name that the ceph-osd daemon uses if it was deployed with the osd_scenario parameter set to collocated or non-collocated

    For example, to restart a ceph-osd daemon with the ID osd01:

    # systemctl restart ceph-osd@osd01

    To start a ceph-mon daemon that runs on the ceph-monitor01 host:

    # systemctl start ceph-mon@ceph-monitor01

    To stop a ceph-rgw daemon that runs on the ceph-rgw01 host:

    # systemctl stop ceph-radosgw@ceph-rgw01
  2. Verify that the action was completed successfully.

    systemctl status ceph-daemon@ID

    For example:

    # systemctl status ceph-mon@ceph-monitor01
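
The two steps above, running the action and then verifying it, can be wrapped in a small shell function. The following is a minimal sketch, not part of Ceph: the function names expected_state and ceph_daemon_action are hypothetical, and the sketch assumes that systemctl is-active reports "inactive" for a unit that was stopped cleanly.

```shell
# expected_state ACTION
# Print the state that "systemctl is-active" should report after ACTION.
expected_state() {
    case "$1" in
        start|restart) echo active ;;
        stop)          echo inactive ;;
        *)             echo "unsupported action: $1" >&2; return 1 ;;
    esac
}

# ceph_daemon_action ACTION UNIT
# Run the action on the unit, then check that the unit reached the
# expected state. Returns non-zero if the action or the check fails.
ceph_daemon_action() {
    want=$(expected_state "$1") || return 1
    systemctl "$1" "$2" || return 1
    # is-active exits non-zero for anything but "active", so ignore its status
    got=$(systemctl is-active "$2" || true)
    [ "$got" = "$want" ]
}
```

For example, as root:

    # ceph_daemon_action restart ceph-osd@osd01 && echo "restart verified"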

3.2. Viewing Log Files of Ceph Daemons That Run in Containers

Use the journald daemon from the container host to view a log file of a Ceph daemon from a container.

Procedure

  1. To view the entire Ceph log file, run a journalctl command as root composed in the following format:

    journalctl -u ceph-daemon@ID

    Where:

    • daemon is the Ceph daemon: osd, mon, mds, or rgw
    • ID is one of the following:

      • The short host name where the ceph-mon, ceph-mds, or ceph-rgw daemons are running
      • The ID of the ceph-osd daemon if it was deployed with the osd_scenario parameter set to lvm
      • The device name that the ceph-osd daemon uses if it was deployed with the osd_scenario parameter set to collocated or non-collocated

    For example, to view the entire log for the ceph-osd daemon with the ID osd01:

    # journalctl -u ceph-osd@osd01
  2. To show only the most recent journal entries and keep printing new entries as they are appended, use the -f option.

    journalctl -fu ceph-daemon@ID

    For example, to view only recent journal entries for the ceph-mon daemon that runs on the ceph-monitor01 host:

    # journalctl -fu ceph-mon@ceph-monitor01
Note

You can also use the sosreport utility to view the journald logs. For more details about SOS reports, see the What is a sosreport and how to create one in Red Hat Enterprise Linux 4.6 and later? solution on the Red Hat Customer Portal.
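
On a busy node, the full journal for a unit can be long. As an illustrative sketch, the hypothetical helper below narrows the output with the standard journalctl options --since and --no-pager. The function name ceph_recent_log and the one-hour default are assumptions made for this example.

```shell
# ceph_recent_log UNIT [SINCE]
# Print journal entries for UNIT that are newer than SINCE.
# SINCE defaults to "1 hour ago" (an arbitrary choice for this sketch).
ceph_recent_log() {
    unit=$1
    since=${2:-1 hour ago}
    journalctl --no-pager -u "$unit" --since "$since"
}
```

For example, to print the last day of entries for the ceph-osd daemon with the ID osd01:

    # ceph_recent_log ceph-osd@osd01 "1 day ago"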

Additional Resources

  • The journalctl(1) manual page

3.3. Adding a Ceph OSD using the command-line interface

Here is the high-level workflow for manually adding an OSD to a Red Hat Ceph Storage cluster:

  1. Install the ceph-osd package and create a new OSD instance
  2. Prepare and mount the OSD data and journal drives
  3. Add the new OSD node to the CRUSH map
  4. Update the owner and group permissions
  5. Enable and start the ceph-osd daemon
Important

The ceph-disk command is deprecated. The ceph-volume command is now the preferred method for deploying OSDs from the command-line interface. Currently, the ceph-volume command only supports the lvm plugin. Red Hat will provide examples throughout this guide using both commands as a reference, allowing time for storage administrators to convert any custom scripts that rely on ceph-disk to ceph-volume instead.

See the Red Hat Ceph Storage Administration Guide for more information on using the ceph-volume command.

Note

For custom storage cluster names, use the --cluster $CLUSTER_NAME option with the ceph and ceph-osd commands.
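
One way to avoid repeating the option by hand is a small helper that emits it only when needed. This is a sketch under assumptions: the function name ceph_cluster_args is hypothetical, and it treats an unset CLUSTER_NAME or the value ceph (the default cluster name) as meaning that no option is required.

```shell
# ceph_cluster_args
# Print "--cluster NAME" when CLUSTER_NAME holds a custom cluster name,
# and print nothing for an unset name or the default name "ceph".
ceph_cluster_args() {
    if [ -n "${CLUSTER_NAME:-}" ] && [ "$CLUSTER_NAME" != "ceph" ]; then
        printf '%s %s\n' --cluster "$CLUSTER_NAME"
    fi
}
```

The substitution is left unquoted on purpose so that the shell splits it into two arguments:

    [root@osd ~]# CLUSTER_NAME=backup
    [root@osd ~]# ceph $(ceph_cluster_args) osd set noup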

Procedure

  1. Enable the Red Hat Ceph Storage 3 OSD software repository.

    Red Hat Enterprise Linux

    [root@osd ~]# subscription-manager repos --enable=rhel-7-server-rhceph-3-osd-rpms

  2. Create the /etc/ceph/ directory.
  3. On the new OSD node, copy the Ceph administration keyring and configuration files from one of the Ceph Monitor nodes.
  4. Install the ceph-osd package on the new Ceph OSD node:

    Red Hat Enterprise Linux

    [root@osd ~]# yum install ceph-osd

  5. Decide if you want to collocate a journal or use a dedicated journal for the new OSDs.

    Note

    The --filestore option is required.

    1. For OSDs with a collocated journal:

      Syntax

      [root@osd ~]# docker exec $CONTAINER_ID ceph-disk --setuser ceph --setgroup ceph prepare --filestore /dev/$DEVICE_NAME

      Example

      [root@osd ~]# docker exec ceph-osd-osd1 ceph-disk --setuser ceph --setgroup ceph prepare --filestore /dev/sda

    2. For OSDs with a dedicated journal:

      Syntax

      [root@osd ~]# docker exec $CONTAINER_ID ceph-disk --setuser ceph --setgroup ceph prepare --filestore /dev/$DEVICE_NAME /dev/$JOURNAL_DEVICE_NAME

      or

      [root@osd ~]# docker exec $CONTAINER_ID ceph-volume lvm prepare --filestore --data /dev/$DEVICE_NAME --journal /dev/$JOURNAL_DEVICE_NAME

      Examples

      [root@osd ~]# docker exec ceph-osd-osd1 ceph-disk --setuser ceph --setgroup ceph prepare --filestore /dev/sda /dev/sdb

      [root@osd ~]# docker exec ceph-osd-osd1 ceph-volume lvm prepare --filestore --data /dev/vg00/lvol1 --journal /dev/sdb
  6. Set the noup option:

    [root@osd ~]# ceph osd set noup
  7. Activate the new OSD:

    Syntax

    [root@osd ~]# docker exec $CONTAINER_ID ceph-disk activate /dev/$DEVICE_NAME

    or

    [root@osd ~]# docker exec $CONTAINER_ID ceph-volume lvm activate --filestore $OSD_ID $OSD_FSID

    Example

    [root@osd ~]# docker exec ceph-osd-osd1 ceph-disk activate /dev/sda

    [root@osd ~]# docker exec ceph-osd-osd1 ceph-volume lvm activate --filestore 0 6cc43680-4f6e-4feb-92ff-9c7ba204120e
  8. Add the OSD to the CRUSH map:

    Syntax

    ceph osd crush add $OSD_ID $WEIGHT [$BUCKET_TYPE=$BUCKET_NAME ...]

    Example

    [root@osd ~]# ceph osd crush add 4 1 host=node4

    Note

    If you specify more than one bucket, the command places the OSD into the most specific bucket out of those you specified, and it moves the bucket underneath any other buckets you specified.

    Note

    You can also edit the CRUSH map manually. See the Editing a CRUSH map section in the Storage Strategies guide for Red Hat Ceph Storage 3.

    Important

    If you specify only the root bucket, then the OSD attaches directly to the root, but the CRUSH rules expect OSDs to be inside of the host bucket.

  9. Unset the noup option:

    [root@osd ~]# ceph osd unset noup
  10. Update the owner and group permissions for the newly created directories:

    Syntax

    chown -R $OWNER:$GROUP $PATH_TO_DIRECTORY

    Example

    [root@osd ~]# chown -R ceph:ceph /var/lib/ceph/osd
    [root@osd ~]# chown -R ceph:ceph /var/log/ceph
    [root@osd ~]# chown -R ceph:ceph /var/run/ceph
    [root@osd ~]# chown -R ceph:ceph /etc/ceph

  11. If you use clusters with custom names, then add the following line to the appropriate file:

    Red Hat Enterprise Linux

    [root@osd ~]# echo "CLUSTER=$CLUSTER_NAME" >> /etc/sysconfig/ceph

    Replace $CLUSTER_NAME with the custom cluster name.

  12. To ensure that the new OSD is up and ready to receive data, enable and start the OSD service:

    Syntax

    systemctl enable ceph-osd@$OSD_ID
    systemctl start ceph-osd@$OSD_ID

    Example

    [root@osd ~]# systemctl enable ceph-osd@4
    [root@osd ~]# systemctl start ceph-osd@4
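
Steps 6 through 9 and step 12 depend on each other: the noup flag must be cleared again even if activation or the CRUSH update fails. The following sketch shows one way to script that ordering for the ceph-disk variant. The function names run and activate_osd and the DRY_RUN variable are hypothetical, and the ownership changes from step 10 are not included.

```shell
# run COMMAND [ARG ...]
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# activate_osd CONTAINER_ID DEVICE_NAME OSD_ID WEIGHT HOST
activate_osd() {
    container=$1 device=$2 osd_id=$3 weight=$4 host=$5

    run ceph osd set noup || return 1

    # Activate the OSD and place it under its host bucket.
    run docker exec "$container" ceph-disk activate "/dev/$device" &&
        run ceph osd crush add "$osd_id" "$weight" "host=$host"
    status=$?

    # Clear the flag whether or not the two commands above succeeded.
    run ceph osd unset noup
    [ "$status" -eq 0 ] || return "$status"

    run systemctl enable "ceph-osd@$osd_id" &&
        run systemctl start "ceph-osd@$osd_id"
}
```

To preview the commands with the values from the examples above without changing the cluster:

    [root@osd ~]# DRY_RUN=1 activate_osd ceph-osd-osd1 sda 4 1 node4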

3.4. Removing a Ceph OSD using the command-line interface

Removing an OSD from a storage cluster involves updating the cluster map, removing its authentication key, removing the OSD from the OSD map, and removing the OSD from the ceph.conf file. If the node has multiple drives, you might need to remove an OSD for each drive by repeating this procedure.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Enough available OSDs so that the storage cluster is not at its near full ratio.
  • Root access to the OSD node.

Procedure

  1. Disable and stop the OSD service:

    Syntax

    systemctl disable ceph-osd@$DEVICE_NAME
    systemctl stop ceph-osd@$DEVICE_NAME

    Example

    [root@osd ~]# systemctl disable ceph-osd@sdb
    [root@osd ~]# systemctl stop ceph-osd@sdb

    Once the OSD is stopped, it is down.

  2. Remove the OSD from the storage cluster:

    Syntax

    ceph osd out $OSD_ID

    Example

    [root@osd ~]# ceph osd out 4

    Important

    Once the OSD is out, Ceph will start rebalancing and copying data to other OSDs in the storage cluster. Red Hat recommends waiting until the storage cluster becomes active+clean before proceeding to the next step. To observe the data migration, run the following command:

    [root@monitor ~]# ceph -w
  3. Remove the OSD from the CRUSH map so that it no longer receives data.

    Syntax

    ceph osd crush remove $OSD_NAME

    Example

    [root@osd ~]# ceph osd crush remove osd.4

    Note

    You can also decompile the CRUSH map, remove the OSD from the device list, and either remove the device as an item in the host bucket or remove the host bucket itself, if it is in the CRUSH map and you intend to remove the host. Then recompile the map and set it. See the Storage Strategies Guide for details.

  4. Remove the OSD authentication key:

    Syntax

    ceph auth del osd.$OSD_ID

    Example

    [root@osd ~]# ceph auth del osd.4

  5. Remove the OSD:

    Syntax

    ceph osd rm $OSD_ID

    Example

    [root@osd ~]# ceph osd rm 4

  6. Edit the storage cluster’s configuration file, by default /etc/ceph/ceph.conf, and remove the OSD entry, if it exists:

    Example

    [osd.4]
    host = $HOST_NAME

  7. Remove the reference to the OSD in the /etc/fstab file, if the OSD was added manually.
  8. Copy the updated configuration file to the /etc/ceph/ directory of all other nodes in the storage cluster.

    Syntax

    scp /etc/ceph/$CLUSTER_NAME.conf $USER_NAME@$HOST_NAME:/etc/ceph/

    Example

    [root@osd ~]# scp /etc/ceph/ceph.conf root@node4:/etc/ceph/
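
The wait recommended in step 2 can be automated instead of watching the output of ceph -w. The sketch below polls ceph health and returns once it reports HEALTH_OK. This is an assumption-laden proxy: HEALTH_OK is used here as a stand-in for all placement groups being active+clean, and the function name wait_for_health_ok, the 30-second interval, and the retry limit are hypothetical choices.

```shell
# wait_for_health_ok [INTERVAL_SECONDS] [MAX_TRIES]
# Poll "ceph health" until it reports HEALTH_OK.
# Returns 0 on HEALTH_OK and 1 if MAX_TRIES polls pass without it.
wait_for_health_ok() {
    interval=${1:-30}
    tries=${2:-120}
    while [ "$tries" -gt 0 ]; do
        case "$(ceph health 2>/dev/null)" in
            HEALTH_OK*) return 0 ;;
        esac
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}
```

For example, to continue with the CRUSH map removal only after the cluster has settled:

    [root@monitor ~]# wait_for_health_ok && ceph osd crush remove osd.4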

3.5. Replacing an OSD drive while retaining the OSD ID

When replacing a failed OSD drive, you can keep the original OSD ID and CRUSH map entry.

Note

The ceph-volume lvm commands default to BlueStore for OSDs. To use FileStore OSDs, use the --filestore, --data, and --journal options.

See the Preparing the OSD Data and Journal Drives section for more details.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A failed disk.

Procedure

  1. Destroy the OSD:

    ceph osd destroy $OSD_ID --yes-i-really-mean-it

    Example

    $ ceph osd destroy 1 --yes-i-really-mean-it

  2. Optional: If the replacement disk was used previously, zap the disk:

    docker exec $CONTAINER_ID ceph-volume lvm zap $DEVICE

    Example

    $ docker exec ceph-osd-osd1 ceph-volume lvm zap /dev/sdb

  3. Create the new OSD with the existing OSD ID:

    docker exec $CONTAINER_ID ceph-volume lvm create --osd-id $OSD_ID --data $DEVICE

    Example

    $ docker exec ceph-osd-osd1 ceph-volume lvm create --osd-id 1 --data /dev/sdb
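
The three steps can be chained so that a failure stops the sequence before a new OSD is created. This is a minimal sketch that reuses only the commands shown above. The function name replace_osd and its optional fourth argument are hypothetical.

```shell
# replace_osd CONTAINER_ID OSD_ID DEVICE [zap]
# Destroy the OSD, optionally zap the replacement device, then create a
# new OSD that reuses the same OSD ID. Stops at the first failing step.
replace_osd() {
    container=$1 osd_id=$2 device=$3 zap=${4:-}

    ceph osd destroy "$osd_id" --yes-i-really-mean-it || return 1

    # Only needed when the replacement disk was used previously.
    if [ "$zap" = "zap" ]; then
        docker exec "$container" ceph-volume lvm zap "$device" || return 1
    fi

    docker exec "$container" ceph-volume lvm create --osd-id "$osd_id" --data "$device"
}
```

For example, to replace the drive behind OSD 1 with a previously used /dev/sdb:

    $ replace_osd ceph-osd-osd1 1 /dev/sdb zap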

3.6. Purging Clusters Deployed by Ansible

If you no longer want to use a Ceph cluster, use the purge-docker-cluster.yml playbook to purge the cluster. Purging a cluster is also useful when the installation process failed and you want to start over.

Warning

After purging a Ceph cluster, all data on the OSDs are lost.

Prerequisites

  • Ensure that the /var/log/ansible.log file is writable.

Procedure

Use the following commands from the Ansible administration node.

  1. As the root user, navigate to the /usr/share/ceph-ansible/ directory.

    [root@admin ~]# cd /usr/share/ceph-ansible
  2. Copy the purge-docker-cluster.yml playbook from the /usr/share/ceph-ansible/infrastructure-playbooks/ directory to the current directory:

    [root@admin ceph-ansible]# cp infrastructure-playbooks/purge-docker-cluster.yml .
  3. As the Ansible user, use the purge-docker-cluster.yml playbook to purge the Ceph cluster.

    1. To remove all packages, containers, configuration files, and all the data created by the ceph-ansible playbook:

      [user@admin ceph-ansible]$ ansible-playbook purge-docker-cluster.yml
    2. To specify a different inventory file than the default one (/etc/ansible/hosts), use the -i parameter:

      ansible-playbook purge-docker-cluster.yml -i inventory-file

      Replace inventory-file with the path to the inventory file.

      For example:

      [user@admin ceph-ansible]$ ansible-playbook purge-docker-cluster.yml -i ~/ansible/hosts
    3. To skip the removal of the Ceph container image, use the --skip-tags="remove_img" option:

      [user@admin ceph-ansible]$ ansible-playbook --skip-tags="remove_img" purge-docker-cluster.yml
    4. To skip the removal of the packages that were installed during the installation, use the --skip-tags="with_pkg" option:

      [user@admin ceph-ansible]$ ansible-playbook --skip-tags="with_pkg" purge-docker-cluster.yml
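
The options above can be combined in one invocation. Ansible accepts a comma-separated list for --skip-tags, so both tags can be skipped at once. The helper below is an illustrative sketch that only prints the resulting command line for review. The function name purge_cmd is hypothetical, and the sketch assumes that the inventory path contains no spaces.

```shell
# purge_cmd [-i INVENTORY_FILE] [TAG ...]
# Print an ansible-playbook command line for purge-docker-cluster.yml,
# with an optional inventory file and any tags to skip joined by commas.
purge_cmd() {
    inventory=
    if [ "${1:-}" = "-i" ]; then
        inventory=$2
        shift 2
    fi
    cmd="ansible-playbook purge-docker-cluster.yml"
    if [ -n "$inventory" ]; then
        cmd="$cmd -i $inventory"
    fi
    if [ "$#" -gt 0 ]; then
        # "$*" joins the remaining arguments with the first character of IFS
        tags=$(IFS=,; echo "$*")
        cmd="$cmd --skip-tags=\"$tags\""
    fi
    echo "$cmd"
}
```

For example, to print a command that keeps both the container image and the installed packages:

    [user@admin ceph-ansible]$ purge_cmd remove_img with_pkg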