Chapter 4. Upgrading a Red Hat Ceph Storage Cluster
This section describes how to upgrade to a new major or minor version of Red Hat Ceph Storage.
Previously, Red Hat did not provide the ceph-ansible package for Ubuntu. In Red Hat Ceph Storage version 3 and later, you can use the Ansible automation application to upgrade a Ceph cluster from an Ubuntu node.
If you have a large Ceph Object Gateway storage cluster with millions of objects in buckets, contact Red Hat support before upgrading.
For more details, see the Red Hat Ceph Storage 3.0 Release Notes, under the Slow OSD startup after upgrading to Red Hat Ceph Storage 3.0 heading.
Use the Ansible rolling_update.yml playbook located in the /usr/share/ceph-ansible/infrastructure-playbooks/ directory from the administration node to upgrade between two major or minor versions of Red Hat Ceph Storage, or to apply asynchronous updates.
Ansible upgrades the Ceph nodes in the following order:
- Monitor nodes
- MGR nodes
- OSD nodes
- MDS nodes
- Ceph Object Gateway nodes
- All other Ceph client nodes
Red Hat Ceph Storage 3 introduces several changes in Ansible configuration files located in the /usr/share/ceph-ansible/group_vars/ directory; certain parameters were renamed or removed. Therefore, make backup copies of the all.yml and osds.yml files before creating new copies from the all.yml.sample and osds.yml.sample files after upgrading to version 3. For more details about the changes, see Appendix H, Changes in Ansible Variables Between Version 2 and 3.
Red Hat Ceph Storage 3.1 introduces new Ansible playbooks to optimize storage for performance when using Object Gateway and high speed NVMe based SSDs (and SATA SSDs). The playbooks do this by placing journals and bucket indexes together on SSDs, which can increase performance compared to having all journals on one device. These playbooks are designed to be used when installing Ceph. Existing OSDs continue to work and need no extra steps during an upgrade. There is no way to upgrade a Ceph cluster while simultaneously reconfiguring OSDs to optimize storage in this way. To use different devices for journals or bucket indexes requires reprovisioning OSDs. For more information see Using NVMe with LVM optimally in Ceph Object Gateway for Production.
Make the following changes before upgrading from Red Hat Ceph Storage 2.4 to 3 (see the example after this list):
- Update the /etc/ansible/hosts file with custom osd_scenarios, if there are any.
- In group_vars/all.yml, set generate_fsid to false.
- Get the current cluster fsid by executing the ceph fsid command, and set the retrieved fsid in group_vars/all.yml.
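As a minimal sketch, assuming the ceph fsid command returns the hypothetical UUID shown below, the resulting group_vars/all.yml entries would look similar to the following:
[root@monitor ~]# ceph fsid
a7f64266-0894-4f1e-a635-d0aeaca0e993
# In group_vars/all.yml (the UUID shown is a placeholder for your cluster's fsid):
generate_fsid: false
fsid: a7f64266-0894-4f1e-a635-d0aeaca0e993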
The rolling_update.yml playbook includes the serial variable, which adjusts the number of nodes to be updated simultaneously. Red Hat strongly recommends using the default value (1), which ensures that Ansible upgrades the cluster nodes one by one.
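As an illustrative sketch only, not the full shipped playbook, the play-level serial keyword inside rolling_update.yml takes a form like this:
- hosts: mons
  serial: 1    # default value; Ansible upgrades one node at a time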
When using the rolling_update.yml playbook to upgrade to Red Hat Ceph Storage 3.0, or from version 3.0 to later zStream releases of 3.0, users of the Ceph File System (CephFS) must update the Metadata Server (MDS) cluster manually. This is due to a known issue.
Comment out the MDS hosts in /etc/ansible/hosts before upgrading the entire cluster with the ceph-ansible rolling_update.yml playbook, and then upgrade the MDS nodes manually. In the /etc/ansible/hosts file:
#[mdss]
#host-abc
For more details about this known issue, including how to update the MDS cluster, refer to the Red Hat Ceph Storage 3.0 Release Notes.
Prerequisites
- If the Ceph nodes are not connected to the Red Hat Content Delivery Network (CDN) and you used an ISO image to install Red Hat Ceph Storage, update the local repository with the latest version of Red Hat Ceph Storage. See Section 2.4, “Enabling the Red Hat Ceph Storage Repositories” for details.
If upgrading from Red Hat Ceph Storage 2.x to 3.x, on the Ansible administration node and the RBD mirroring node, enable the Red Hat Ceph Storage 3 Tools repository and Ansible repository:
[root@admin ~]$ sudo bash -c 'umask 0077; echo deb https://customername:customerpasswd@rhcs.download.redhat.com/3-updates/Tools $(lsb_release -sc) main | tee /etc/apt/sources.list.d/Tools.list'
[root@admin ~]$ sudo bash -c 'wget -O - https://www.redhat.com/security/fd431d51.txt | apt-key add -'
[root@admin ~]$ sudo apt-get update
If upgrading from Red Hat Ceph Storage 3.0 to 3.1 and using the Red Hat Ceph Storage Dashboard, purge the old cephmetrics installation before upgrading the cluster. This avoids a known issue where the dashboard does not display data after the upgrade.
If the cephmetrics-ansible package is not already updated, update it:
[root@admin ~]# yum update cephmetrics-ansible
Change to the /usr/share/cephmetrics-ansible/ directory.
[root@admin ~]# cd /usr/share/cephmetrics-ansible
Purge the existing cephmetrics installation.
[root@admin cephmetrics-ansible]# ansible-playbook -v purge.yml
Install the updated Red Hat Ceph Storage Dashboard.
[root@admin cephmetrics-ansible]# ansible-playbook -v playbook.yml
If upgrading from RHCS 2.x to 3.x, or from RHCS 3.x to the latest version, ensure that the latest version of the ceph-ansible package is installed on the Ansible administration node:
[root@admin ~]$ sudo apt-get install ceph-ansible
In the rolling_update.yml playbook, change the health_osd_check_retries and health_osd_check_delay values to 40 and 30, respectively:
health_osd_check_retries: 40
health_osd_check_delay: 30
With these values, Ansible checks the cluster health every 30 seconds and waits up to 20 minutes (40 retries of 30 seconds each) for each OSD node before continuing the upgrade.
If the cluster you want to upgrade contains Ceph Block Device images that use the exclusive-lock feature, ensure that all Ceph Block Device users have permissions to blacklist clients:
ceph auth caps client.<ID> mon 'allow r, allow command "osd blacklist"' osd '<existing-OSD-user-capabilities>'
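For example, assuming a hypothetical block device user named client.rbd-user whose existing OSD capabilities are allow rwx pool=rbd, the command would be:
ceph auth caps client.rbd-user mon 'allow r, allow command "osd blacklist"' osd 'allow rwx pool=rbd'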
Procedure
Use the following commands from the Ansible administration node.
Navigate to the /usr/share/ceph-ansible/ directory:
[user@admin ~]$ cd /usr/share/ceph-ansible/
Back up the group_vars/all.yml and group_vars/osds.yml files. Skip this step when upgrading from version 3.x to the latest version.
[root@admin ceph-ansible]# cp group_vars/all.yml group_vars/all_old.yml
[root@admin ceph-ansible]# cp group_vars/osds.yml group_vars/osds_old.yml
Create new copies of the group_vars/all.yml.sample and group_vars/osds.yml.sample files named group_vars/all.yml and group_vars/osds.yml respectively, and edit them according to your deployment. Skip this step when upgrading from version 3.x to the latest version. For details, see Appendix H, Changes in Ansible Variables Between Version 2 and 3 and Section 3.2, “Installing a Red Hat Ceph Storage Cluster”.
[root@admin ceph-ansible]# cp group_vars/all.yml.sample group_vars/all.yml
[root@admin ceph-ansible]# cp group_vars/osds.yml.sample group_vars/osds.yml
In the group_vars/all.yml file, uncomment the upgrade_ceph_packages option and set it to True:
upgrade_ceph_packages: True
Add the fetch_directory parameter to the group_vars/all.yml file:
fetch_directory: <full_directory_path>
Replace:
- <full_directory_path> with a writable location, such as the Ansible user’s home directory.
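For example, assuming a hypothetical Ansible user named admin, the entry could read:
fetch_directory: /home/admin/ceph-ansible-keys   # hypothetical path; any writable location works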
If the cluster you want to upgrade contains any Ceph Object Gateway nodes, add the radosgw_interface parameter to the group_vars/all.yml file:
radosgw_interface: <interface>
Replace:
- <interface> with the interface that the Ceph Object Gateway nodes listen to.
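For example, if the Ceph Object Gateway nodes listen on an interface named eth0 (a hypothetical interface name), the entry would be:
radosgw_interface: eth0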
In the Ansible inventory file located at /etc/ansible/hosts, add the Ceph Manager (ceph-mgr) nodes under the [mgrs] section. Colocate the Ceph Manager daemon with the Monitor nodes. Skip this step when upgrading from version 3.x to the latest version.
[mgrs]
<monitor-host-name>
<monitor-host-name>
<monitor-host-name>
Copy rolling_update.yml from the infrastructure-playbooks directory to the current directory:
[root@admin ceph-ansible]# cp infrastructure-playbooks/rolling_update.yml .
Create the /var/log/ansible/ directory and assign the appropriate permissions for the ansible user:
[root@admin ceph-ansible]# mkdir /var/log/ansible
[root@admin ceph-ansible]# chown ansible:ansible /var/log/ansible
[root@admin ceph-ansible]# chmod 755 /var/log/ansible
Edit the /usr/share/ceph-ansible/ansible.cfg file, updating the log_path value as follows:
log_path = /var/log/ansible/ansible.log
Run the playbook:
[user@admin ceph-ansible]$ ansible-playbook rolling_update.yml
To use the playbook only for a particular group of nodes in the Ansible inventory file, use the --limit option. For details, see Section 3.7, “Understanding the limit option”.
From the RBD mirroring daemon node, upgrade the rbd-mirror package manually:
$ sudo apt-get upgrade rbd-mirror
Restart the daemon:
# systemctl restart ceph-rbd-mirror@<client-id>
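For example, assuming the daemon was set up with the hypothetical Ceph client ID rbd-mirror.node1, the command would be:
# systemctl restart ceph-rbd-mirror@rbd-mirror.node1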
Verify that the cluster health is OK.
[root@monitor ~]# ceph -s
