Keeping Red Hat OpenStack Platform Updated
Performing minor updates of Red Hat OpenStack Platform
Abstract
Chapter 1. Introduction
This document provides a workflow to help keep your Red Hat OpenStack Platform 16.1 environment updated with the latest packages and containers.
This guide provides an update path through the following versions:
Old OpenStack Version | New OpenStack Version |
---|---|
Red Hat OpenStack Platform 16.0 | Red Hat OpenStack Platform 16.1.z |
Red Hat OpenStack Platform 16.1 | Red Hat OpenStack Platform 16.1.z |
1.1. High level workflow
The following table provides an outline of the steps required for the update process:

Step | Description |
---|---|
Updating the undercloud | Update the undercloud to the latest OpenStack Platform 16.1.z version. |
Updating the overcloud | Update the overcloud to the latest OpenStack Platform 16.1.z version. |
Updating the Ceph Storage nodes | Update all Ceph Storage services. |
Finalizing the update | Run the convergence command to refresh your overcloud stack. |
1.2. Known issues that might block an update
Review the following known issues that might affect a successful minor version update.
- BZ#1895220 - Network communication problems for instances using OVN provider network
-
  Updates to 16.1.3 from an earlier version (16.0.x or 16.1.x) can cause network disruption due to a database issue with Open Virtual Network (OVN). When you perform a minor version update, you normally update Controller nodes before Compute nodes. When you update Controller nodes, director updates the `ovndb-north` database schema to the latest version. The `ovn-controller` service on Compute nodes cannot interpret the newer version of the `ovndb-north` database schema and cannot obtain the correct network flow for instances. As a workaround to minimize the network disruption, update the `ovn_controller` service on Compute nodes after you run the `openstack overcloud update prepare` command and before you run the `openstack overcloud update run` command. For more information, see the "OVN update in 16.1 workaround" knowledgebase article. Red Hat aims to resolve this issue in the next 16.1 minor release update.
Chapter 2. Preparing for a minor update
You must follow some preparation steps on the undercloud and overcloud before you begin the process to update Red Hat OpenStack Platform 16.1 to the latest minor release.
2.1. Locking the environment to a Red Hat Enterprise Linux release
Red Hat OpenStack Platform 16.1 is supported on Red Hat Enterprise Linux 8.2. Prior to performing the update, lock the undercloud and overcloud repositories to the Red Hat Enterprise Linux 8.2 release to avoid upgrading the operating system to a newer minor release.
Procedure

1. Log in to the undercloud as the `stack` user and source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Edit your overcloud subscription management environment file, which is the file that contains the `RhsmVars` parameter. The default name for this file is usually `rhsm.yml`.

3. Check your subscription management configuration for the `rhsm_release` parameter. If this parameter is not set, add it and set it to 8.2:

   ```
   parameter_defaults:
     RhsmVars:
       …
       rhsm_username: "myusername"
       rhsm_password: "p@55w0rd!"
       rhsm_org_id: "1234567"
       rhsm_pool_ids: "1a85f9223e3d5e43013e3d6e8ff506fd"
       rhsm_method: "portal"
       rhsm_release: "8.2"
   ```

4. Save the overcloud subscription management environment file.

5. Create a static inventory file of your overcloud:

   ```
   $ tripleo-ansible-inventory --ansible_ssh_user heat-admin --static-yaml-inventory ~/inventory.yaml
   ```

   If you use an overcloud name different from the default name `overcloud`, set the name of your overcloud with the `--plan` option.

6. Create a playbook that contains a task to lock the operating system version to Red Hat Enterprise Linux 8.2 on all nodes:

   ```
   $ cat > ~/set_release.yaml <<'EOF'
   - hosts: all
     gather_facts: false
     tasks:
       - name: set release to 8.2
         command: subscription-manager release --set=8.2
         become: true
   EOF
   ```

7. Run the `set_release.yaml` playbook:

   ```
   $ ansible-playbook -i ~/inventory.yaml -f 25 ~/set_release.yaml --limit undercloud,Controller,Compute
   ```

   Use the `--limit` option to apply the content to all Red Hat OpenStack Platform nodes. Do not run this playbook against Ceph Storage nodes because you are most likely using a different subscription for these nodes.

8. To manually lock a node to a version, log in to the node and run the `subscription-manager release` command:

   ```
   $ sudo subscription-manager release --set=8.2
   ```
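After the playbook finishes, you can spot-check a node by parsing the output of `subscription-manager release --show`. The sketch below simulates that output line so the parsing logic is runnable anywhere; on a real node you would capture the live command output as noted in the comments:

```shell
# Verify that the RHEL release lock is set to 8.2.
# On a real node you would capture live output:
#   out=$(sudo subscription-manager release --show)
# The output line is simulated here for illustration.
out="Release: 8.2"

# Extract the value that follows "Release: ".
actual=$(printf '%s\n' "$out" | awk -F': ' '/^Release:/ {print $2}')

if [ "$actual" = "8.2" ]; then
  echo "release lock OK"
else
  echo "release lock mismatch: got '$actual'" >&2
fi
```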
2.2. Changing to Extended Update Support (EUS) repositories
Your Red Hat OpenStack Platform subscription includes repositories for Red Hat Enterprise Linux 8.2 Extended Update Support (EUS). The EUS repositories include the latest security patches and bug fixes for Red Hat Enterprise Linux 8.2. Switch to the following repositories before performing an update.
Standard Repository | EUS Repository |
---|---|
rhel-8-for-x86_64-baseos-rpms | rhel-8-for-x86_64-baseos-eus-rpms |
rhel-8-for-x86_64-appstream-rpms | rhel-8-for-x86_64-appstream-eus-rpms |
rhel-8-for-x86_64-highavailability-rpms | rhel-8-for-x86_64-highavailability-eus-rpms |
You must use EUS repositories to retain compatibility with a specific version of Podman. Later versions of Podman are untested for this Red Hat OpenStack Platform release and can cause unexpected results.
Procedure

1. Log in to the undercloud as the `stack` user and source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Edit your overcloud subscription management environment file, which is the file that contains the `RhsmVars` parameter. The default name for this file is usually `rhsm.yml`.

3. Check the `rhsm_repos` parameter in your subscription management configuration. If this parameter does not include the EUS repositories, change the relevant repositories to the EUS versions:

   ```
   parameter_defaults:
     RhsmVars:
       rhsm_repos:
         - rhel-8-for-x86_64-baseos-eus-rpms
         - rhel-8-for-x86_64-appstream-eus-rpms
         - rhel-8-for-x86_64-highavailability-eus-rpms
         - ansible-2.9-for-rhel-8-x86_64-rpms
         - advanced-virt-for-rhel-8-x86_64-rpms
         - openstack-16.1-for-rhel-8-x86_64-rpms
         - rhceph-4-tools-for-rhel-8-x86_64-rpms
         - fast-datapath-for-rhel-8-x86_64-rpms
   ```

4. Save the overcloud subscription management environment file.

5. Create a static inventory file of your overcloud:

   ```
   $ tripleo-ansible-inventory --ansible_ssh_user heat-admin --static-yaml-inventory ~/inventory.yaml
   ```

   If you use an overcloud name different from the default name `overcloud`, set the name of your overcloud with the `--plan` option.

6. Create a playbook that contains a task to set the repositories to Red Hat Enterprise Linux 8.2 EUS on all nodes:

   ```
   $ cat > ~/change_eus.yaml <<'EOF'
   - hosts: all
     gather_facts: false
     tasks:
       - name: change to eus repos
         command: subscription-manager repos --disable=rhel-8-for-x86_64-baseos-rpms --disable=rhel-8-for-x86_64-appstream-rpms --disable=rhel-8-for-x86_64-highavailability-rpms --enable=rhel-8-for-x86_64-baseos-eus-rpms --enable=rhel-8-for-x86_64-appstream-eus-rpms --enable=rhel-8-for-x86_64-highavailability-eus-rpms
         become: true
   EOF
   ```

7. Run the `change_eus.yaml` playbook:

   ```
   $ ansible-playbook -i ~/inventory.yaml -f 25 ~/change_eus.yaml --limit undercloud,Controller,Compute
   ```

   Use the `--limit` option to apply the content to all Red Hat OpenStack Platform nodes. Do not run this playbook against Ceph Storage nodes because you are most likely using a different subscription for these nodes.
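To confirm the switch on a node, you can check that each EUS repository ID appears in the enabled-repository list. This sketch simulates the list of enabled repo IDs so it is self-contained; the comment shows how you would capture the real list from `subscription-manager`:

```shell
# Verify that the three EUS repositories are enabled on a node.
# On a real node you would capture the repo IDs, for example:
#   enabled=$(sudo subscription-manager repos --list-enabled | awk '/Repo ID:/ {print $3}')
# The list is simulated here for illustration.
enabled="rhel-8-for-x86_64-baseos-eus-rpms
rhel-8-for-x86_64-appstream-eus-rpms
rhel-8-for-x86_64-highavailability-eus-rpms"

missing=0
for repo in rhel-8-for-x86_64-baseos-eus-rpms \
            rhel-8-for-x86_64-appstream-eus-rpms \
            rhel-8-for-x86_64-highavailability-eus-rpms; do
  printf '%s\n' "$enabled" | grep -qx "$repo" || { echo "missing: $repo" >&2; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all EUS repositories enabled"
```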
2.3. Updating Red Hat OpenStack Platform and Ansible repositories
Update your repositories to use Red Hat OpenStack Platform 16.1 and Ansible 2.9 packages.
Procedure

1. Log in to the undercloud as the `stack` user and source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Edit your overcloud subscription management environment file, which is the file that contains the `RhsmVars` parameter. The default name for this file is usually `rhsm.yml`.

3. Check the `rhsm_repos` parameter in your subscription management configuration. If the `rhsm_repos` parameter is using the Red Hat OpenStack Platform 16.0 and Ansible 2.8 repositories, change the repositories to the correct versions:

   ```
   parameter_defaults:
     RhsmVars:
       rhsm_repos:
         - rhel-8-for-x86_64-baseos-eus-rpms
         - rhel-8-for-x86_64-appstream-eus-rpms
         - rhel-8-for-x86_64-highavailability-eus-rpms
         - ansible-2.9-for-rhel-8-x86_64-rpms
         - advanced-virt-for-rhel-8-x86_64-rpms
         - openstack-16.1-for-rhel-8-x86_64-rpms
         - rhceph-4-osd-for-rhel-8-x86_64-rpms
         - rhceph-4-mon-for-rhel-8-x86_64-rpms
         - rhceph-4-tools-for-rhel-8-x86_64-rpms
         - fast-datapath-for-rhel-8-x86_64-rpms
   ```

4. Save the overcloud subscription management environment file.

5. Create a static inventory file of your overcloud:

   ```
   $ tripleo-ansible-inventory --ansible_ssh_user heat-admin --static-yaml-inventory ~/inventory.yaml
   ```

   If you use an overcloud name different from the default name `overcloud`, set the name of your overcloud with the `--plan` option.

6. Create a playbook that contains a task to set the repositories to Red Hat OpenStack Platform 16.1 on all overcloud nodes:

   ```
   $ cat > ~/update_rhosp_repos.yaml <<'EOF'
   - hosts: all
     gather_facts: false
     tasks:
       - name: change osp repos
         command: subscription-manager repos --disable=openstack-16-for-rhel-8-x86_64-rpms --enable=openstack-16.1-for-rhel-8-x86_64-rpms --disable=ansible-2.8-for-rhel-8-x86_64-rpms --enable=ansible-2.9-for-rhel-8-x86_64-rpms
         become: true
   EOF
   ```

7. Run the `update_rhosp_repos.yaml` playbook:

   ```
   $ ansible-playbook -i ~/inventory.yaml -f 25 ~/update_rhosp_repos.yaml --limit undercloud,Controller,Compute
   ```

   Use the `--limit` option to apply the content to all Red Hat OpenStack Platform nodes. Do not run this playbook against Ceph Storage nodes because you are most likely using a different subscription for these nodes.

8. Create a playbook that contains a task to set the repositories to the Red Hat OpenStack Platform 16.1 deployment tools on Ceph Storage nodes:

   ```
   $ cat > ~/update_ceph_repos.yaml <<'EOF'
   - hosts: all
     gather_facts: false
     tasks:
       - name: change ceph repos
         command: subscription-manager repos --disable=openstack-16-deployment-tools-for-rhel-8-x86_64-rpms --enable=openstack-16.1-deployment-tools-for-rhel-8-x86_64-rpms --disable=ansible-2.8-for-rhel-8-x86_64-rpms --enable=ansible-2.9-for-rhel-8-x86_64-rpms
         become: true
   EOF
   ```

9. Run the `update_ceph_repos.yaml` playbook:

   ```
   $ ansible-playbook -i ~/inventory.yaml -f 25 ~/update_ceph_repos.yaml --limit CephStorage
   ```

   Use the `--limit` option to apply the content to Ceph Storage nodes.
2.4. Setting the container-tools and virt module versions
Set the `container-tools` module to version `2.0` and the `virt` module to `8.2` to ensure that you use the correct package versions on all nodes.
Procedure

1. Log in to the undercloud as the `stack` user and source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Create a static inventory file of your overcloud:

   ```
   $ tripleo-ansible-inventory --ansible_ssh_user heat-admin --static-yaml-inventory ~/inventory.yaml
   ```

   If you use an overcloud name different from the default name `overcloud`, set the name of your overcloud with the `--plan` option.

3. Create a playbook that contains a task to set the `container-tools` module to version `2.0` on all nodes:

   ```
   $ cat > ~/container-tools.yaml <<'EOF'
   - hosts: all
     gather_facts: false
     tasks:
       - name: disable default dnf module for container-tools
         command: dnf module disable -y container-tools:rhel8
         become: true
       - name: set dnf module for container-tools:2.0
         command: dnf module enable -y container-tools:2.0
         become: true
   - hosts: undercloud,Compute,Controller
     gather_facts: false
     tasks:
       - name: disable default dnf module for virt
         command: dnf module disable -y virt:rhel
         become: true
       - name: disable 8.1 dnf module for virt
         command: dnf module disable -y virt:8.1
         become: true
       - name: set dnf module for virt:8.2
         command: dnf module enable -y virt:8.2
         become: true
   EOF
   ```

4. Run the `container-tools.yaml` playbook against all nodes:

   ```
   $ ansible-playbook -i ~/inventory.yaml -f 25 ~/container-tools.yaml
   ```
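You can verify the module switch afterwards by checking the enabled streams on a node. The sketch below parses simulated `dnf module list --enabled` output (name and stream columns); on a node, pipe the real command output through the same awk filters:

```shell
# Check the enabled dnf module streams for container-tools and virt.
# On a real node: modules=$(sudo dnf module list --enabled)
# Two simulated "name stream profiles" lines stand in for that output.
modules="container-tools 2.0 common [i]
virt 8.2 common [i]"

ct_stream=$(printf '%s\n' "$modules" | awk '$1 == "container-tools" {print $2}')
virt_stream=$(printf '%s\n' "$modules" | awk '$1 == "virt" {print $2}')

echo "container-tools stream: $ct_stream"
echo "virt stream: $virt_stream"
```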
2.5. Updating your container image preparation file
Your container preparation file is the file that contains the `ContainerImagePrepare` parameter. You use this file to define the rules for obtaining container images for the undercloud and overcloud. Before you update your environment, check the file to ensure that you obtain the correct image versions.
Procedure

1. Edit the container preparation file. The default name for this file is usually `containers-prepare-parameter.yaml`.

2. Check that the `tag` parameter is set to `16.1` for each rule set:

   ```
   parameter_defaults:
     ContainerImagePrepare:
     - push_destination: true
       set:
         …
         tag: '16.1'
         tag_from_label: '{version}-{release}'
   ```

   If you do not want to use a specific tag for the update, such as `16.1` or `16.1.2`, remove the `tag` key-value pair and specify `tag_from_label` only. This uses the installed Red Hat OpenStack Platform version to determine the tag to use as part of the update process.

3. Save this file.
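A quick way to audit the `tag` value before the update is to extract it from the file. The sketch below writes a simulated `containers-prepare-parameter.yaml` to `/tmp` so it is self-contained; point the awk command at your real file instead:

```shell
# Extract the tag value from a container preparation file.
# A simulated containers-prepare-parameter.yaml is written to /tmp so the
# sketch is self-contained; point the awk command at your real file
# (usually ~/containers-prepare-parameter.yaml) instead.
cat > /tmp/containers-prepare-parameter.yaml <<'EOF'
parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      tag: '16.1'
      tag_from_label: '{version}-{release}'
EOF

# Print the quoted value of each "tag:" key (tag_from_label does not match).
tag=$(awk -F"'" '/^[[:space:]]*tag:/ {print $2}' /tmp/containers-prepare-parameter.yaml)
echo "tag is $tag"
```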
2.6. Updating your SSL/TLS configuration
Remove the `NodeTLSData` resource from the `resource_registry` to update your SSL/TLS configuration.
Procedure

1. Log in to the undercloud as the `stack` user and source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Edit your custom overcloud SSL/TLS public endpoint file, which is usually named `~/templates/enable-tls.yaml`.

3. Remove the `NodeTLSData` resource from the `resource_registry`:

   ```
   resource_registry:
     OS::TripleO::NodeTLSData: /usr/share/openstack-tripleo-heat-templates/puppet/extraconfig/tls/tls-cert-inject.yaml
     …
   ```

   The overcloud deployment uses a new service in HAProxy to determine if SSL/TLS is enabled.

   Note: If this is the only resource in the `resource_registry` section of the `enable-tls.yaml` file, remove the complete `resource_registry` section.

4. Save the SSL/TLS public endpoint file.
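If you prefer to script the removal, a `sed` one-liner can delete the `NodeTLSData` line. The sketch below operates on a simulated copy of `enable-tls.yaml` in `/tmp` (the second registry entry is a made-up placeholder); it does not handle the case where `NodeTLSData` is the only entry and the whole `resource_registry` section must be removed, so check the result manually:

```shell
# Delete the NodeTLSData line from an SSL/TLS environment file with sed.
# A simulated enable-tls.yaml is used so the sketch is runnable; the
# OS::TripleO::ExtraConfig entry is a made-up placeholder. In practice,
# edit a backup copy of ~/templates/enable-tls.yaml.
cat > /tmp/enable-tls.yaml <<'EOF'
resource_registry:
  OS::TripleO::NodeTLSData: /usr/share/openstack-tripleo-heat-templates/puppet/extraconfig/tls/tls-cert-inject.yaml
  OS::TripleO::ExtraConfig: /home/stack/templates/extra.yaml
EOF

cp /tmp/enable-tls.yaml /tmp/enable-tls.yaml.bak   # keep a backup copy
sed -i '/OS::TripleO::NodeTLSData/d' /tmp/enable-tls.yaml

grep -q 'NodeTLSData' /tmp/enable-tls.yaml || echo "NodeTLSData removed"
```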
Chapter 3. Updating the Undercloud
This process updates the undercloud and its overcloud images to the latest Red Hat OpenStack Platform 16.1 version.
3.1. Performing a minor update of a containerized undercloud
Director provides commands to update the main packages on the undercloud node. This allows you to perform a minor update within the current version of your OpenStack Platform environment.
Procedure

1. Log in to the director as the `stack` user.

2. Run `dnf` to upgrade the director main packages:

   ```
   $ sudo dnf update -y python3-tripleoclient* openstack-tripleo-common openstack-tripleo-heat-templates tripleo-ansible ansible
   ```

3. The director uses the `openstack undercloud upgrade` command to update the undercloud environment. Run the command:

   ```
   $ openstack undercloud upgrade
   ```

4. Wait until the undercloud upgrade process completes.

5. Reboot the undercloud to update the operating system's kernel and other system packages:

   ```
   $ sudo reboot
   ```

6. Wait until the node boots.
3.2. Updating the overcloud images
You need to replace your current overcloud images with new versions. The new images ensure the director can introspect and provision your nodes using the latest version of OpenStack Platform software.
Prerequisites
- You have updated the undercloud to the latest version.
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Remove any existing images from the `images` directory in the `stack` user's home directory (`/home/stack/images`):

   ```
   $ rm -rf ~/images/*
   ```

3. Extract the archives:

   ```
   $ cd ~/images
   $ for i in /usr/share/rhosp-director-images/overcloud-full-latest-16.1.tar /usr/share/rhosp-director-images/ironic-python-agent-latest-16.1.tar; do tar -xvf $i; done
   $ cd ~
   ```

4. Import the latest images into the director:

   ```
   $ openstack overcloud image upload --update-existing --image-path /home/stack/images/
   ```

5. Configure your nodes to use the new images:

   ```
   $ openstack overcloud node configure $(openstack baremetal node list -c UUID -f value)
   ```

6. Verify the existence of the new images:

   ```
   $ openstack image list
   $ ls -l /var/lib/ironic/httpboot
   ```
When deploying overcloud nodes, ensure that the overcloud image version corresponds to the respective heat template version. For example, only use the OpenStack Platform 16.1 images with the OpenStack Platform 16.1 heat templates.

The new `overcloud-full` image replaces the old `overcloud-full` image. If you made changes to the old image, you must repeat the changes in the new image, especially if you want to deploy new nodes in the future.
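Before you upload, you can confirm that the extraction produced the expected files. The sketch below simulates the extracted files with `touch` so it runs anywhere; on the undercloud, skip the simulation and run only the loop against `~/images`. The file names assume the standard contents of the `rhosp-director-images` archives:

```shell
# Check that the expected image files exist after extracting the archives.
# The files are simulated with touch here so the sketch runs anywhere; on
# the undercloud, run only the loop against ~/images. File names assume
# the standard contents of the rhosp-director-images archives.
dir=/tmp/images
mkdir -p "$dir"
touch "$dir/overcloud-full.qcow2" "$dir/overcloud-full.initrd" \
      "$dir/overcloud-full.vmlinuz" "$dir/ironic-python-agent.initramfs" \
      "$dir/ironic-python-agent.kernel"

ok=1
for f in overcloud-full.qcow2 overcloud-full.initrd overcloud-full.vmlinuz \
         ironic-python-agent.initramfs ironic-python-agent.kernel; do
  [ -e "$dir/$f" ] || { echo "missing: $f" >&2; ok=0; }
done
[ "$ok" -eq 1 ] && echo "all image files present"
```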
3.3. Undercloud Post-Upgrade Notes
- If you use a local set of core templates in your `stack` user's home directory, ensure that you update the templates using the recommended workflow in Using Customized Core Heat Templates in the Advanced Overcloud Customization guide. You must update the local copy before upgrading the overcloud.
3.4. Next Steps
The undercloud upgrade is complete. You can now update the overcloud.
Chapter 4. Updating the Overcloud
This process updates the overcloud.
Prerequisites
- You have updated the undercloud to the latest version.
4.1. Running the overcloud update preparation
The update requires you to run the `openstack overcloud update prepare` command, which performs the following tasks:
- Updates the overcloud plan to OpenStack Platform 16.1
- Prepares the nodes for the update
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Run the update preparation command:

   ```
   $ openstack overcloud update prepare \
       --templates \
       --stack STACK_NAME \
       -r ROLES_DATA_FILE \
       -n NETWORK_DATA_FILE \
       -e ENVIRONMENT_FILE \
       -e ENVIRONMENT_FILE \
       …
   ```

   Include the following options relevant to your environment:

   - If the name of your overcloud stack is different from the default name `overcloud`, include the `--stack` option in the update preparation command and replace `STACK_NAME` with the name of your stack.
   - If you use your own custom roles, include your custom roles (`roles_data`) file (`-r`).
   - If you use custom networks, include your composable network (`network_data`) file (`-n`).
   - Include any custom configuration environment files (`-e`).

3. Wait until the update preparation completes.
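Because the preparation command takes one `-e` option per environment file, it can help to build the command from an ordered list so no file is dropped or reordered. A minimal sketch with placeholder file names; substitute your own environment files in the order they were originally deployed:

```shell
# Assemble the update prepare command from an ordered list of environment
# files. The file names are placeholders for illustration.
env_files="containers-prepare-parameter.yaml rhsm.yml enable-tls.yaml"

cmd="openstack overcloud update prepare --templates --stack overcloud"
for f in $env_files; do
  cmd="$cmd -e $f"
done

# Print the command for review before running it.
echo "$cmd"
```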
4.2. Running the container image preparation
The overcloud requires the latest OpenStack Platform 16.1 container images before you perform the update. This involves executing the `container_image_prepare` external update process. To execute this process, run the `openstack overcloud external-update run` command against tasks tagged with the `container_image_prepare` tag. These tasks:
- Automatically prepare all container image configuration relevant to your environment.
- Pull the relevant container images to your undercloud, unless you have previously disabled this option.
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Run the `openstack overcloud external-update run` command against tasks tagged with the `container_image_prepare` tag:

   ```
   $ openstack overcloud external-update run --tags container_image_prepare
   ```
4.3. Updating all Controller nodes
This process updates all Controller nodes to the latest OpenStack Platform 16.1 version. The process involves running the `openstack overcloud update run` command and including the `--limit Controller` option to restrict operations to the Controller nodes only.

Until BZ#1872404 is resolved, for nodes based on composable roles, you must update the `Database` role first, before you can update the `Controller`, `Messaging`, `Compute`, `Ceph`, and other roles.

If you are not using the default stack name (`overcloud`), set your stack name with the `--stack STACK_NAME` option, replacing `STACK_NAME` with the name of your stack.
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Run the update command:

   ```
   $ openstack overcloud update run --stack STACK_NAME --limit Controller --playbook all
   ```

3. Wait until the Controller node update completes.
4.4. Updating all Compute nodes
This process updates all Compute nodes to the latest OpenStack Platform 16.1 version. The process involves running the `openstack overcloud update run` command and including the `--limit Compute` option to restrict operations to the Compute nodes only.
Parallelization considerations

When you update a large number of Compute nodes, you can improve performance by running the `openstack overcloud update run` command with the `--limit Compute` option in parallel on batches of 20 nodes. For example, if you have 80 Compute nodes in your deployment, you can run the following commands to update the Compute nodes in parallel:

```
$ openstack overcloud update run --limit 'Compute[0:19]' > update-compute-0-19.log 2>&1 &
$ openstack overcloud update run --limit 'Compute[20:39]' > update-compute-20-39.log 2>&1 &
$ openstack overcloud update run --limit 'Compute[40:59]' > update-compute-40-59.log 2>&1 &
$ openstack overcloud update run --limit 'Compute[60:79]' > update-compute-60-79.log 2>&1 &
```

The `'Compute[0:19]'`, `'Compute[20:39]'`, `'Compute[40:59]'`, and `'Compute[60:79]'` partitions divide the node space arbitrarily; you do not have control over which nodes are updated in each batch.

To update specific Compute nodes, list the nodes that you want to update in a batch, separated by commas:

```
$ openstack overcloud update run --limit <Compute0>,<Compute1>,<Compute2>,<Compute3>
```
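The batch boundaries can be generated rather than typed by hand. This sketch computes `Compute[start:end]` expressions for any node count and batch size, and only prints the commands it would run:

```shell
# Generate --limit batch expressions for parallel Compute updates.
# total and batch are adjustable; nothing is executed, the commands are
# only printed for review.
total=80
batch=20

limits=""
start=0
while [ "$start" -lt "$total" ]; do
  end=$((start + batch - 1))
  [ "$end" -ge "$total" ] && end=$((total - 1))
  limits="$limits Compute[$start:$end]"
  start=$((start + batch))
done

for l in $limits; do
  echo "openstack overcloud update run --limit '$l'"
done
```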
If you are not using the default stack name (`overcloud`), set your stack name with the `--stack STACK_NAME` option, replacing `STACK_NAME` with the name of your stack.
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Run the update command:

   ```
   $ openstack overcloud update run --stack STACK_NAME --limit Compute --playbook all
   ```

3. Wait until the Compute node update completes.
4.5. Updating all HCI Compute nodes
This process updates the Hyperconverged Infrastructure (HCI) Compute nodes. The process involves:

- Running the `openstack overcloud update run` command and including the `--limit ComputeHCI` option to restrict operations to the HCI nodes only.
- Running the `openstack overcloud external-update run --tags ceph` command to perform an update to a containerized Red Hat Ceph Storage 4 cluster.

If you are not using the default stack name (`overcloud`), set your stack name with the `--stack <stack_name>` option, replacing `<stack_name>` with the name of your stack.
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Run the update command:

   ```
   $ openstack overcloud update run --stack <stack_name> --limit ComputeHCI --playbook all
   ```

3. Wait until the node update completes.

4. Run the Ceph Storage update command. For example:

   ```
   $ openstack overcloud external-update run --stack <stack_name> --tags ceph
   ```

5. Wait until the Compute HCI node update completes.
4.6. Updating all Ceph Storage nodes
This process updates the Ceph Storage nodes. The process involves:

- Running the `openstack overcloud update run` command and including the `--limit CephStorage` option to restrict operations to the Ceph Storage nodes only.
- Running the `openstack overcloud external-update run` command to run `ceph-ansible` as an external process and update the Red Hat Ceph Storage 4 containers.

If you are not using the default stack name (`overcloud`), set your stack name with the `--stack STACK_NAME` option, replacing `STACK_NAME` with the name of your stack.
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Run the update command:

   ```
   $ openstack overcloud update run --stack STACK_NAME --limit CephStorage --playbook all
   ```

3. Wait until the node update completes.

4. Run the Ceph Storage container update command:

   ```
   $ openstack overcloud external-update run --tags ceph
   ```

5. Wait until the Ceph Storage container update completes.
4.7. Performing online database updates
Some overcloud components require an online upgrade (or migration) of their database tables. This involves executing the `online_upgrade` external update process. To execute this process, run the `openstack overcloud external-update run` command against tasks tagged with the `online_upgrade` tag. This performs online database updates to the following components:
- OpenStack Block Storage (cinder)
- OpenStack Compute (nova)
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Run the `openstack overcloud external-update run` command against tasks tagged with the `online_upgrade` tag:

   ```
   $ openstack overcloud external-update run --tags online_upgrade
   ```
4.8. Finalizing the update
The update requires a final step to update the overcloud stack. This ensures that the stack's resource structure aligns with a regular deployment of OpenStack Platform 16.1 and allows you to perform standard `openstack overcloud deploy` functions in the future.
Procedure

1. Source the `stackrc` file:

   ```
   $ source ~/stackrc
   ```

2. Run the update finalization command:

   ```
   $ openstack overcloud update converge \
       --templates \
       --stack STACK_NAME \
       -r ROLES_DATA_FILE \
       -n NETWORK_DATA_FILE \
       -e ENVIRONMENT_FILE \
       -e ENVIRONMENT_FILE \
       …
   ```

   Include the following options relevant to your environment:

   - If the name of your overcloud stack is different from the default name `overcloud`, include the `--stack` option in the update finalization command and replace `STACK_NAME` with the name of your stack.
   - If you use your own custom roles, include your custom roles (`roles_data`) file (`-r`).
   - If you use custom networks, include your composable network (`network_data`) file (`-n`).
   - Include any custom configuration environment files (`-e`).

3. Wait until the update finalization completes.
Chapter 5. Rebooting the overcloud
After a minor Red Hat OpenStack Platform version update, reboot your overcloud. The reboot refreshes the nodes with any associated kernel, system-level, and container component updates. These updates can provide performance and security benefits.
Plan downtime to perform the following reboot procedures.
5.1. Rebooting Controller and composable nodes
Complete the following steps to reboot Controller nodes and standalone nodes based on composable roles, excluding Compute nodes and Ceph Storage nodes.
Procedure

1. Log in to the node that you want to reboot.

2. Optional: If the node uses Pacemaker resources, stop the cluster:

   ```
   [heat-admin@overcloud-controller-0 ~]$ sudo pcs cluster stop
   ```

3. Reboot the node:

   ```
   [heat-admin@overcloud-controller-0 ~]$ sudo reboot
   ```

4. Wait until the node boots.

5. Check the services. For example:

   - If the node uses Pacemaker services, check that the node has rejoined the cluster:

     ```
     [heat-admin@overcloud-controller-0 ~]$ sudo pcs status
     ```

   - If the node uses Systemd services, check that all services are enabled:

     ```
     [heat-admin@overcloud-controller-0 ~]$ sudo systemctl status
     ```

   - If the node uses containerized services, check that all containers on the node are active:

     ```
     [heat-admin@overcloud-controller-0 ~]$ sudo podman ps
     ```
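Rather than re-running `pcs status` by hand, you can poll until the node reports online, with a timeout. The cluster check below is simulated (it succeeds on the third poll) so the sketch is runnable; the comment shows the real check you would substitute:

```shell
# Poll until a rebooted node rejoins the Pacemaker cluster, with a timeout.
# check_online is simulated (succeeds on the third poll); on a real node
# it would be something like:
#   check_online() { sudo pcs status | grep -q "Online:.*$(hostname -s)"; }
attempts=0
check_online() {
  attempts=$((attempts + 1))
  [ "$attempts" -ge 3 ]
}

tries=0
max_tries=30
until check_online; do
  tries=$((tries + 1))
  if [ "$tries" -ge "$max_tries" ]; then
    echo "node did not rejoin the cluster in time" >&2
    break
  fi
  sleep 1   # use a longer interval, such as 10 seconds, on a real node
done
[ "$tries" -lt "$max_tries" ] && echo "node rejoined after $attempts checks"
```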
5.2. Rebooting a Ceph Storage (OSD) cluster
Complete the following steps to reboot a cluster of Ceph Storage (OSD) nodes.
Procedure

1. Log in to a Ceph MON or Controller node and disable Ceph Storage cluster rebalancing temporarily:

   ```
   $ sudo podman exec -it ceph-mon-controller-0 ceph osd set noout
   $ sudo podman exec -it ceph-mon-controller-0 ceph osd set norebalance
   ```

2. Select the first Ceph Storage node that you want to reboot and log in to the node.

3. Reboot the node:

   ```
   $ sudo reboot
   ```

4. Wait until the node boots.

5. Log in to the node and check the cluster status:

   ```
   $ sudo podman exec -it ceph-mon-controller-0 ceph status
   ```

   Check that the `pgmap` reports all `pgs` as normal (`active+clean`).

6. Log out of the node, reboot the next node, and check its status. Repeat this process until you have rebooted all Ceph Storage nodes.

7. When complete, log in to a Ceph MON or Controller node and re-enable cluster rebalancing:

   ```
   $ sudo podman exec -it ceph-mon-controller-0 ceph osd unset noout
   $ sudo podman exec -it ceph-mon-controller-0 ceph osd unset norebalance
   ```

8. Perform a final status check to verify that the cluster reports `HEALTH_OK`:

   ```
   $ sudo podman exec -it ceph-mon-controller-0 ceph status
   ```
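The two checks in this procedure, overall health and `active+clean` placement groups, can be scripted. The status text below is simulated so the sketch is runnable; on a Controller node, capture the real `ceph status` output as shown in the comment:

```shell
# Script the two post-reboot checks: overall health and active+clean pgs.
# The status text is simulated; on a Controller node capture it with:
#   status=$(sudo podman exec -it ceph-mon-controller-0 ceph status)
status="  health: HEALTH_OK
  pgs:     192 active+clean"

health=$(printf '%s\n' "$status" | awk '/health:/ {print $2}')
bad_pgs=$(printf '%s\n' "$status" | awk '/pgs:/ && $0 !~ /active\+clean/' | wc -l)

echo "health=$health bad_pg_lines=$bad_pgs"
```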
5.3. Rebooting Compute nodes
Complete the following steps to reboot Compute nodes. To ensure minimal downtime of instances in your Red Hat OpenStack Platform environment, this procedure also includes instructions about migrating instances from the Compute node that you want to reboot. This involves the following workflow:
- Decide whether to migrate instances to another Compute node before rebooting the node.
- Select and disable the Compute node you want to reboot so that it does not provision new instances.
- Migrate the instances to another Compute node.
- Reboot the empty Compute node.
- Enable the empty Compute node.
Prerequisites
Before you reboot the Compute node, you must decide whether to migrate instances to another Compute node while the node is rebooting.

If for some reason you cannot or do not want to migrate the instances, you can set the following core template parameters to control the state of the instances after the Compute node reboots:

- `NovaResumeGuestsStateOnHostBoot`: Determines whether to return instances to the same state on the Compute node after reboot. When set to `False`, the instances remain down and you must start them manually. The default value is `False`.
- `NovaResumeGuestsShutdownTimeout`: Number of seconds to wait for an instance to shut down before rebooting. It is not recommended to set this value to `0`. The default value is `300`.
For more information about overcloud parameters and their usage, see Overcloud Parameters.
Procedure

1. Log in to the undercloud as the `stack` user.

2. List all Compute nodes and their UUIDs:

   ```
   $ source ~/stackrc
   (undercloud) $ openstack server list --name compute
   ```

   Identify the UUID of the Compute node that you want to reboot.

3. From the undercloud, select a Compute node and disable it:

   ```
   $ source ~/overcloudrc
   (overcloud) $ openstack compute service list
   (overcloud) $ openstack compute service set [hostname] nova-compute --disable
   ```

4. List all instances on the Compute node:

   ```
   (overcloud) $ openstack server list --host [hostname] --all-projects
   ```

5. If you decide not to migrate the instances, skip to step 10.

6. If you decide to migrate the instances to another Compute node, use one of the following commands:

   - Migrate the instance to a different host:

     ```
     (overcloud) $ openstack server migrate [instance-id] --live [target-host] --wait
     ```

   - Let `nova-scheduler` automatically select the target host:

     ```
     (overcloud) $ nova live-migration [instance-id]
     ```

   - Live migrate all instances at once:

     ```
     $ nova host-evacuate-live [hostname]
     ```

   Note: The `nova` command might cause some deprecation warnings, which are safe to ignore.

7. Wait until migration completes.

8. Confirm that the migration was successful:

   ```
   (overcloud) $ openstack server list --host [hostname] --all-projects
   ```

9. Continue to migrate instances until none remain on the chosen Compute node.

10. Log in to the Compute node and reboot it:

    ```
    [heat-admin@overcloud-compute-0 ~]$ sudo reboot
    ```

11. Wait until the node boots.

12. Re-enable the Compute node:

    ```
    $ source ~/overcloudrc
    (overcloud) $ openstack compute service set [hostname] nova-compute --enable
    ```

13. Check that the Compute node is enabled:

    ```
    (overcloud) $ openstack compute service list
    ```
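The migrate-and-confirm loop can be automated: keep polling the instance count on the host until it reaches zero before rebooting. The count here is simulated to shrink on each poll so the sketch is runnable; the comment shows the real command you would use to refresh it:

```shell
# Poll a Compute node until no instances remain before rebooting it.
# The count is simulated to shrink by one per poll; in practice you would
# refresh it with:
#   remaining=$(openstack server list --host [hostname] --all-projects -f value -c ID | wc -l)
remaining=3
polls=0
while [ "$remaining" -gt 0 ]; do
  polls=$((polls + 1))
  echo "poll $polls: $remaining instance(s) still on host"
  remaining=$((remaining - 1))   # simulated migration of one instance
  sleep 1                        # use a longer interval in practice
done
echo "compute node is empty, safe to reboot"
```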