Chapter 18. Upgrading a standard overcloud
This scenario contains an example upgrade process for a standard overcloud environment, which includes the following node types:
- Three Controller nodes
- Three Ceph Storage nodes
- Multiple Compute nodes
18.1. Running the overcloud upgrade preparation
The upgrade requires running the openstack overcloud upgrade prepare command, which performs the following tasks:
- Updates the overcloud plan to OpenStack Platform 16.2
- Prepares the nodes for the upgrade
If you are not using the default stack name (overcloud), set your stack name with the --stack STACK NAME option, replacing STACK NAME with the name of your stack.
Procedure
Source the stackrc file:
$ source ~/stackrc
Run the upgrade preparation command:
$ openstack overcloud upgrade prepare \
    --stack STACK NAME \
    --templates \
    -e ENVIRONMENT FILE
    …
    -e /home/stack/templates/upgrades-environment.yaml \
    -e /home/stack/templates/rhsm.yaml \
    -e /home/stack/containers-prepare-parameter.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs.yaml \
    …
Include the following options relevant to your environment:
- The environment file (upgrades-environment.yaml) with the upgrade-specific parameters (-e).
- The environment file (rhsm.yaml) with the registration and subscription parameters (-e).
- The environment file (containers-prepare-parameter.yaml) with your new container image locations (-e). In most cases, this is the same environment file that the undercloud uses.
- The environment file (neutron-ovs.yaml) to maintain OVS compatibility.
- Any custom configuration environment files (-e) relevant to your deployment.
- If applicable, your custom roles (roles_data) file using --roles-file.
- If applicable, your composable network (network_data) file using --networks-file.
- If you use a custom stack name, pass the name with the --stack option.
- Wait until the upgrade preparation completes.
Download the container images:
$ openstack overcloud external-upgrade run --stack STACK NAME --tags container_image_prepare
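The way the preparation command is assembled from a stack name and an ordered list of environment files can be sketched as follows. This is a minimal dry-run sketch, not part of the official procedure: build_prepare_cmd is a hypothetical helper, the STACK_NAME variable is an assumption, and the file paths are the ones from the example above. It only prints the command so that you can review it before running it yourself; add your own custom environment files, --roles-file, and --networks-file as your deployment requires.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the upgrade prepare command, does not run it.
# STACK_NAME is an assumed variable; it falls back to the default stack name.
STACK_NAME="${STACK_NAME:-overcloud}"

build_prepare_cmd() {
    # $1 = stack name; remaining arguments = environment files, in order
    stack="$1"
    shift
    cmd="openstack overcloud upgrade prepare --stack $stack --templates"
    for env_file in "$@"; do
        # Each environment file is passed with its own -e option.
        cmd="$cmd -e $env_file"
    done
    printf '%s\n' "$cmd"
}

# Environment files taken from the example command in this section.
build_prepare_cmd "$STACK_NAME" \
    /home/stack/templates/upgrades-environment.yaml \
    /home/stack/templates/rhsm.yaml \
    /home/stack/containers-prepare-parameter.yaml \
    /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs.yaml
```

Printing the command first makes it easier to confirm that every required environment file is present and in the intended order, because later -e files override earlier ones.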
18.2. Upgrading Controller nodes
To upgrade all the Controller nodes to Red Hat OpenStack Platform (RHOSP) 16.2, you must upgrade each Controller node starting with the bootstrap Controller node.
If your deployment uses a Red Hat Ceph Storage cluster that was deployed using director, follow the procedure in Upgrading Controller nodes with director-deployed Ceph Storage.
During the bootstrap Controller node upgrade process, a new Pacemaker cluster is created and new RHOSP 16.2 containers are started on the node, while the remaining Controller nodes are still running on RHOSP 13.
After upgrading the bootstrap node, you must upgrade each additional node with Pacemaker services and ensure that each node joins the new Pacemaker cluster started with the bootstrap node. For more information, see Overcloud node upgrade workflow.
Procedure
Source the stackrc file:
$ source ~/stackrc
On the undercloud node, identify the bootstrap Controller node:
$ tripleo-ansible-inventory --list [--stack <stack_name>] | jq .overcloud_Controller.hosts[0]
- Replace <stack_name> with the name of your stack.
Upgrade the bootstrap Controller node:
Perform a Leapp upgrade of the operating system on the bootstrap Controller node:
$ openstack overcloud upgrade run [--stack <stack>] --tags system_upgrade --limit <bootstrap_controller_node>
- Replace <bootstrap_controller_node> with the host name of the bootstrap Controller node in your environment, for example, overcloud-controller-0.
- If you are not using the default stack name, overcloud, include the optional --stack argument and replace <stack> with the name of your overcloud stack.
The bootstrap Controller node is rebooted as part of the Leapp upgrade.
Copy the latest version of the database from an existing node to the bootstrap node:
$ openstack overcloud external-upgrade run [--stack <stack>] --tags system_upgrade_transfer_data
Important: This command causes an outage on the control plane. You cannot perform any standard operations on the overcloud until the RHOSP upgrade is complete and the control plane is active again.
Launch temporary 16.2 containers on Compute nodes to help facilitate workload migration when you upgrade Compute nodes at a later step:
$ openstack overcloud upgrade run --stack <stack> --playbook upgrade_steps_playbook.yaml --tags nova_hybrid_state --limit all
Upgrade the overcloud with no tags:
$ openstack overcloud upgrade run --stack <stack> --limit <bootstrap_controller_node>
Verify that after the upgrade, the new Pacemaker cluster is started and that the control plane services such as galera, rabbit, haproxy, and redis are running:
$ sudo pcs status
Upgrade the next Controller node:
Verify that the old cluster is no longer running:
$ sudo pcs status
An error similar to the following is displayed when the cluster is not running:
Error: cluster is not currently running on this node
Perform a Leapp upgrade of the operating system on the Controller node:
$ openstack overcloud upgrade run --stack <stack> --tags system_upgrade --limit <controller_node>
Replace <controller_node> with the host name of the Controller node to upgrade, for example, overcloud-controller-1.
The Controller node is rebooted as a part of the Leapp upgrade.
Upgrade the Controller node, adding it to the previously upgraded nodes in the new Pacemaker cluster:
$ openstack overcloud upgrade run --stack <stack> --limit <bootstrap_controller_node,controller_node_1,controller_node_n>
- Replace <bootstrap_controller_node,controller_node_1,controller_node_n> with a comma-separated list of the Controller nodes that you have upgraded so far, and the additional Controller node that you want to add to the Pacemaker cluster, for example, overcloud-controller-0,overcloud-controller-1,overcloud-controller-2. Do not put spaces in the list.
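The growing --limit list used by the final upgrade run for each Controller node can be sketched as follows. This is an illustrative dry run under stated assumptions: print_controller_upgrades is a hypothetical helper, the node names follow the default naming used in the examples, and the function only prints commands. It does not replace the Leapp and data transfer steps that precede each run.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints one "upgrade run" command per Controller
# node, adding each node to the --limit list of previously upgraded nodes.
print_controller_upgrades() {
    # $1 = stack name; remaining arguments = Controller nodes, bootstrap node first
    stack="$1"
    shift
    limit=""
    for node in "$@"; do
        # Append the node with a comma separator and no spaces.
        limit="${limit:+$limit,}$node"
        printf 'openstack overcloud upgrade run --stack %s --limit %s\n' "$stack" "$limit"
    done
}

print_controller_upgrades overcloud \
    overcloud-controller-0 overcloud-controller-1 overcloud-controller-2
```

The first printed command limits the run to the bootstrap node only, and each later command includes every node upgraded so far plus the node being added to the new Pacemaker cluster.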
18.3. Upgrading Controller nodes with director-deployed Ceph Storage
If your deployment uses a Red Hat Ceph Storage cluster that was deployed using director, you must complete this procedure.
To upgrade all the Controller nodes to OpenStack Platform 16.2, you must upgrade each Controller node starting with the bootstrap Controller node.
During the bootstrap Controller node upgrade process, a new Pacemaker cluster is created and new Red Hat OpenStack Platform 16.2 containers are started on the node, while the remaining Controller nodes are still running on Red Hat OpenStack Platform 13.
After upgrading the bootstrap node, you must upgrade each additional node with Pacemaker services and ensure that each node joins the new Pacemaker cluster started with the bootstrap node. For more information, see Overcloud node upgrade workflow.
In this example, the Controller nodes are named using the default overcloud-controller-NODEID convention. This includes the following three Controller nodes:
- overcloud-controller-0
- overcloud-controller-1
- overcloud-controller-2
Substitute these values for your own node names where applicable.
Procedure
Source the stackrc file:
$ source ~/stackrc
Identify the bootstrap Controller node by running the following command on the undercloud node:
$ tripleo-ansible-inventory --list [--stack <stack_name>] | jq .overcloud_Controller.hosts[0]
- Optional: Replace <stack_name> with the name of the stack. If not specified, the default is overcloud.
Upgrade the bootstrap Controller node:
Run the external upgrade command with the ceph_systemd tag:
$ openstack overcloud external-upgrade run [--stack <stack_name>] --tags ceph_systemd -e ceph_ansible_limit=overcloud-controller-0
Replace <stack_name> with the name of your stack.
This command performs the following functions:
- Changes the systemd units that control the Ceph Storage containers to use Podman management.
- Limits actions to the selected Controller node using the ceph_ansible_limit variable.
This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
Run the upgrade command with the system_upgrade tag:
$ openstack overcloud upgrade run [--stack <stack_name>] --tags system_upgrade --limit overcloud-controller-0
This command performs the following actions:
- Performs a Leapp upgrade of the operating system.
- Performs a reboot as a part of the Leapp upgrade.
Important: The next command causes an outage on the control plane. You cannot perform any standard operations on the overcloud during the next few steps.
Run the external upgrade command with the system_upgrade_transfer_data tag:
$ openstack overcloud external-upgrade run [--stack <stack_name>] --tags system_upgrade_transfer_data
This command copies the latest version of the database from an existing node to the bootstrap node.
Run the upgrade command with the nova_hybrid_state tag and run only the upgrade_steps_playbook.yaml playbook:
$ openstack overcloud upgrade run [--stack <stack_name>] --playbook upgrade_steps_playbook.yaml --tags nova_hybrid_state --limit all
This command launches temporary 16.2 containers on Compute nodes to help facilitate workload migration when you upgrade Compute nodes at a later step.
Run the upgrade command with no tags:
$ openstack overcloud upgrade run [--stack <stack_name>] --limit overcloud-controller-0
This command performs the Red Hat OpenStack Platform upgrade.
Important: The control plane becomes active when this command finishes. You can perform standard operations on the overcloud again.
Verify that after the upgrade, the new Pacemaker cluster is started and that the control plane services such as galera, rabbit, haproxy, and redis are running:
$ sudo pcs status
Upgrade the next Controller node:
Verify that the old cluster is no longer running:
$ sudo pcs status
An error similar to the following is displayed when the cluster is not running:
Error: cluster is not currently running on this node
Run the external upgrade command with the ceph_systemd tag:
$ openstack overcloud external-upgrade run [--stack <stack_name>] --tags ceph_systemd -e ceph_ansible_limit=overcloud-controller-1
This command performs the following functions:
- Changes the systemd units that control the Ceph Storage containers to use Podman management.
- Limits actions to the selected Controller node using the ceph_ansible_limit variable.
This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
Run the upgrade command with the system_upgrade tag on the next Controller node:
$ openstack overcloud upgrade run [--stack <stack_name>] --tags system_upgrade --limit overcloud-controller-1
This command performs the following actions:
- Performs a Leapp upgrade of the operating system.
- Performs a reboot as a part of the Leapp upgrade.
Run the upgrade command with no tags:
$ openstack overcloud upgrade run [--stack <stack_name>] --limit overcloud-controller-0,overcloud-controller-1
This command performs the Red Hat OpenStack Platform upgrade. In addition to this node, include the previously upgraded bootstrap node in the --limit option.
Upgrade the final Controller node:
Verify that the old cluster is no longer running:
$ sudo pcs status
An error similar to the following is displayed when the cluster is not running:
Error: cluster is not currently running on this node
Run the external upgrade command with the ceph_systemd tag:
$ openstack overcloud external-upgrade run [--stack <stack_name>] --tags ceph_systemd -e ceph_ansible_limit=overcloud-controller-2
This command performs the following functions:
- Changes the systemd units that control the Ceph Storage containers to use Podman management.
- Limits actions to the selected Controller node using the ceph_ansible_limit variable.
This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
Run the upgrade command with the system_upgrade tag:
$ openstack overcloud upgrade run [--stack <stack_name>] --tags system_upgrade --limit overcloud-controller-2
This command performs the following actions:
- Performs a Leapp upgrade of the operating system.
- Performs a reboot as a part of the Leapp upgrade.
Run the upgrade command with no tags:
$ openstack overcloud upgrade run [--stack <stack_name>] --limit overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
This command performs the Red Hat OpenStack Platform upgrade. Include all Controller nodes in the --limit option.
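The "old cluster is no longer running" check that precedes each additional Controller node upgrade can be sketched as a small text match. This is a minimal sketch under stated assumptions: old_cluster_stopped is a hypothetical helper, and the sample text is the error message quoted in the procedure. On a real Controller node you would pass the combined output of sudo pcs status instead, and you should still read the full output yourself rather than rely on this match alone.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the given `pcs status` output reports
# that the cluster is not running on this node, and 1 otherwise.
old_cluster_stopped() {
    case "$1" in
        *"cluster is not currently running on this node"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample text copied from the error message shown in the procedure.
# On a node, an assumed equivalent would be: status=$(sudo pcs status 2>&1)
status="Error: cluster is not currently running on this node"

if old_cluster_stopped "$status"; then
    echo "old cluster is not running on this node"
else
    echo "old cluster still appears to be running on this node"
fi
```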
18.4. Upgrading the operating system for Ceph Storage nodes
If your deployment uses a Red Hat Ceph Storage cluster that was deployed using director, you must upgrade the operating system for each Ceph Storage node.
If you are not using the default stack name (overcloud), set your stack name with the --stack STACK NAME option, replacing STACK NAME with the name of your stack.
Procedure
Source the stackrc file:
$ source ~/stackrc
Select a Ceph Storage node and upgrade the operating system:
Run the external upgrade command with the ceph_systemd tag:
$ openstack overcloud external-upgrade run --stack STACK NAME --tags ceph_systemd -e ceph_ansible_limit=overcloud-cephstorage-0
This command performs the following functions:
- Changes the systemd units that control the Ceph Storage containers to use Podman management.
- Limits actions to the selected node using the ceph_ansible_limit variable.
This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
Run the upgrade command with the system_upgrade tag:
$ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-cephstorage-0
This command performs the following actions:
- Performs a Leapp upgrade of the operating system.
- Performs a reboot as a part of the Leapp upgrade.
Run the upgrade command with no tags:
$ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-cephstorage-0
This command runs the config-download playbooks and configures the composable services on the Ceph Storage node. This step does not upgrade the Ceph Storage nodes to Red Hat Ceph Storage 4. The Red Hat Ceph Storage 4 upgrade occurs in a later procedure.
Select the next Ceph Storage node and upgrade the operating system:
Run the external upgrade command with the ceph_systemd tag:
$ openstack overcloud external-upgrade run --stack STACK NAME --tags ceph_systemd -e ceph_ansible_limit=overcloud-cephstorage-1
This command performs the following functions:
- Changes the systemd units that control the Ceph Storage containers to use Podman management.
- Limits actions to the selected node using the ceph_ansible_limit variable.
This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
Run the upgrade command with the system_upgrade tag:
$ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-cephstorage-1
This command performs the following actions:
- Performs a Leapp upgrade of the operating system.
- Performs a reboot as a part of the Leapp upgrade.
Run the upgrade command with no tags:
$ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-cephstorage-1
This command runs the config-download playbooks and configures the composable services on the Ceph Storage node. This step does not upgrade the Ceph Storage nodes to Red Hat Ceph Storage 4. The Red Hat Ceph Storage 4 upgrade occurs in a later procedure.
Select the final Ceph Storage node and upgrade the operating system:
Run the external upgrade command with the ceph_systemd tag:
$ openstack overcloud external-upgrade run --stack STACK NAME --tags ceph_systemd -e ceph_ansible_limit=overcloud-cephstorage-2
This command performs the following functions:
- Changes the systemd units that control the Ceph Storage containers to use Podman management.
- Limits actions to the selected node using the ceph_ansible_limit variable.
This step is a preliminary measure to prepare the Ceph Storage services for the leapp upgrade.
Run the upgrade command with the system_upgrade tag:
$ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-cephstorage-2
This command performs the following actions:
- Performs a Leapp upgrade of the operating system.
- Performs a reboot as a part of the Leapp upgrade.
Run the upgrade command with no tags:
$ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-cephstorage-2
This command runs the config-download playbooks and configures the composable services on the Ceph Storage node. This step does not upgrade the Ceph Storage nodes to Red Hat Ceph Storage 4. The Red Hat Ceph Storage 4 upgrade occurs in a later procedure.
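The procedure above repeats the same three commands for each Ceph Storage node, one node at a time. That pattern can be sketched as follows. This is an illustrative dry run: print_ceph_node_upgrade is a hypothetical helper, the node names are the ones used in the examples, and the function only prints the commands in order. In practice you would run each command, confirm that it completes and that the node has rebooted cleanly, and only then move to the next node.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the three per-node commands from this
# section, in order, for a single Ceph Storage node.
print_ceph_node_upgrade() {
    # $1 = stack name, $2 = Ceph Storage node name
    stack="$1"
    node="$2"
    # 1. Switch the Ceph systemd units to Podman management on this node only.
    printf 'openstack overcloud external-upgrade run --stack %s --tags ceph_systemd -e ceph_ansible_limit=%s\n' "$stack" "$node"
    # 2. Leapp upgrade of the operating system, including a reboot.
    printf 'openstack overcloud upgrade run --stack %s --tags system_upgrade --limit %s\n' "$stack" "$node"
    # 3. Run the config-download playbooks for the node with no tags.
    printf 'openstack overcloud upgrade run --stack %s --limit %s\n' "$stack" "$node"
}

# Node names from the examples in this section; one node at a time.
for node in overcloud-cephstorage-0 overcloud-cephstorage-1 overcloud-cephstorage-2; do
    print_ceph_node_upgrade overcloud "$node"
done
```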
18.5. Upgrading Compute nodes
Upgrade all the Compute nodes to OpenStack Platform 16.2.
If you are not using the default stack name (overcloud), set your stack name with the --stack STACK NAME option, replacing STACK NAME with the name of your stack.
Procedure
Source the stackrc file:
$ source ~/stackrc
- Migrate your instances. For more information on migration strategies, see Migrating virtual machines between Compute nodes.
Run the upgrade command with the system_upgrade tag:
$ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-compute-0
This command performs the following actions:
- Performs a Leapp upgrade of the operating system.
- Performs a reboot as a part of the Leapp upgrade.
Run the upgrade command with no tags:
$ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-compute-0
This command performs the Red Hat OpenStack Platform upgrade.
To upgrade multiple Compute nodes in parallel, set the --limit option to a comma-separated list of nodes that you want to upgrade. First perform the system_upgrade task:
$ openstack overcloud upgrade run --stack STACK NAME --tags system_upgrade --limit overcloud-compute-0,overcloud-compute-1,overcloud-compute-2
Then perform the standard OpenStack service upgrade:
$ openstack overcloud upgrade run --stack STACK NAME --limit overcloud-compute-0,overcloud-compute-1,overcloud-compute-2
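Building the comma-separated --limit value for a batch of Compute nodes can be sketched as follows. This is a minimal sketch under stated assumptions: join_nodes is a hypothetical helper, the node names are the ones from the example, and the commands are printed rather than run. Migrate instances off the selected nodes before running the real commands.

```shell
#!/bin/sh
# Hypothetical helper: joins its arguments with commas and no spaces,
# which is the form that the --limit option expects in the examples above.
join_nodes() {
    old_ifs=$IFS
    IFS=,
    # "$*" joins the arguments with the first character of IFS.
    joined="$*"
    IFS=$old_ifs
    printf '%s\n' "$joined"
}

limit=$(join_nodes overcloud-compute-0 overcloud-compute-1 overcloud-compute-2)

# Dry run: first the Leapp step, then the service upgrade, for the same batch.
printf 'openstack overcloud upgrade run --stack %s --tags system_upgrade --limit %s\n' overcloud "$limit"
printf 'openstack overcloud upgrade run --stack %s --limit %s\n' overcloud "$limit"
```

Using the same list for both commands keeps the batch consistent, so that every node that received the operating system upgrade also receives the service upgrade.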
18.6. Synchronizing the overcloud stack
The upgrade requires an update to the overcloud stack to ensure that the stack resource structure and parameters align with a fresh deployment of OpenStack Platform 16.2.
If you are not using the default stack name (overcloud), set your stack name with the --stack STACK NAME option, replacing STACK NAME with the name of your stack.
Procedure
Source the stackrc file:
$ source ~/stackrc
Edit the containers-prepare-parameter.yaml file and remove the following parameters and their values:
- ceph3_namespace
- ceph3_tag
- ceph3_image
- name_prefix_stein
- name_suffix_stein
- namespace_stein
- tag_stein
To re-enable fencing in your overcloud, set the EnableFencing parameter to true in the fencing.yaml environment file.
Run the upgrade finalization command:
$ openstack overcloud upgrade converge \
    --stack STACK NAME \
    --templates \
    -e ENVIRONMENT FILE
    …
    -e /home/stack/templates/upgrades-environment.yaml \
    -e /home/stack/templates/rhsm.yaml \
    -e /home/stack/containers-prepare-parameter.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs.yaml \
    …
Include the following options relevant to your environment:
- The environment file (upgrades-environment.yaml) with the upgrade-specific parameters (-e).
- The environment file (fencing.yaml) with the EnableFencing parameter set to true.
- The environment file (rhsm.yaml) with the registration and subscription parameters (-e).
- The environment file (containers-prepare-parameter.yaml) with your new container image locations (-e). In most cases, this is the same environment file that the undercloud uses.
- The environment file (neutron-ovs.yaml) to maintain OVS compatibility.
- Any custom configuration environment files (-e) relevant to your deployment.
- If applicable, your custom roles (roles_data) file using --roles-file.
- If applicable, your composable network (network_data) file using --networks-file.
- If you use a custom stack name, pass the name with the --stack option.
- Wait until the stack synchronization completes.
You do not need the upgrades-environment.yaml file for any further deployment operations.
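The parameter removal step earlier in this procedure can be sketched as a line filter. This is a minimal sketch, not an official tool: strip_old_params is a hypothetical helper, and it assumes that each parameter sits on its own line as a simple key: value pair. It reads YAML on standard input and writes the filtered result to standard output, so the original file is untouched. The sample input uses made-up placeholder values only.

```shell
#!/bin/sh
# Hypothetical helper: drops the lines for the parameters listed in this
# procedure. Assumes one "key: value" pair per line; it does not parse YAML.
strip_old_params() {
    sed -E '/^[[:space:]]*(ceph3_namespace|ceph3_tag|ceph3_image|name_prefix_stein|name_suffix_stein|namespace_stein|tag_stein):/d'
}

# Made-up sample input; the values are placeholders, not real image settings.
printf '%s\n' \
    '      ceph3_tag: PLACEHOLDER' \
    '      ceph_tag: PLACEHOLDER' \
    '      tag_stein: PLACEHOLDER' \
    '      tag: PLACEHOLDER' | strip_old_params
```

With the sample input, only the ceph_tag and tag lines remain, because the pattern matches whole key names followed by a colon. To apply the filter to the real file, one cautious approach is to write the output to a new file, compare it with the original, and replace the original only after reviewing the difference.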