Chapter 3. Director-Based Environments: Performing Upgrades to Major Versions

Warning

Before performing an upgrade to the latest major version, ensure the undercloud and overcloud are updated to the latest minor versions. This includes both OpenStack Platform services and the base operating system. For the process on performing a minor version update, see "Director-Based Environments: Performing Updates to Minor Versions" in the Red Hat OpenStack Platform 9 Upgrading Red Hat OpenStack Platform guide. Performing a major version upgrade without first performing a minor version update can cause failures in the upgrade process.

Warning

When High Availability for Compute instances (Instance HA, as described in High Availability for Compute Instances) is enabled, upgrades and scale-up operations are not possible. Any attempt to perform them fails.

If you have Instance HA enabled, disable it before performing an upgrade or scale-up. To do so, perform a rollback as described in Rollback.

This chapter explores how to upgrade your environment, including both the Undercloud and the Overcloud. The upgrade process moves your environment to the next major version. In this case, it is an upgrade from Red Hat OpenStack Platform 9 to Red Hat OpenStack Platform 10.

The procedure involves the following workflow:

  1. Upgrade the Red Hat OpenStack Platform director packages
  2. Upgrade the Overcloud images on the Red Hat OpenStack Platform director
  3. Upgrade the Overcloud stack and its packages using the Red Hat OpenStack Platform director

3.1. Upgrade Support Statement

A successful upgrade process requires some preparation to accommodate changes from one major version to the next. Read the following support statement to help with Red Hat OpenStack Platform upgrade planning.

Upgrades in Red Hat OpenStack Platform director require full testing with specific configurations before being performed on any live production environment. Red Hat has tested most use cases and combinations offered as standard options through the director. However, due to the number of possible combinations, this is never a fully exhaustive list. In addition, if the configuration has been modified from the standard deployment, either manually or through post configuration hooks, testing upgrade features in a non-production environment is critical. Therefore, we advise you to:

  • Perform a backup of your Undercloud node before starting any steps in the upgrade procedure. See the Back Up and Restore the Director Undercloud guide for backup procedures.
  • Run the upgrade procedure with your customizations in a test environment before running the procedure in your production environment.
  • If you feel uncomfortable about performing this upgrade, contact Red Hat’s support team and request guidance and assistance on the upgrade process before proceeding.

The upgrade process outlined in this section only accommodates customizations through the director. If you customized an Overcloud feature outside of director then:

  • Disable the feature
  • Upgrade the Overcloud
  • Re-enable the feature after the upgrade completes

This means the customized feature is unavailable until the completion of the entire upgrade.

Red Hat OpenStack Platform director 10 can manage previous Overcloud versions of Red Hat OpenStack Platform. See the support matrix below for information.

Table 3.1. Support Matrix for Red Hat OpenStack Platform director 10

Version                        | Overcloud Updating                  | Overcloud Deploying                 | Overcloud Scaling
Red Hat OpenStack Platform 10  | Red Hat OpenStack Platform 9 and 10 | Red Hat OpenStack Platform 9 and 10 | Red Hat OpenStack Platform 9 and 10

If managing an older Overcloud version, use the following Heat template collections:

  • For Red Hat OpenStack Platform 9: /usr/share/openstack-tripleo-heat-templates/mitaka/

For example:

$ openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates/mitaka/ [OTHER_OPTIONS]

The following are some general upgrade tips:

  • After each step, run the pcs status command on the Controller node cluster to ensure no resources have failed.
  • Please contact Red Hat and request guidance and assistance on the upgrade process before proceeding if you feel uncomfortable about performing this upgrade.
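
The pcs status check in the first tip can be scripted. The following is a minimal sketch, not an official tool: it assumes that unhealthy resources show up in the pcs status output as FAILED or Stopped, or under a "Failed Actions:" heading. The function name and the CONTROLLER_IP variable are hypothetical. Resources that you stopped deliberately also match, so treat a hit as a prompt to read the full output.

```shell
# Sketch: flag obviously unhealthy resources in `pcs status` output.
# Assumption: problems appear as FAILED, Stopped, or "Failed Actions:".
pcs_status_has_failures() {
    # Reads `pcs status` output on stdin; returns 0 if a problem is found.
    grep -Eq 'FAILED|Stopped|Failed Actions:'
}

# Hypothetical usage from the director (CONTROLLER_IP is a placeholder):
#   ssh heat-admin@"$CONTROLLER_IP" "sudo pcs status" | pcs_status_has_failures \
#       && echo "Resources need attention" || echo "No failed resources found"
```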

3.2. Important Pre-Upgrade Notes

  • Red Hat OpenStack Platform 10 uses some new kernel parameters now available in Red Hat Enterprise Linux 7.3. Make sure that you have upgraded your undercloud and overcloud to Red Hat Enterprise Linux 7.3 and Open vSwitch 2.5. See "Director-Based Environments: Performing Updates to Minor Versions" for instructions on performing a package update to your undercloud and overcloud. When you have updated the kernel to the latest version, perform a reboot so that the new kernel parameters take effect.
  • The OpenStack Platform 10 upgrade procedure migrates to a new composable architecture. This means many services that Pacemaker managed in previous versions now use systemd management. This results in a reduced number of Pacemaker managed resources.
  • In previous versions of Red Hat OpenStack Platform, OpenStack Telemetry (ceilometer) used its database for metrics storage. Red Hat OpenStack Platform 10 uses OpenStack Telemetry Metrics (gnocchi) as a default backend for OpenStack Telemetry. If using an external Ceph Storage cluster for the metrics data, create a new pool on your external Ceph Storage cluster before upgrading to Red Hat OpenStack 10. The name for this pool is set with the GnocchiRbdPoolName parameter and the default pool name is metrics. If you use the CephPools parameter to customize your list of pools, add the metrics pool to the list. Note that there is no data migration plan for the metrics data. For more information, see Section 3.5.4, “OpenStack Telemetry Metrics”.
  • Combination alarms in OpenStack Telemetry Alarms (aodh) are deprecated in favor of composite alarms. Note that:

    • Aodh does not expose combination alarms by default.
    • A new parameter, EnableCombinationAlarms, enables combination alarms in an Overcloud. This defaults to false. Set to true to continue using combination alarms in OpenStack Platform 10.
    • OpenStack Platform 10 includes a migration script (aodh-data-migration) to move to composite alarms. This guide contains instructions for migrating this data in Section 3.6.9, “Migrating the OpenStack Telemetry Alarming Database”. Make sure to run this script and convert your alarms to composite.
    • Combination alarms support will be removed in the next release.
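
If you need to keep combination alarms during the transition, the EnableCombinationAlarms parameter can go into a small environment file. The sketch below writes such a file. The function name and the output path are hypothetical, and placing the parameter under parameter_defaults is an assumption based on how the other parameters in this chapter are set.

```shell
# Sketch: write an environment file that keeps combination alarms enabled.
write_combination_alarms_env() {
    # $1: output path for the environment file (hypothetical name).
    cat > "$1" <<'EOF'
parameter_defaults:
  EnableCombinationAlarms: true
EOF
}

# Hypothetical usage: generate the file, then add it to your deployment
# command with an extra -e option.
#   write_combination_alarms_env /home/stack/templates/combination-alarms.yaml
```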

3.3. Checking the Overcloud

Check that your overcloud is stable before performing the upgrade. Run the following steps on the director to ensure all services in your overcloud are running:

  1. Check the status of the high availability services:

    $ ssh heat-admin@[CONTROLLER_IP] "sudo pcs resource cleanup ; sleep 60 ; sudo pcs status"

    Replace [CONTROLLER_IP] with the IP address of a Controller node. This command refreshes the overcloud’s Pacemaker cluster, waits 60 seconds, then reports the status of the cluster.

  2. Check for any failed OpenStack Platform systemd services on overcloud nodes. The following command checks for failed services on all nodes:

    $ for IP in $(openstack server list -c Networks -f csv | sed '1d' | sed 's/"//g' | cut -d '=' -f2) ; do echo "Checking systemd services on $IP" ; ssh heat-admin@$IP "sudo systemctl list-units 'openstack-*' 'neutron-*' --state=failed --no-legend" ; done
  3. Check that os-collect-config is running on each node. The following command checks this service on each node:

    $ for IP in $(openstack server list -c Networks -f csv | sed '1d' | sed 's/"//g' | cut -d '=' -f2) ; do echo "Checking os-collect-config on $IP" ; ssh heat-admin@$IP "sudo systemctl list-units 'os-collect-config.service' --no-legend" ; done
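
Both loops above repeat the same IP extraction pipeline. As a convenience sketch, the pipeline can be wrapped in a function and reused. The function names overcloud_ips and check_failed_units are hypothetical; the commands inside them are the ones shown in steps 2 and 3.

```shell
# Sketch: the IP extraction from the loops above as a reusable function.
overcloud_ips() {
    # Reads CSV from `openstack server list -c Networks -f csv` on stdin.
    # Drops the header row, strips quotes, keeps the text after "network=".
    sed '1d' | sed 's/"//g' | cut -d '=' -f2
}

check_failed_units() {
    # $1: node IP. Runs the failed-unit query from step 2 on that node.
    ssh heat-admin@"$1" "sudo systemctl list-units 'openstack-*' 'neutron-*' --state=failed --no-legend"
}

# Usage on the director, after `source ~/stackrc`:
#   for IP in $(openstack server list -c Networks -f csv | overcloud_ips); do
#       echo "Checking systemd services on $IP"; check_failed_units "$IP"
#   done
```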

3.4. Undercloud Upgrade

3.4.1. Upgrading the Director

To upgrade the Red Hat OpenStack Platform director, follow this procedure:

  1. Log into the director as the stack user.
  2. Update the OpenStack Platform repository:

    $ sudo subscription-manager repos --disable=rhel-7-server-openstack-9-rpms --disable=rhel-7-server-openstack-9-director-rpms
    $ sudo subscription-manager repos --enable=rhel-7-server-openstack-10-rpms

    This sets yum to use the latest repositories.

  3. Stop the main OpenStack Platform services:

    $ sudo systemctl stop 'openstack-*' 'neutron-*' httpd
    Note

    This causes a short period of downtime for the undercloud. The overcloud is still functional during the undercloud upgrade.

  4. Use yum to upgrade the director:

    $ sudo yum update python-tripleoclient
  5. Use the following command to upgrade the undercloud:

    $ openstack undercloud upgrade

    This command upgrades the director’s packages, refreshes the director’s configuration, and populates any settings that are unset since the version change. This command does not delete any stored data, such as Overcloud stack data or data for existing nodes in your environment.

Review the resulting configuration files for each service. The upgraded packages might have installed .rpmnew files appropriate to the Red Hat OpenStack Platform 10 version of each service.

Check the /var/log/yum.log file on the undercloud node to see if either the kernel or openvswitch packages have updated their major or minor versions. If they have, perform a reboot of the undercloud:

  1. Reboot the node:

    $ sudo reboot
  2. Wait until the node boots.
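
The yum.log check described above can be approximated with a small helper. This is a sketch under stated assumptions: it assumes log entries of the form "Installed: kernel-<version>" or "Updated: openvswitch-<version>", and the function name reboot_needed is hypothetical. It only reports that such an entry exists. It does not compare version numbers and it also matches entries from earlier updates, so confirm the dates and versions in the log yourself.

```shell
# Sketch: detect kernel or openvswitch entries in a yum log.
reboot_needed() {
    # $1: path to a yum log (defaults to /var/log/yum.log).
    # Returns 0 if the log records a kernel or openvswitch install/update.
    log="${1:-/var/log/yum.log}"
    grep -Eq '(Installed|Updated): (kernel|openvswitch)-[0-9]' "$log"
}

# Usage on the node being checked:
#   reboot_needed && echo "Reboot recommended" || echo "No reboot indicated"
```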

When the node boots, check the status of all services:

$ sudo systemctl list-units "openstack*" "neutron*" "openvswitch*"
Note

It might take approximately 10 minutes for the openstack-nova-compute service to become active after a reboot.

Verify the existence of your Overcloud and its nodes:

$ source ~/stackrc
$ openstack server list
$ openstack baremetal node list
$ openstack stack list
Important

If using customized core Heat templates, make sure to check for differences between the updated core Heat templates and your current set. Red Hat provides updates to the Heat template collection over subsequent releases. Using a modified template collection can lead to a divergence between your custom copy and the original copy in /usr/share/openstack-tripleo-heat-templates. Run the following command to see differences between your custom Heat template collection and the updated original version:

# diff -Nar /usr/share/openstack-tripleo-heat-templates/ ~/templates/my-overcloud/

Make sure to either apply these updates to your custom Heat template collection, or create a new copy of the templates in /usr/share/openstack-tripleo-heat-templates/ and apply your customizations.

3.4.2. Upgrading the Overcloud Images on the Director

This procedure ensures you have the latest images for node discovery and Overcloud deployment. The new images from the rhosp-director-images and rhosp-director-images-ipa packages are already updated from the Undercloud upgrade.

Remove any existing images from the images directory on the stack user’s home (/home/stack/images):

$ rm -rf ~/images/*

Extract the archives:

$ cd ~/images
$ for i in /usr/share/rhosp-director-images/overcloud-full-latest-10.0.tar /usr/share/rhosp-director-images/ironic-python-agent-latest-10.0.tar; do tar -xvf $i; done

Import the latest images into the director and configure nodes to use the new images:

$ openstack overcloud image upload --update-existing --image-path /home/stack/images/.
$ openstack baremetal configure boot

To finalize the image update, verify the existence of the new images:

$ openstack image list
$ ls -l /httpboot

The director is now upgraded with the latest images.

Important

Make sure the Overcloud image version corresponds to the Undercloud version.

3.4.3. Using and Comparing Previous Template Versions

The upgrade process installs a new set of core Heat templates that correspond to the latest overcloud version. Red Hat OpenStack Platform’s repository retains the previous version of the core template collection in the openstack-tripleo-heat-templates-compat package. You install this package with the following command:

$ sudo yum install openstack-tripleo-heat-templates-compat

This installs the previous templates in the compat directory of your Heat template collection (/usr/share/openstack-tripleo-heat-templates/compat) and also creates a link to compat named after the previous version (mitaka). These templates are backwards compatible with the upgraded director, which means you can use the latest version of the director to install an overcloud of the previous version.

Comparing the previous version with the latest version helps identify changes to the overcloud during the upgrade. If you need to compare the current template collection with the previous version, use the following process:

  1. Create a temporary copy of the core Heat templates:

    $ cp -a /usr/share/openstack-tripleo-heat-templates /tmp/osp10
  2. Move the previous version into its own directory:

    $ mv /tmp/osp10/compat /tmp/osp9
  3. Perform a diff on the contents of both directories:

    $ diff -urN /tmp/osp9 /tmp/osp10

This shows the core template changes from one version to the next. These changes provide an idea of what should occur during the overcloud upgrade.

3.5. Overcloud Pre-Upgrade Configuration

3.5.1. Red Hat Subscription Details

If using an environment file for Satellite registration, make sure to update the following parameters in the environment file:

  • rhel_reg_repos - Repositories to enable for your Overcloud, including the new Red Hat OpenStack Platform 10 repositories. See Section 1.2, “Repository Requirements” for repositories to enable.
  • rhel_reg_activation_key - The new activation key for your Red Hat OpenStack Platform 10 repositories.
  • rhel_reg_sat_repo - A new parameter that defines the repository containing Red Hat Satellite 6’s management tools, such as katello-agent. Make sure to add this parameter if registering to Red Hat Satellite 6.

3.5.2. SSL Configuration

If upgrading an overcloud that uses SSL, be aware of the following:

  • The network configuration requires a PublicVirtualFixedIPs parameter in the following format:

      PublicVirtualFixedIPs: [{'ip_address':'192.168.200.180'}]

    Include this in the parameter_defaults section of your network environment file.

  • A new environment file with SSL endpoints. This file depends on whether accessing the overcloud through IP addresses or DNS.

    • If using IP addresses, use /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-ip.yaml.
    • If using DNS, use /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-dns.yaml.
  • For more information about SSL/TLS configuration, see "Enabling SSL/TLS on the Overcloud" in the Red Hat OpenStack Platform Advanced Overcloud Customization guide.
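
As an illustration of the choice between the two endpoint files, the following sketch returns the matching path for an "ip" or "dns" argument. The function name tls_endpoints_env is hypothetical; the two file paths are the ones listed above.

```shell
# Sketch: pick the SSL endpoints environment file for the upgrade.
tls_endpoints_env() {
    # $1: "ip" or "dns", depending on how clients reach the overcloud.
    dir=/usr/share/openstack-tripleo-heat-templates/environments
    case "$1" in
        ip)  echo "$dir/tls-endpoints-public-ip.yaml" ;;
        dns) echo "$dir/tls-endpoints-public-dns.yaml" ;;
        *)   echo "usage: tls_endpoints_env ip|dns" >&2; return 1 ;;
    esac
}

# Usage: add the result to your existing deployment command.
#   openstack overcloud deploy --templates [OTHER_OPTIONS] -e "$(tls_endpoints_env ip)"
```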

3.5.3. Ceph Storage

If using a custom storage-environment.yaml file, check that the resource_registry section includes the following new resources:

resource_registry:
  OS::TripleO::Services::CephMon: /usr/share/openstack-tripleo-heat-templates/puppet/services/ceph-mon.yaml
  OS::TripleO::Services::CephOSD: /usr/share/openstack-tripleo-heat-templates/puppet/services/ceph-osd.yaml
  OS::TripleO::Services::CephClient: /usr/share/openstack-tripleo-heat-templates/puppet/services/ceph-client.yaml

These resources ensure the Ceph Storage composable services are enabled for Red Hat OpenStack Platform 10. The default storage-environment.yaml file for Red Hat OpenStack Platform 10 is now updated to include these resources.
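
A quick way to check a custom file for these entries is a text search, sketched below. The function name missing_ceph_services is hypothetical. This is a plain string match, not a YAML parse, so it does not confirm that the entries sit inside the resource_registry section or that the paths are correct.

```shell
# Sketch: list required Ceph service resources missing from a file.
missing_ceph_services() {
    # $1: path to the custom storage-environment.yaml.
    # Prints each required resource name that the file does not mention.
    for svc in CephMon CephOSD CephClient; do
        grep -q "OS::TripleO::Services::${svc}:" "$1" \
            || echo "OS::TripleO::Services::${svc}"
    done
}

# Usage (the path is an example):
#   missing_ceph_services /home/stack/templates/storage-environment.yaml
# No output means all three resources are mentioned in the file.
```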

3.5.4. OpenStack Telemetry Metrics

Red Hat OpenStack Platform 10 introduces a new component to store metrics data. If using a Red Hat Ceph Storage cluster deployed with a custom storage-environment.yaml file, check the file’s parameter_defaults section for the following new parameters:

  • GnocchiBackend - The backend to use. Set this to rbd (Ceph Storage). Other options include swift or file.
  • GnocchiRbdPoolName - The name of the Ceph Storage pool to use for metrics data. The default is metrics.

If using an external Ceph Storage cluster (i.e. one not managed with director), you must manually add the pool defined in GnocchiRbdPoolName (for example, the default is metrics) before performing the upgrade.
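
On an external cluster, the pool is typically created with the ceph osd pool create command. The sketch below wraps that command. The function name is hypothetical, and the placement group count of 128 is only a placeholder, not a recommendation; choose a value that suits your cluster. Setting DRY_RUN=1 prints the command instead of running it.

```shell
# Sketch: create the metrics pool on an external Ceph Storage cluster.
create_metrics_pool() {
    # $1: pool name, matching GnocchiRbdPoolName (default: metrics).
    # $2: placement group count. 128 is a placeholder value.
    pool="${1:-metrics}"
    pg_num="${2:-128}"
    # With DRY_RUN set, echo the command rather than executing it.
    ${DRY_RUN:+echo} ceph osd pool create "$pool" "$pg_num"
}

# Usage on a node with admin access to the external cluster:
#   DRY_RUN=1 create_metrics_pool      # preview the command
#   create_metrics_pool metrics 128    # create the pool
```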

3.5.5. Overcloud Parameters

Note the following information about overcloud parameters for upgrades:

  • The default timezone for Red Hat OpenStack Platform 10 is UTC. If necessary, include an environment file to specify the timezone.
  • If upgrading an Overcloud with a custom ServiceNetMap, make sure to include the latest ServiceNetMap for the new services. The default list of services is defined with the ServiceNetMapDefaults parameter located in the network/service_net_map.j2.yaml file. For information on using a custom ServiceNetMap, see Isolating Networks in Advanced Overcloud Customization.
  • Due to the new composable service architecture, the parameters for configuring the NFS backend for OpenStack Image Storage (Glance) have changed. The new parameters are:

    GlanceNfsEnabled
    Enables Pacemaker to manage the share for image storage. If disabled, the Overcloud stores images in the Controller node’s file system. Set to true.
    GlanceNfsShare
    The NFS share to mount for image storage. For example, 192.168.122.1:/export/glance.
    GlanceNfsOptions
    The NFS mount options for the image storage.
  • Due to the new composable service architecture, the syntax for some of the configuration hooks has changed. If using pre or post configuration hooks to provide custom scripts to your environment, check the syntax of your custom environment files against the configuration hook sections of the Advanced Overcloud Customization guide.

  • Some composable services include new parameters that configure Puppet hieradata. If you used hieradata to configure these parameters in the past, the overcloud update might report a Duplicate declaration error. In this situation, use the composable service parameter. For example, instead of the following:

    parameter_defaults:
      controllerExtraConfig:
        heat::config::heat_config:
          DEFAULT/num_engine_workers:
            value: 1

    Use the following:

    parameter_defaults:
      HeatWorkers: 1

3.5.6. Custom Core Templates

Important

This section is only required if using a modified version of the core Heat template collection. This is because the copy is a static snapshot of the original core Heat template collection from /usr/share/openstack-tripleo-heat-templates/. If using an unmodified core Heat template collection for your overcloud, you can skip this section.

To update your modified template collection, you need to:

  1. Back up your existing custom template collection:

    $ mv ~/templates/my-overcloud/ ~/templates/my-overcloud.bak
  2. Copy the new version of the template collection from /usr/share/openstack-tripleo-heat-templates into its place:

    $ sudo cp -rv /usr/share/openstack-tripleo-heat-templates ~/templates/my-overcloud/
  3. Check for differences between the old and new custom template collection. To see changes between the two, use the following diff command:

    $ diff -Nar ~/templates/my-overcloud.bak/ ~/templates/my-overcloud/

This helps identify customizations from the old template collection that you can incorporate into the new template collection. Incorporate customization into the new custom template collection.

Important

Red Hat provides updates to the Heat template collection over subsequent releases. Using a modified template collection can lead to a divergence between your custom copy and the original copy in /usr/share/openstack-tripleo-heat-templates. To customize your overcloud, Red Hat recommends using the Configuration Hooks from the Advanced Overcloud Customization guide. If creating a copy of the Heat template collection, you should track changes to the templates using a version control system such as git.

3.6. Upgrading the Overcloud

3.6.1. Overview and Workflow

This section details the steps required to upgrade the Overcloud. Make sure to follow each section in order and only apply the sections relevant to your environment.

This process requires you to run your original openstack overcloud deploy command multiple times to provide a staged method of upgrading. Each time you run the command, you include a different upgrade environment file along with your existing environment files. These new upgrade environment files are:

  • major-upgrade-ceilometer-wsgi-mitaka-newton.yaml - Converts OpenStack Telemetry (ceilometer) to a WSGI service.
  • major-upgrade-pacemaker-init.yaml - Provides the initialization for the upgrade. This includes updating the Red Hat OpenStack Platform repositories on each node in your Overcloud and provides special upgrade scripts to certain nodes.
  • major-upgrade-pacemaker.yaml - Provides an upgrade for the Controller nodes.
  • (Optional) major-upgrade-remove-sahara.yaml - Removes OpenStack Data Processing (sahara) from the Overcloud. This accommodates a difference between OpenStack Platform 9 and 10. See Section 3.6.5, “Upgrading Controller Nodes” for more information.
  • major-upgrade-pacemaker-converge.yaml - The finalization for the Overcloud upgrade. This aligns the resulting upgrade to match the contents for the director’s latest Heat template collection.
  • major-upgrade-aodh-migration.yaml - Migrates the OpenStack Telemetry Alarming (aodh) service’s database from MongoDB to MariaDB.

In between these deployment commands, you run the upgrade-non-controller.sh script on various node types. This script upgrades the packages on a non-Controller node.

Workflow

The Overcloud upgrade process uses the following workflow:

  1. Run your deployment command including the major-upgrade-ceilometer-wsgi-mitaka-newton.yaml environment file.
  2. Run your deployment command including the major-upgrade-pacemaker-init.yaml environment file.
  3. Run the upgrade-non-controller.sh on each Object Storage node.
  4. Run your deployment command including the major-upgrade-pacemaker.yaml and the optional major-upgrade-remove-sahara.yaml environment file.
  5. Run the upgrade-non-controller.sh on each Ceph Storage node.
  6. Run the upgrade-non-controller.sh on each Compute node.
  7. Run your deployment command including the major-upgrade-pacemaker-converge.yaml environment file.
  8. Run your deployment command including the major-upgrade-aodh-migration.yaml environment file.
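
The repeated deployment runs in this workflow differ only in the upgrade environment file, so they can be expressed through one helper. The following is a sketch, not a supported script. The names DEPLOY_ARGS, ENV_DIR, and deploy_stage are hypothetical, and the base options shown are placeholders for the options from your original deployment. Setting DRY_RUN=1 prints each command instead of running it.

```shell
# Base options from your original deployment (placeholder values shown).
DEPLOY_ARGS="--templates -e /home/stack/templates/network_env.yaml"
ENV_DIR=/usr/share/openstack-tripleo-heat-templates/environments

deploy_stage() {
    # $@: upgrade environment file names to add for this stage.
    cmd="openstack overcloud deploy $DEPLOY_ARGS"
    for f in "$@"; do
        cmd="$cmd -e $ENV_DIR/$f"
    done
    # DRY_RUN=1 prints the command instead of running it.
    # Relies on word splitting, so paths must not contain spaces.
    ${DRY_RUN:+echo} $cmd
}

# Stages in workflow order. Run the upgrade-non-controller.sh steps
# between them as the workflow describes.
#   deploy_stage major-upgrade-ceilometer-wsgi-mitaka-newton.yaml
#   deploy_stage major-upgrade-pacemaker-init.yaml
#   deploy_stage major-upgrade-pacemaker.yaml major-upgrade-remove-sahara.yaml
#   deploy_stage major-upgrade-pacemaker-converge.yaml
#   deploy_stage major-upgrade-aodh-migration.yaml
```

Previewing each stage with DRY_RUN=1 first makes it easier to confirm that none of your existing environment files were dropped from the command.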

3.6.2. Upgrading OpenStack Telemetry to a WSGI Service

This step upgrades the OpenStack Telemetry (ceilometer) service to run as a Web Server Gateway Interface (WSGI) applet under httpd instead of a standalone service. This process automatically disables the standalone openstack-ceilometer-api service and installs the necessary configuration to enable the WSGI applet.

Run the openstack overcloud deploy command from your Undercloud and include the major-upgrade-ceilometer-wsgi-mitaka-newton.yaml environment file. Make sure you also include all options and custom environment files relevant to your environment, such as network isolation and storage.

The following is an example of an openstack overcloud deploy command with the added file:

$ openstack overcloud deploy --templates \
  --control-scale 3 \
  --compute-scale 3 \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml  \
  -e /home/stack/templates/network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-ceilometer-wsgi-mitaka-newton.yaml \
  --ntp-server pool.ntp.org

Wait until the Overcloud updates with the new environment file’s configuration.

Note

Log in to a Controller node and run the pcs status command to check that all resources are active in the Controller cluster. If any resources have failed, run pcs resource cleanup, which clears the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.

3.6.3. Installing the Upgrade Scripts

This step installs scripts on each non-Controller node. These scripts perform the major version package upgrades and configuration. Each script differs depending on the node type. For example, Compute nodes receive different upgrade scripts from Ceph Storage nodes.

This initialization step also updates enabled repositories on all overcloud nodes. This means you do not need to disable old repositories and enable new repositories manually.

Run the openstack overcloud deploy command from your Undercloud and include the major-upgrade-pacemaker-init.yaml environment file. Make sure you also include all options and custom environment files relevant to your environment, such as network isolation and storage.

The following is an example of an openstack overcloud deploy command with the added file:

$ openstack overcloud deploy --templates \
  --control-scale 3 \
  --compute-scale 3 \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml  \
  -e /home/stack/templates/network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-init.yaml \
  --ntp-server pool.ntp.org

Wait until the Overcloud updates with the new environment file’s configuration.

Note

Log in to a Controller node and run the pcs status command to check that all resources are active in the Controller cluster. If any resources have failed, run pcs resource cleanup, which clears the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.

3.6.4. Upgrading Object Storage Nodes

The director uses the upgrade-non-controller.sh command to run the upgrade script passed to each non-Controller node from the major-upgrade-pacemaker-init.yaml environment file. For this step, upgrade each Object Storage node using the following command:

$ for NODE in `openstack server list -c Name -f value --name objectstorage` ; do upgrade-non-controller.sh --upgrade $NODE ; done

Wait until each Object Storage node completes its upgrade.

Check the /var/log/yum.log file on each Object Storage node to see if either the kernel or openvswitch packages have updated their major or minor versions. If so, perform a reboot of each Object Storage node:

  1. Select an Object Storage node to reboot. Log into it and reboot it:

    $ sudo reboot
  2. Wait until the node boots.
  3. Log into the node and check the status:

    $ sudo systemctl list-units "openstack-swift*"
  4. Log out of the node and repeat this process on the next Object Storage node.
Note

Log in to a Controller node and run the pcs status command to check that all resources are active in the Controller cluster. If any resources have failed, run pcs resource cleanup, which clears the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.

3.6.5. Upgrading Controller Nodes

Upgrading the Controller nodes involves including another environment file (major-upgrade-pacemaker.yaml) that provides a full upgrade to Controller nodes running high availability tools.

Run the openstack overcloud deploy command from your Undercloud and include the major-upgrade-pacemaker.yaml environment file. Remember to include all options and custom environment files relevant to your environment, such as network isolation and storage.

Your Controller nodes might require an additional file depending on whether you aim to keep the OpenStack Data Processing (sahara) service enabled. OpenStack Platform 9 automatically installed OpenStack Data Processing for a default Overcloud. In OpenStack Platform 10, the user needs to explicitly include environment files to enable OpenStack Data Processing. This means:

  • If you no longer require OpenStack Data Processing, include the major-upgrade-remove-sahara.yaml file in the deployment.
  • If you aim to keep OpenStack Data Processing, do not include the major-upgrade-remove-sahara.yaml file in the deployment. After completing the Overcloud upgrade, make sure to include the /usr/share/openstack-tripleo-heat-templates/environments/services/sahara.yaml to keep the service enabled and configured.

The following is an example of an openstack overcloud deploy command with both the required and optional files:

$ openstack overcloud deploy --templates \
  --control-scale 3 \
  --compute-scale 3 \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
  -e network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-remove-sahara.yaml \
  --ntp-server pool.ntp.org

Wait until the Overcloud updates with the new environment file’s configuration.

Important

Note the following:

  • This step disables the Neutron server and L3 Agent during the Controller upgrade. This means floating IP addresses are unavailable during this step.
  • This step revises the Pacemaker configuration for the Controller cluster. This means the upgrade disables certain high availability functions temporarily.

Check the /var/log/yum.log file on each Controller node to see if either the kernel or openvswitch packages have updated their major or minor versions. If so, perform a reboot of each Controller node:

  1. Select a node to reboot. Log into it and stop the cluster before rebooting:

    $ sudo pcs cluster stop
  2. Reboot the node:

    $ sudo reboot

    The remaining Controller Nodes in the cluster retain the high availability services during the reboot.

  3. Wait until the node boots.
  4. Re-enable the cluster for the node:

    $ sudo pcs cluster start
  5. Log into the node and check the cluster status:

    $ sudo pcs status

    The node rejoins the cluster.

    Note

    If any services fail after the reboot, run sudo pcs resource cleanup, which cleans the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.

  6. Check all systemd services on the Controller Node are active:

    $ sudo systemctl list-units "openstack*" "neutron*" "openvswitch*"
  7. Log out of the node, select the next Controller Node to reboot, and repeat this procedure until you have rebooted all Controller Nodes.
Important

The OpenStack Platform 10 upgrade procedure migrates to a new composable architecture. This means many services that Pacemaker managed in previous versions now use systemd management. This results in a reduced number of Pacemaker managed resources.

3.6.6. Upgrading Ceph Storage Nodes

The director uses the upgrade-non-controller.sh command to run the upgrade script passed to each non-Controller node from the major-upgrade-pacemaker-init.yaml environment file. For this step, upgrade each Ceph Storage node with the following command:

$ for NODE in `openstack server list -c Name -f value --name ceph` ; do upgrade-non-controller.sh --upgrade $NODE ; done

Check the /var/log/yum.log file on each Ceph Storage node to see if either the kernel or openvswitch packages have updated their major or minor versions. If so, perform a reboot of each Ceph Storage node:

  1. Log into a Ceph MON or Controller node and disable Ceph Storage cluster rebalancing temporarily:

    $ sudo ceph osd set noout
    $ sudo ceph osd set norebalance
  2. Select the first Ceph Storage node to reboot and log into it.
  3. Reboot the node:

    $ sudo reboot
  4. Wait until the node boots.
  5. Log into the node and check the cluster status:

    $ sudo ceph -s

    Check that the pgmap reports all pgs as normal (active+clean).

  6. Log out of the node, reboot the next node, and check its status. Repeat this process until you have rebooted all Ceph storage nodes.
  7. When complete, log into a Ceph MON or Controller node and enable cluster rebalancing again:

    $ sudo ceph osd unset noout
    $ sudo ceph osd unset norebalance
  8. Perform a final status check to verify the cluster reports HEALTH_OK:

    $ sudo ceph status
Note

Log in to a Controller node and run the pcs status command to check if all resources are active in the Controller cluster. If any resources have failed, run pcs resource cleanup, which cleans the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.
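
The final Ceph status check in the procedure above can take some time to settle after you unset the noout and norebalance flags. As a minimal sketch, assuming Bash and that sudo ceph health prints a line beginning with HEALTH_OK when the cluster is healthy, a polling helper could look like the following. The function name wait_for_health_ok and its default retry values are hypothetical.

```shell
# Poll "ceph health" until it reports HEALTH_OK or the retries run out.
# $1: number of attempts (default 30), $2: seconds between attempts (default 10).
wait_for_health_ok() {
    local tries=${1:-30} delay=${2:-10} i=0 status
    while [ "$i" -lt "$tries" ]; do
        status=$(sudo ceph health 2>/dev/null)
        case "$status" in
            HEALTH_OK*) echo "$status"; return 0 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "Cluster not healthy: ${status:-no output}" >&2
    return 1
}
```

For example, wait_for_health_ok 60 5 on a Ceph MON or Controller node checks every 5 seconds for up to 5 minutes. While the noout or norebalance flags are still set, the cluster is expected to report a warning state, so run the helper only after unsetting them.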

3.6.7. Upgrading Compute Nodes

Upgrade each Compute node individually and ensure zero downtime of instances in your OpenStack Platform environment. This involves the following workflow:

  1. Select a Compute node to upgrade
  2. Migrate its instances to another Compute node
  3. Upgrade the empty Compute node

List all Compute nodes and their UUIDs:

$ openstack server list | grep "compute"

Select a Compute node to upgrade and first migrate its instances using the following process:

  1. From the undercloud, select a Compute Node to upgrade and disable it:

    $ source ~/overcloudrc
    $ openstack compute service list
    $ openstack compute service set [hostname] nova-compute --disable
  2. List all instances on the Compute node:

    $ openstack server list --host [hostname] --all-projects
  3. Migrate each instance from the disabled host. Use one of the following commands:

    1. Migrate the instance to a specific host of your choice:

      $ openstack server migrate [instance-id] --live [target-host] --wait
    2. Let nova-scheduler automatically select the target host:

      $ nova live-migration [instance-id]
      Note

      The nova command might cause some deprecation warnings, which are safe to ignore.

  4. Wait until migration completes.
  5. Confirm the instance has migrated from the Compute node:

    $ openstack server list --host [hostname] --all-projects
  6. Repeat the migration steps until you have migrated all instances from the Compute Node.
Important

For full instructions on configuring and migrating instances, see "Migrating Virtual Machines Between Compute Nodes" in the Director Installation and Usage guide.
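
When a Compute node hosts many instances, the migration steps above can be wrapped in a small loop. The following Bash sketch is illustrative only. It assumes ~/overcloudrc is already sourced, the nova-compute service on the node is already disabled, and nova-scheduler is allowed to choose each target host. The function names drain_compute_host and instances_on_host are hypothetical.

```shell
# Request a live migration for every instance on the given Compute host.
# The nova command only submits the request; it does not wait for completion.
drain_compute_host() {
    local host=$1 id
    for id in $(openstack server list --host "$host" --all-projects -f value -c ID); do
        echo "Requesting live migration of $id"
        nova live-migration "$id" || return 1
    done
}

# Print the number of instances still reported on the given Compute host.
instances_on_host() {
    openstack server list --host "$1" --all-projects -f value -c ID | grep -c . || true
}
```

After calling drain_compute_host [hostname], rerun instances_on_host [hostname] until it prints 0 before you upgrade the node. Instances that cannot be live-migrated still need to be handled individually.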

The director uses the upgrade-non-controller.sh command to run the upgrade script passed to each non-Controller node from the major-upgrade-pacemaker-init.yaml environment file. Upgrade each Compute node with the following command:

$ source ~/stackrc
$ upgrade-non-controller.sh --upgrade NODE_UUID

Replace NODE_UUID with the UUID of the chosen Compute node. Wait until the Compute node completes its upgrade.

Check the /var/log/yum.log file on the Compute node you have upgraded to see if either the kernel or openvswitch packages have updated their major or minor versions. If so, perform a reboot of the Compute node:

  1. Log into the Compute Node and reboot it:

    $ sudo reboot
  2. Wait until the node boots.
  3. Enable the Compute Node again:

    $ source ~/overcloudrc
    $ openstack compute service set [hostname] nova-compute --enable
  4. Select the next node to reboot.

Repeat the migration and reboot process for each node individually until you have rebooted all nodes.
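
The /var/log/yum.log check that decides whether a reboot is needed can be narrowed with grep. The sketch below assumes the usual yum.log line format (a timestamp followed by "Updated:" or "Installed:" and the package name-version-release) and only lists matching entries. Deciding whether the major or minor version actually changed is still left to you. The function name reboot_candidates is hypothetical.

```shell
# List yum.log entries that record an update or installation of the
# kernel or openvswitch packages. Returns non-zero if there are none.
# $1: path to the log file (default /var/log/yum.log).
reboot_candidates() {
    local log=${1:-/var/log/yum.log}
    grep -E '(Updated|Installed): ([0-9]+:)?(kernel|openvswitch)-[0-9]' "$log"
}
```

Run the function on the upgraded node with sufficient privileges to read the log. The pattern requires a digit directly after the package name, so related packages such as kernel-tools do not match.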

Note

Log in to a Controller node and run the pcs status command to check if all resources are active in the Controller cluster. If any resources have failed, run pcs resource cleanup, which cleans the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.

3.6.8. Finalizing the Upgrade

The director needs to run through the upgrade finalization to ensure the Overcloud stack is synchronized with the current Heat template collection. This involves an environment file (major-upgrade-pacemaker-converge.yaml), which you include using the openstack overcloud deploy command.

Important

If your Red Hat OpenStack Platform 9 environment is integrated with an external Ceph Storage Cluster from an earlier version (that is, Red Hat Ceph Storage 1.3), you need to enable backwards compatibility. To do so, create an environment file (for example, /home/stack/templates/ceph-backwards-compatibility.yaml) containing the following:

parameter_defaults:
  ExtraConfig:
    ceph::conf::args:
      client/rbd_default_features:
        value: "1"

Then, include this file when you run openstack overcloud deploy in the next step.

Run the openstack overcloud deploy command from your Undercloud and include the major-upgrade-pacemaker-converge.yaml environment file. Make sure you also include all options and custom environment files relevant to your environment, such as backwards compatibility for Ceph (if applicable), network isolation, and storage.

The following is an example of an openstack overcloud deploy command with the added major-upgrade-pacemaker-converge.yaml file:

$ openstack overcloud deploy --templates \
  --control-scale 3 \
  --compute-scale 3 \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
  -e network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-converge.yaml \
  --ntp-server pool.ntp.org

Wait until the Overcloud updates with the new environment file’s configuration.

Note

Log in to a Controller node and run the pcs status command to check if all resources are active in the Controller cluster. If any resources have failed, run pcs resource cleanup, which cleans the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.

3.6.9. Migrating the OpenStack Telemetry Alarming Database

This step migrates the OpenStack Telemetry Alarming (aodh) service’s database from MongoDB to MariaDB. The migration runs automatically when you deploy with the environment file described in this section.

Run the openstack overcloud deploy command from your Undercloud and include the major-upgrade-aodh-migration.yaml environment file. Make sure you also include all options and custom environment files relevant to your environment, such as network isolation and storage.

The following is an example of an openstack overcloud deploy command with the added file:

$ openstack overcloud deploy --templates \
  --control-scale 3 \
  --compute-scale 3 \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
  -e /home/stack/templates/network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-aodh-migration.yaml \
  --ntp-server pool.ntp.org

Wait until the Overcloud updates with the new environment file’s configuration.

Note

Log in to a Controller node and run the pcs status command to check if all resources are active in the Controller cluster. If any resources have failed, run pcs resource cleanup, which cleans the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.

This completes the Overcloud upgrade procedure.

3.7. Post-Upgrade Notes for the Overcloud

Be aware of the following notes after upgrading the Overcloud to Red Hat OpenStack Platform 10:

  • Review the resulting configuration files for each service. The upgraded packages might have installed .rpmnew files appropriate to the Red Hat OpenStack Platform 10 version of each service.
  • If you did not include the optional major-upgrade-remove-sahara.yaml file from Section 3.6.5, “Upgrading Controller Nodes”, make sure to include the /usr/share/openstack-tripleo-heat-templates/environments/services/sahara.yaml environment file in later deployment commands to ensure OpenStack Clustering (sahara) stays enabled in the overcloud.
  • The Compute nodes might report a failure with neutron-openvswitch-agent. If this occurs, log into each Compute node and restart the service. For example:

    $ sudo systemctl restart neutron-openvswitch-agent
  • The upgrade process does not reboot any nodes in the Overcloud automatically. If required, perform a reboot manually after the upgrade command completes. Make sure to reboot cluster-based nodes (such as Ceph Storage nodes and Controller nodes) individually and wait for each node to rejoin the cluster. For Ceph Storage nodes, check with the ceph health command and make sure the cluster status is HEALTH_OK. For Controller nodes, check with the pcs resource command and make sure all resources are running for each node.
  • In some circumstances, the corosync service might fail to start on IPv6 environments after rebooting Controller nodes. This is due to Corosync starting before the Controller node configures the static IPv6 addresses. In these situations, restart Corosync manually on the Controller nodes:

    $ sudo systemctl restart corosync
  • If you configured fencing for your Controller nodes, the upgrade process might disable it. When the upgrade process completes, re-enable fencing with the following command on one of the Controller nodes:

    $ sudo pcs property set stonith-enabled=true
  • The next time you update or scale the Overcloud stack (i.e. running the openstack overcloud deploy command), you need to reset the identifier that triggers package updates in the Overcloud. Add a blank UpdateIdentifier parameter to an environment file and include it when you run the openstack overcloud deploy command. The following is an example of such an environment file:

    parameter_defaults:
      UpdateIdentifier:
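
For the first note in the list above, the .rpmnew files left by the upgraded packages can be located and compared with the active configuration files. The following Bash sketch is one way to do this, assuming the files live under /etc. The function names list_rpmnew and show_rpmnew_diffs are hypothetical.

```shell
# List every .rpmnew file under a directory tree (default /etc).
list_rpmnew() {
    local root=${1:-/etc}
    find "$root" -type f -name '*.rpmnew' 2>/dev/null | sort
}

# Print a unified diff between each active file and its .rpmnew counterpart.
show_rpmnew_diffs() {
    local new
    list_rpmnew "${1:-/etc}" | while IFS= read -r new; do
        echo "== ${new%.rpmnew}"
        # diff exits non-zero when the files differ; that is expected here.
        diff -u "${new%.rpmnew}" "$new" || true
    done
}
```

Run show_rpmnew_diffs as root on each Overcloud node so that all configuration files are readable. Whether to merge a given change depends on your deployment, so treat the output as a review aid.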