Chapter 2. Preparing for an OpenStack Platform Upgrade

This process prepares your OpenStack Platform environment for a full upgrade.

2.1. Support Statement

A successful upgrade process requires some preparation to accommodate changes from one major version to the next. Read the following support statement to help with Red Hat OpenStack Platform upgrade planning.

Upgrades in Red Hat OpenStack Platform director require full testing with specific configurations before being performed on any live production environment. Red Hat has tested most use cases and combinations offered as standard options through the director. However, due to the number of possible combinations, this testing is never fully exhaustive. In addition, if the configuration has been modified from the standard deployment, either manually or through post-configuration hooks, testing upgrade features in a non-production environment is critical. Therefore, we advise you to:

  • Perform a backup of your Undercloud node before starting any steps in the upgrade procedure.
  • Run the upgrade procedure with your customizations in a test environment before running the procedure in your production environment.
  • If you feel uncomfortable about performing this upgrade, contact Red Hat’s support team and request guidance and assistance on the upgrade process before proceeding.

The upgrade process outlined in this section only accommodates customizations made through the director. If you customized an Overcloud feature outside of the director, then:

  • Disable the feature.
  • Upgrade the Overcloud.
  • Re-enable the feature after the upgrade completes.

This means the customized feature is unavailable until the completion of the entire upgrade.

Red Hat OpenStack Platform director 12 can manage previous Overcloud versions of Red Hat OpenStack Platform. See the support matrix below for information.

Table 2.1. Support Matrix for Red Hat OpenStack Platform director 12

Version                          Overcloud Updating                      Overcloud Deploying                     Overcloud Scaling

Red Hat OpenStack Platform 12    Red Hat OpenStack Platform 12 and 11    Red Hat OpenStack Platform 12 and 11    Red Hat OpenStack Platform 12 and 11

2.2. General Upgrade Tips

The following are some tips to help with your upgrade:

  • After each step, run the pcs status command on the Controller node cluster to ensure no resources have failed.
  • If you feel uncomfortable about performing this upgrade, contact Red Hat and request guidance and assistance on the upgrade process before proceeding.

2.3. Validating the Undercloud before an Upgrade

The following is a set of steps to check the functionality of your Red Hat OpenStack Platform 11 undercloud before an upgrade.

Procedure

  1. Source the undercloud access details:

    $ source ~/stackrc
  2. Check for failed Systemd services:

    (undercloud) $ sudo systemctl list-units --state=failed 'openstack*' 'neutron*' 'httpd' 'docker'
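
    If the undercloud is healthy, this command returns no failed units. For example, the output should resemble the following:

    0 loaded units listed.
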
  3. Check the undercloud free space:

    (undercloud) $ df -h

    Use the "Undercloud Requirements" as a basis to determine whether you have adequate free space.

  4. Check that clocks are synchronized on the undercloud:

    (undercloud) $ sudo ntpstat
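
    A synchronized clock produces output similar to the following, where the server address and timings are illustrative:

    synchronised to NTP server (10.0.0.1) at stratum 2
       time correct to within 42 ms
       polling server every 1024 s
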
  5. Check the undercloud network services:

    (undercloud) $ openstack network agent list

    All agents should be Alive and their state should be UP.

  6. Check the undercloud compute services:

    (undercloud) $ openstack compute service list

    All services should have a status of enabled and a state of up.

2.4. Validating the Overcloud before an Upgrade

The following is a set of steps to check the functionality of your Red Hat OpenStack Platform 11 overcloud before an upgrade.

Procedure

  1. Source the undercloud access details:

    $ source ~/stackrc
  2. Check the status of your bare metal nodes:

    (undercloud) $ openstack baremetal node list

    All nodes should have a valid power state (on) and maintenance mode should be false.

  3. Check for failed Systemd services:

    (undercloud) $ for NODE in $(openstack server list -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo systemctl list-units --state=failed 'openstack*' 'neutron*' 'httpd' 'docker' 'ceph*'" ; done
  4. Check the HAProxy connection to all services. Obtain the Control Plane VIP address and authentication details for the haproxy.stats service:

    (undercloud) $ NODE=$(openstack server list --name controller-0 -f value -c Networks | cut -d= -f2); ssh heat-admin@$NODE sudo 'grep "listen haproxy.stats" -A 6 /etc/haproxy/haproxy.cfg'
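
    The output should resemble the following snippet, although the bind address and password in your environment will differ:

    listen haproxy.stats
      bind 192.168.24.10:1993 transparent
      mode http
      stats enable
      stats uri /
      stats auth admin:PASSWORD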

    Use these details in the following cURL request:

    (undercloud) $ curl -s -u admin:<PASSWORD> "http://<IP ADDRESS>:1993/;csv" | egrep -vi "(frontend|backend)" | awk -F',' '{ print $1" "$2" "$18 }'

    Replace <PASSWORD> and <IP ADDRESS> details with the respective details from the haproxy.stats service. The resulting list shows the OpenStack Platform services on each node and their connection status.
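
    For example, healthy output resembles the following, where each line shows a service, the node that serves it, and its connection status (names are illustrative):

    cinder overcloud-controller-0 UP
    cinder overcloud-controller-1 UP
    glance_api overcloud-controller-0 UP
    keystone_public overcloud-controller-0 UP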

  5. Check overcloud database replication health:

    (undercloud) $ for NODE in $(openstack server list --name controller -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo clustercheck" ; done
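
    Each Controller node should report that it is synced with the cluster. Abbreviated output from a healthy node resembles the following:

    HTTP/1.1 200 OK

    Galera cluster node is synced.
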
  6. Check RabbitMQ cluster health:

    (undercloud) $ for NODE in $(openstack server list --name controller -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo rabbitmqctl node_health_check" ; done
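
    A healthy node reports that the health check passed. For example, with an illustrative node name:

    Checking health of node rabbit@overcloud-controller-0 ...
    Health check passed
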
  7. Check Pacemaker resource health:

    (undercloud) $ NODE=$(openstack server list --name controller-0 -f value -c Networks | cut -d= -f2); ssh heat-admin@$NODE "sudo pcs status"

    Look for:

    • All cluster nodes online.
    • No resources stopped on any cluster nodes.
    • No failed pacemaker actions.
  8. Check the disk space on each overcloud node:

    (undercloud) $ for NODE in $(openstack server list -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo df -h --output=source,fstype,avail -x overlay -x tmpfs -x devtmpfs" ; done
  9. Check overcloud Ceph Storage cluster health. The following command runs the ceph tool on a Controller node to check the cluster:

    (undercloud) $ NODE=$(openstack server list --name controller-0 -f value -c Networks | cut -d= -f2); ssh heat-admin@$NODE "sudo ceph -s"
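
    In the output, confirm that the overall cluster health is HEALTH_OK, for example:

    health HEALTH_OK

    If the cluster reports HEALTH_WARN or HEALTH_ERR, resolve the underlying issue before proceeding with the upgrade.
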
  10. Check Ceph Storage OSD for free space. The following command runs the ceph tool on a Controller node to check the free space:

    (undercloud) $ NODE=$(openstack server list --name controller-0 -f value -c Networks | cut -d= -f2); ssh heat-admin@$NODE "sudo ceph df"
  11. Check that clocks are synchronized on overcloud nodes:

    (undercloud) $ for NODE in $(openstack server list -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo ntpstat" ; done
  12. Source the overcloud access details:

    (undercloud) $ source ~/overcloudrc
  13. Check the overcloud network services:

    (overcloud) $ openstack network agent list

    All agents should be Alive and their state should be UP.

  14. Check the overcloud compute services:

    (overcloud) $ openstack compute service list

    All services should have a status of enabled and a state of up.

  15. Check the overcloud volume services:

    (overcloud) $ openstack volume service list

    All services should have a status of enabled and a state of up.

2.5. Backing up the Undercloud

A full undercloud backup includes the following databases and files:

  • All MariaDB databases on the undercloud node
  • MariaDB configuration file on the undercloud (so that you can accurately restore databases)
  • All swift data in /srv/node
  • All data in the stack user home directory: /home/stack
  • The undercloud SSL certificates:

    • /etc/pki/ca-trust/source/anchors/ca.crt.pem
    • /etc/pki/instack-certs/undercloud.pem
Note

Confirm that you have sufficient disk space available before performing the backup process. Expect the tarball to be at least 3.5 GB, and it is likely to be larger.

Procedure

  1. Log into the undercloud as the root user.
  2. Back up the database:

    # mysqldump --opt --all-databases > /root/undercloud-all-databases.sql
  3. Archive the database backup and the configuration files:

    # tar --xattrs -czf undercloud-backup-`date +%F`.tar.gz /root/undercloud-all-databases.sql /etc/my.cnf.d/server.cnf /srv/node /home/stack /etc/pki/instack-certs/undercloud.pem /etc/pki/ca-trust/source/anchors/ca.crt.pem

    This creates a file named undercloud-backup-[timestamp].tar.gz.
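
    Optionally, confirm that the archive is readable by listing its contents. This assumes a single backup archive in the current directory:

    # tar -tzf undercloud-backup-*.tar.gz | head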

Related Information

  • If you need to restore the undercloud backup, see the "Restore" chapter in the Back Up and Restore the Director Undercloud guide.

2.6. Updating the Current Undercloud Packages

The director provides commands to update the packages on the undercloud node. This allows you to perform a minor update within the current version of your OpenStack Platform environment: in this case, a minor update within Red Hat OpenStack Platform 11.

Prerequisites

  • You have performed a backup of the undercloud.

Procedure

  1. Log into the director as the stack user.
  2. Update the python-tripleoclient package and its dependencies to ensure you have the latest scripts for the minor version update:

    $ sudo yum update -y python-tripleoclient
  3. Run the openstack undercloud upgrade command to update the undercloud environment:

    $ openstack undercloud upgrade
  4. Reboot the node:

    $ sudo reboot
  5. Wait until the node boots.
  6. Check the status of all services:

    $ sudo systemctl list-units "openstack*" "neutron*" "openvswitch*"
    Note

    It might take approximately 10 minutes for the openstack-nova-compute service to become active after a reboot.

  7. Verify the existence of your overcloud and its nodes:

    $ source ~/stackrc
    $ openstack server list
    $ openstack baremetal node list
    $ openstack stack list

2.7. Updating the Current Overcloud Images

The undercloud update process might download new image archives from the rhosp-director-images and rhosp-director-images-ipa packages. This process updates these images on your undercloud within Red Hat OpenStack Platform 11.

Prerequisites

  • You have updated to the latest minor release of your current undercloud version.

Procedure

  1. Check the yum log to determine if new image archives are available:

    $ sudo grep "rhosp-director-images" /var/log/yum.log
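
    If new image archives were installed, the command returns entries similar to the following, where the package versions are illustrative:

    Jan 01 12:00:00 Updated: rhosp-director-images-11.0-20180101.1.el7ost.noarch
    Jan 01 12:00:00 Updated: rhosp-director-images-ipa-11.0-20180101.1.el7ost.noarch
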
  2. If new archives are available, replace your current images with new images. To install the new images, first remove any existing images from the images directory on the stack user’s home (/home/stack/images):

    $ rm -rf ~/images/*
  3. Extract the archives:

    $ cd ~/images
    $ for i in /usr/share/rhosp-director-images/overcloud-full-latest-11.0.tar /usr/share/rhosp-director-images/ironic-python-agent-latest-11.0.tar; do tar -xvf $i; done
  4. Import the latest images into the director and configure nodes to use the new images:

    $ cd ~
    $ openstack overcloud image upload --update-existing --image-path /home/stack/images/
    $ openstack overcloud node configure $(openstack baremetal node list -c UUID -f csv --quote none | sed "1d" | paste -s -d " ")
  5. To finalize the image update, verify the existence of the new images:

    $ openstack image list
    $ ls -l /httpboot

    The director is now updated and using the latest images. You do not need to restart any services after the update.

2.8. Updating the Current Overcloud Packages

The director provides commands to update the packages on all overcloud nodes. This allows you to perform a minor update within the current version of your OpenStack Platform environment: in this case, a minor update within Red Hat OpenStack Platform 11.

Prerequisites

  • You have updated to the latest minor release of your current undercloud version.
  • You have performed a backup of the overcloud.

Procedure

  1. Update the current plan using your original openstack overcloud deploy command and including the --update-plan-only option. For example:

    $ openstack overcloud deploy --update-plan-only \
      --templates  \
      -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
      -e /home/stack/templates/network-environment.yaml \
      -e /home/stack/templates/storage-environment.yaml \
      -e /home/stack/templates/rhel-registration/environment-rhel-registration.yaml \
      [-e <environment_file>|...]

    The --update-plan-only option updates only the Overcloud plan stored in the director. Use the -e option to include environment files relevant to your Overcloud and its update path. The order of the environment files is important, as the parameters and resources defined in subsequent environment files take precedence. Use the following list as an example of the environment file order:

    • Any network isolation files, including the initialization file (environments/network-isolation.yaml) from the heat template collection and then your custom NIC configuration file.
    • Any external load balancing environment files.
    • Any storage environment files.
    • Any environment files for Red Hat CDN or Satellite registration.
    • Any other custom environment files.
  2. Perform a package update on all nodes using the openstack overcloud update command. For example:

    $ openstack overcloud update stack -i overcloud

    The -i option runs the update in interactive mode, updating the nodes one at a time. When the update process completes a node update, the script provides a breakpoint for you to confirm before it continues. Without the -i option, the update remains paused at the first breakpoint. Therefore, it is mandatory to include the -i option.

    Note

    Running an update on all nodes in parallel can cause problems. For example, an update of a package might involve restarting a service, which can disrupt other nodes. This is why the process updates each node using a set of breakpoints. This means nodes are updated one by one. When one node completes the package update, the update process moves to the next node.

  3. The update process starts. During this process, the director reports an IN_PROGRESS status and periodically prompts you to clear breakpoints. For example:

    not_started: [u'overcloud-controller-0', u'overcloud-controller-1', u'overcloud-controller-2']
    on_breakpoint: [u'overcloud-compute-0']
    Breakpoint reached, continue? Regexp or Enter=proceed, no=cancel update, C-c=quit interactive mode:

    Press Enter to clear the breakpoint from the last node on the on_breakpoint list. This begins the update for that node. You can also type a node name to clear a breakpoint on a specific node, or a Python-based regular expression to clear breakpoints on multiple nodes at once. However, it is not recommended to clear breakpoints on multiple Controller nodes at once. Continue this process until all nodes have completed their update.
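
    For example, to clear the breakpoints on all Compute nodes at once, you could enter a pattern similar to the following at the prompt (the node naming is illustrative):

    Breakpoint reached, continue? Regexp or Enter=proceed, no=cancel update, C-c=quit interactive mode: overcloud-compute-.*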

  4. The update command reports a COMPLETE status when the update completes:

    ...
    IN_PROGRESS
    IN_PROGRESS
    IN_PROGRESS
    COMPLETE
    update finished with status COMPLETE
  5. If you configured fencing for your Controller nodes, the update process might disable it. When the update process completes, re-enable fencing with the following command on one of the Controller nodes:

    $ sudo pcs property set stonith-enabled=true
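
    To verify that fencing is enabled again, check the property value:

    $ sudo pcs property show stonith-enabled

    The output should report stonith-enabled: true.
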
  6. The update process does not reboot any nodes in the Overcloud automatically. Updates to the kernel or Open vSwitch require a reboot. Check the /var/log/yum.log file on each node to see if either the kernel or openvswitch packages have updated their major or minor versions. If they have, reboot each node using the "Rebooting Nodes" procedures in the Director Installation and Usage guide.