Chapter 2. Preparing for an OpenStack Platform upgrade
This process prepares your OpenStack Platform environment for the upgrade. It involves the following steps:
- Backing up both the undercloud and overcloud.
- Updating the undercloud to the latest minor version of OpenStack Platform 10, including the latest Open vSwitch.
- Rebooting the undercloud in case a newer kernel or newer system packages are installed.
- Updating the overcloud to the latest minor version of OpenStack Platform 10, including the latest Open vSwitch.
- Rebooting the overcloud nodes in case a newer kernel or newer system packages are installed.
- Performing validation checks on both the undercloud and overcloud.
These procedures ensure your OpenStack Platform environment is in the best possible state before proceeding with the upgrade.
2.1. Creating a baremetal Undercloud backup
A full undercloud backup includes the following databases and files:
- All MariaDB databases on the undercloud node
- MariaDB configuration file on the undercloud (so that you can accurately restore databases)
- The configuration data: /etc
- Log data: /var/log
- Image data: /var/lib/glance
- Certificate generation data if using SSL: /var/lib/certmonger
- Any container image data: /var/lib/docker and /var/lib/registry
- All swift data: /srv/node
- All data in the stack user home directory: /home/stack
Confirm that you have sufficient disk space available on the undercloud before performing the backup process. Expect the archive file to be at least 3.5 GB, if not larger.
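For example, a quick way to check the available space from any shell on the undercloud (a minimal check; the backup in this procedure is written under /backup on the root file system, so adjust the path if your layout differs):
$ df -h /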
Procedure
- Log into the undercloud as the root user. Back up the database:
[root@director ~]# mysqldump --opt --all-databases > /root/undercloud-all-databases.sql
Create a backup directory and change the user ownership of the directory to the stack user:
[root@director ~]# mkdir /backup
[root@director ~]# chown stack: /backup
You will use this directory to store the archive containing the undercloud database and file system.
Change to the backup directory:
[root@director ~]# cd /backup
Archive the database backup and the configuration files:
[root@director ~]# tar --xattrs --xattrs-include='*.*' --ignore-failed-read -cf \
  undercloud-backup-$(date +%F).tar \
  /root/undercloud-all-databases.sql \
  /etc \
  /var/log \
  /var/lib/glance \
  /var/lib/certmonger \
  /var/lib/docker \
  /var/lib/registry \
  /srv/node \
  /root \
  /home/stack
- The --ignore-failed-read option skips any directory that does not apply to your undercloud.
- The --xattrs and --xattrs-include='*.*' options include extended attributes, which are required to store metadata for Object Storage (swift) and SELinux.
This creates a file named undercloud-backup-<date>.tar, where <date> is the system date. Copy this tar file to a secure location.
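For example, one way to copy the archive off the undercloud (a sketch only; the destination host and path are hypothetical, so substitute a secure location in your environment):
[root@director ~]# scp /backup/undercloud-backup-$(date +%F).tar backup@backup.example.com:/srv/backups/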
Related Information
- If you need to restore the undercloud backup, see Appendix A, Restoring the undercloud.
2.2. Backing up the overcloud control plane services
The following procedure creates a backup of the overcloud databases and configuration. A backup of the overcloud database and services provides a snapshot of a working environment, which helps if you need to restore the overcloud to its original state after an operational failure.
This procedure only includes crucial control plane services. It does not include backups of Compute node workloads, data on Ceph Storage nodes, or any additional services.
Procedure
Perform the database backup:
Log into a Controller node. You can access the overcloud from the undercloud:
$ ssh heat-admin@192.0.2.100
Change to the root user:
$ sudo -i
Create a temporary directory to store the backups:
# mkdir -p /var/tmp/mysql_backup/
Obtain the database password and store it in the MYSQLDBPASS environment variable. The password is stored in the mysql::server::root_password variable within the /etc/puppet/hieradata/service_configs.json file. Use the following command to store the password:
# MYSQLDBPASS=$(sudo hiera -c /etc/puppet/hiera.yaml mysql::server::root_password)
Back up the database:
# mysql -uroot -p$MYSQLDBPASS -s -N -e "select distinct table_schema from information_schema.tables where engine='innodb' and table_schema != 'mysql';" | xargs mysqldump -uroot -p$MYSQLDBPASS --single-transaction --databases > /var/tmp/mysql_backup/openstack_databases-$(date +%F)-$(date +%T).sql
This dumps a database backup called /var/tmp/mysql_backup/openstack_databases-<date>.sql, where <date> is the system date and time. Copy this database dump to a secure location.
Back up all the users and permissions information:
# mysql -uroot -p$MYSQLDBPASS -s -N -e "SELECT CONCAT('\"SHOW GRANTS FOR ''',user,'''@''',host,''';\"') FROM mysql.user where (length(user) > 0 and user NOT LIKE 'root')" | xargs -n1 mysql -uroot -p$MYSQLDBPASS -s -N -e | sed 's/$/;/' > /var/tmp/mysql_backup/openstack_databases_grants-$(date +%F)-$(date +%T).sql
This dumps a database backup called /var/tmp/mysql_backup/openstack_databases_grants-<date>.sql, where <date> is the system date and time. Copy this database dump to a secure location.
Back up the Pacemaker configuration:
- Log into a Controller node.
Run the following command to create an archive of the current Pacemaker configuration:
# sudo pcs config backup pacemaker_controller_backup
- Copy the resulting archive (pacemaker_controller_backup.tar.bz2) to a secure location.
Back up the OpenStack Telemetry database:
Connect to any controller and get the IP of the MongoDB primary instance:
# MONGOIP=$(sudo hiera -c /etc/puppet/hiera.yaml mongodb::server::bind_ip)
Create the backup:
# mkdir -p /var/tmp/mongo_backup/
# mongodump --oplog --host $MONGOIP --out /var/tmp/mongo_backup/
- Copy the database dump in /var/tmp/mongo_backup/ to a secure location.
Back up the Redis cluster:
Obtain the Redis endpoint from HAProxy:
# REDISIP=$(sudo hiera -c /etc/puppet/hiera.yaml redis_vip)
Obtain the master password for the Redis cluster:
# REDISPASS=$(sudo hiera -c /etc/puppet/hiera.yaml redis::masterauth)
Check connectivity to the Redis cluster:
# redis-cli -a $REDISPASS -h $REDISIP ping
Dump the Redis database:
# redis-cli -a $REDISPASS -h $REDISIP bgsave
This stores the database backup in the default /var/lib/redis/ directory. Copy this database dump to a secure location.
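For example, on the node that holds the Redis master, you can copy the dump into a separate backup directory after the bgsave completes (a minimal sketch; it assumes the default Redis dump file name dump.rdb, so check the dbfilename setting in your Redis configuration if unsure):
# mkdir -p /var/tmp/redis_backup/
# cp /var/lib/redis/dump.rdb /var/tmp/redis_backup/dump-$(date +%F).rdb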
Back up the filesystem on each Controller node:
Create a directory for the backup:
# mkdir -p /var/tmp/filesystem_backup/
Run the following tar command:
# tar --acls --ignore-failed-read --xattrs --xattrs-include='*.*' \
  -zcvf /var/tmp/filesystem_backup/`hostname`-filesystem-`date '+%Y-%m-%d-%H-%M-%S'`.tar \
  /etc \
  /srv/node \
  /var/log \
  /var/lib/nova \
  --exclude /var/lib/nova/instances \
  /var/lib/glance \
  /var/lib/keystone \
  /var/lib/cinder \
  /var/lib/heat \
  /var/lib/heat-config \
  /var/lib/heat-cfntools \
  /var/lib/rabbitmq \
  /var/lib/neutron \
  /var/lib/haproxy \
  /var/lib/openvswitch \
  /var/lib/redis \
  /var/lib/os-collect-config \
  /usr/libexec/os-apply-config \
  /usr/libexec/os-refresh-config \
  /home/heat-admin
The --ignore-failed-read option ignores any missing directories, which is useful if certain services are not used or separated on their own custom roles.
- Copy the resulting tar file to a secure location.
Archive deleted rows on the overcloud:
Check for deleted instances that are not yet archived:
$ source ~/overcloudrc
$ nova list --all-tenants --deleted
If the command lists any deleted instances, archive them by entering the following command on one of the overcloud Controller nodes:
# su - nova -s /bin/bash -c "nova-manage --debug db archive_deleted_rows --max_rows 1000"
Rerun this command until you have archived all deleted instances.
Purge all the archived deleted instances by entering the following command on one of the overcloud Controller nodes:
# su - nova -s /bin/bash -c "nova-manage --debug db purge --all --all-cells"
Verify that there are no remaining archived deleted instances:
$ nova list --all-tenants --deleted
Related Information
- If you need to restore the overcloud backup, see Appendix B, Restoring the overcloud.
2.3. Updating the current undercloud packages for OpenStack Platform 10.z
The director provides commands to update the packages on the undercloud node. This allows you to perform a minor update within the current version of your OpenStack Platform environment, in this case a minor update within OpenStack Platform 10.
This step also updates the undercloud operating system to the latest version of Red Hat Enterprise Linux 7 and Open vSwitch to version 2.9.
Procedure
- Log in to the undercloud as the stack user.
Stop the main OpenStack Platform services:
$ sudo systemctl stop 'openstack-*' 'neutron-*' httpd
Note: This causes a short period of downtime for the undercloud. The overcloud is still functional during the undercloud upgrade.
Set the RHEL version to RHEL 7.7:
$ sudo subscription-manager release --set=7.7
Update the python-tripleoclient package and its dependencies to ensure you have the latest scripts for the minor version update:
$ sudo yum update -y python-tripleoclient
Run the openstack undercloud upgrade command:
$ openstack undercloud upgrade
- Wait until the command completes its execution.
Reboot the undercloud to update the operating system’s kernel and other system packages:
$ sudo reboot
- Wait until the node boots.
- Log into the undercloud as the stack user.
In addition to undercloud package updates, it is recommended to keep your overcloud images up to date so that the image configuration stays in sync with the latest openstack-tripleo-heat-templates
package. This ensures successful deployment and scaling operations between the current preparation stage and the actual fast forward upgrade. The next section shows how to update your images in this scenario. If you plan to upgrade your environment immediately after preparing it, you can skip the next section.
2.4. Preparing updates for NFV-enabled environments
If your environment has network function virtualization (NFV) enabled, follow these steps after you update your undercloud, and before you update your overcloud.
Procedure
Change the vhost user socket directory in a custom environment file, for example, network-environment.yaml:
parameter_defaults:
  NeutronVhostuserSocketDir: "/var/lib/vhost_sockets"
Add the ovs-dpdk-permissions.yaml file to your openstack overcloud deploy command to configure the qemu group setting as hugetlbfs for OVS-DPDK:
-e environments/ovs-dpdk-permissions.yaml
- Ensure that vHost user ports for all instances are in dpdkvhostuserclient mode. For more information, see Manually changing the vhost user port mode.
2.5. Updating the current overcloud images for OpenStack Platform 10.z
The undercloud update process might download new image archives from the rhosp-director-images and rhosp-director-images-ipa packages. This process updates these images on your undercloud within Red Hat OpenStack Platform 10.
Prerequisites
- You have updated to the latest minor release of your current undercloud version.
Procedure
Check the yum log to determine if new image archives are available:
$ sudo grep "rhosp-director-images" /var/log/yum.log
If new archives are available, replace your current images with the new images. To install the new images, first remove any existing images from the images directory in the stack user’s home (/home/stack/images):
$ rm -rf ~/images/*
On the undercloud node, source the undercloud credentials:
$ source ~/stackrc
Extract the archives:
$ cd ~/images
$ for i in /usr/share/rhosp-director-images/overcloud-full-latest-10.0.tar /usr/share/rhosp-director-images/ironic-python-agent-latest-10.0.tar; do tar -xvf $i; done
Import the latest images into director and configure nodes to use the new images:
$ cd ~/images
$ openstack overcloud image upload --update-existing --image-path /home/stack/images/
$ openstack overcloud node configure $(openstack baremetal node list -c UUID -f csv --quote none | sed "1d" | paste -s -d " ")
To finalize the image update, verify the existence of the new images:
$ openstack image list
$ ls -l /httpboot
Director also retains the old images and renames them using the timestamp of when they were updated. If you no longer need these images, delete them.
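For example, after you confirm with openstack image list which of the old, timestamp-renamed images you no longer need, you can remove one (the placeholder below stands for the actual image name or ID in your environment):
$ openstack image delete <old_image_name>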
Director is now updated and using the latest images. You do not need to restart any services after the update.
The undercloud is now using updated OpenStack Platform 10 packages. Next, update the overcloud to the latest minor release.
2.6. Updating the current overcloud packages for OpenStack Platform 10.z
The director provides commands to update the packages on all overcloud nodes. This allows you to perform a minor update within the current version of your OpenStack Platform environment, in this case a minor update within Red Hat OpenStack Platform 10.
This step also updates the overcloud nodes' operating system to the latest version of Red Hat Enterprise Linux 7 and Open vSwitch to version 2.9.
Prerequisites
- You have updated to the latest minor release of your current undercloud version.
- You have performed a backup of the overcloud.
Procedure
Check your subscription management configuration for the rhel_reg_release parameter. If this parameter is not set, you must include it and set it to version 7.7:
parameter_defaults:
  ...
  rhel_reg_release: "7.7"
Ensure that you save the changes to the overcloud subscription management environment file.
Update the current plan using your original openstack overcloud deploy command and including the --update-plan-only option. For example:
$ openstack overcloud deploy --update-plan-only \
  --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /home/stack/templates/network-environment.yaml \
  -e /home/stack/templates/storage-environment.yaml \
  -e /home/stack/templates/rhel-registration/environment-rhel-registration.yaml \
  [-e <environment_file>|...]
The --update-plan-only option only updates the overcloud plan stored in the director. Use the -e option to include environment files relevant to your overcloud and its update path. The order of the environment files is important as the parameters and resources defined in subsequent environment files take precedence. Use the following list as an example of the environment file order:
- Any network isolation files, including the initialization file (environments/network-isolation.yaml) from the heat template collection, and then your custom NIC configuration file.
- Any external load balancing environment files.
- Any storage environment files.
- Any environment files for Red Hat CDN or Satellite registration.
- Any other custom environment files.
Create a static inventory file of your overcloud:
$ tripleo-ansible-inventory --ansible_ssh_user heat-admin --static-yaml-inventory ~/inventory.yaml
If you use an overcloud name different to the default overcloud name of overcloud, set the name of your overcloud with the --plan option.
Create a playbook that contains a task to set the operating system version to Red Hat Enterprise Linux 7.7 on all nodes:
$ cat > ~/set_release.yaml <<'EOF'
- hosts: all
  gather_facts: false
  tasks:
    - name: set release to 7.7
      command: subscription-manager release --set=7.7
      become: true
EOF
Run the set_release.yaml playbook:
$ ansible-playbook -i ~/inventory.yaml -f 25 ~/set_release.yaml --limit undercloud,Controller,Compute
Use the --limit option to apply the content to all Red Hat OpenStack Platform nodes.
Perform a package update on all nodes using the openstack overcloud update command:
$ openstack overcloud update stack -i overcloud
The -i option runs an interactive mode to update each node sequentially. When the update process completes a node update, the script provides a breakpoint for you to confirm. Without the -i option, the update remains paused at the first breakpoint; therefore, it is mandatory to include the -i option.
The script performs the following functions:
The script runs on nodes one-by-one:
- For Controller nodes, this means a full package update.
- For other nodes, this means an update of Puppet modules only.
Puppet runs on all nodes at once:
- For Controller nodes, the Puppet run synchronizes the configuration.
- For other nodes, the Puppet run updates the rest of the packages and synchronizes the configuration.
The update process starts. During this process, the director reports an IN_PROGRESS status and periodically prompts you to clear breakpoints. For example:
starting package update on stack overcloud
IN_PROGRESS
IN_PROGRESS
WAITING
on_breakpoint: [u'overcloud-compute-0', u'overcloud-controller-2', u'overcloud-controller-1', u'overcloud-controller-0']
Breakpoint reached, continue? Regexp or Enter=proceed (will clear 49913767-e2dd-4772-b648-81e198f5ed00), no=cancel update, C-c=quit interactive mode:
Press Enter to clear the breakpoint from the last node on the on_breakpoint list. This begins the update for that node.
The script automatically predefines the update order of nodes:
- Each Controller node individually
- Each Compute node individually
- Each Ceph Storage node individually
- All other nodes individually
It is recommended to use this order to ensure a successful update, specifically:
- Clear the breakpoint of each Controller node individually. Each Controller node requires an individual package update in case the node’s services must restart after the update. This reduces disruption to highly available services on other Controller nodes.
- After the Controller node update, clear the breakpoints for each Compute node. You can also type a Compute node name to clear a breakpoint on a specific node or use a Python-based regular expression to clear breakpoints on multiple Compute nodes at once (see the example after this list).
- Clear the breakpoints for each Ceph Storage node. You can also type a Ceph Storage node name to clear a breakpoint on a specific node or use a Python-based regular expression to clear breakpoints on multiple Ceph Storage nodes at once.
- Clear any remaining breakpoints to update the remaining nodes. You can also type a node name to clear a breakpoint on a specific node or use a Python-based regular expression to clear breakpoints on multiple nodes at once.
- Wait until all nodes have completed their update.
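For example, to clear the breakpoints on all Compute nodes in a single step, you can type a regular expression at the breakpoint prompt instead of pressing Enter; this assumes the default overcloud-compute-<index> node names:
overcloud-compute-.*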
The update command reports a COMPLETE status when the update completes:
...
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
COMPLETE
update finished with status COMPLETE
If you configured fencing for your Controller nodes, the update process might disable it. When the update process completes, re-enable fencing with the following command on one of the Controller nodes:
$ sudo pcs property set stonith-enabled=true
The update process does not reboot any nodes in the overcloud automatically. Updates to the kernel and other system packages require a reboot. Check the /var/log/yum.log file on each node to see if either the kernel or openvswitch packages have updated their major or minor versions. If they have, reboot each node using the following procedures.
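For example, a quick way to check both packages in the log on a single node (a minimal sketch; run it on each overcloud node):
$ sudo grep -E "kernel|openvswitch" /var/log/yum.log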
2.7. Rebooting controller and composable nodes
The following procedure reboots controller nodes and standalone nodes based on composable roles. This excludes Compute nodes and Ceph Storage nodes.
Procedure
- Log in to the node that you want to reboot.
Optional: If the node uses Pacemaker resources, stop the cluster:
[heat-admin@overcloud-controller-0 ~]$ sudo pcs cluster stop
Reboot the node:
[heat-admin@overcloud-controller-0 ~]$ sudo reboot
- Wait until the node boots.
Check the services. For example:
If the node uses Pacemaker services, check that the node has rejoined the cluster:
[heat-admin@overcloud-controller-0 ~]$ sudo pcs status
If the node uses Systemd services, check that all services are enabled:
[heat-admin@overcloud-controller-0 ~]$ sudo systemctl status
- Repeat these steps for all Controller and composable nodes.
2.8. Rebooting a Ceph Storage (OSD) cluster
The following procedure reboots a cluster of Ceph Storage (OSD) nodes.
Procedure
Log in to a Ceph MON or Controller node and disable Ceph Storage cluster rebalancing temporarily:
$ sudo ceph osd set noout
$ sudo ceph osd set norebalance
- Select the first Ceph Storage node to reboot and log into it.
Reboot the node:
$ sudo reboot
- Wait until the node boots.
Log in to a Ceph MON or Controller node and check the cluster status:
$ sudo ceph -s
Check that the pgmap reports all pgs as normal (active+clean).
- Log out of the Ceph MON or Controller node, reboot the next Ceph Storage node, and check its status. Repeat this process until you have rebooted all Ceph Storage nodes.
When complete, log into a Ceph MON or Controller node and enable cluster rebalancing again:
$ sudo ceph osd unset noout
$ sudo ceph osd unset norebalance
Perform a final status check to verify the cluster reports HEALTH_OK:
$ sudo ceph status
2.9. Rebooting Compute nodes
Rebooting a Compute node involves the following workflow:
- Select a Compute node to reboot and disable it so that it does not provision new instances.
- Migrate the instances to another Compute node to minimize instance downtime.
- Reboot the empty Compute node and enable it.
Procedure
- Log in to the undercloud as the stack user.
To identify the Compute node that you intend to reboot, list all Compute nodes:
$ source ~/stackrc
(undercloud) $ openstack server list --name compute
From the overcloud, select a Compute node and disable it:
$ source ~/overcloudrc
(overcloud) $ openstack compute service list
(overcloud) $ openstack compute service set <hostname> nova-compute --disable
List all instances on the Compute node:
(overcloud) $ openstack server list --host <hostname> --all-projects
- Migrate your instances. For more information on migration strategies, see Migrating virtual machines between Compute nodes.
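For example, one possible approach is a live migration with the nova client, provided that live migration is configured in your environment; the placeholders follow the same convention as the commands above:
(overcloud) $ nova live-migration <instance_id> <target_hostname>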
Log into the Compute node and reboot it:
[heat-admin@overcloud-compute-0 ~]$ sudo reboot
- Wait until the node boots.
Enable the Compute node:
$ source ~/overcloudrc
(overcloud) $ openstack compute service set <hostname> nova-compute --enable
Verify that the Compute node is enabled:
(overcloud) $ openstack compute service list
2.10. Verifying system packages
Before the upgrade, the undercloud node and all overcloud nodes should be using the latest versions of the following packages:
Package | Version
---|---
openvswitch | At least 2.9
qemu-img-rhev | At least 2.10
qemu-kvm-common-rhev | At least 2.10
qemu-kvm-rhev | At least 2.10
qemu-kvm-tools-rhev | At least 2.10
Procedure
- Log into a node.
Run yum to check the system packages:
$ sudo yum list qemu-img-rhev qemu-kvm-common-rhev qemu-kvm-rhev qemu-kvm-tools-rhev openvswitch
Run ovs-vsctl to check the version currently running:
$ sudo ovs-vsctl --version
- Repeat these steps for all nodes.
The undercloud now uses updated OpenStack Platform 10 packages. Use the next few procedures to check that the system is in a working state.
2.11. Validating an OpenStack Platform 10 undercloud
The following is a set of steps to check the functionality of your Red Hat OpenStack Platform 10 undercloud before an upgrade.
Procedure
Source the undercloud access details:
$ source ~/stackrc
Check for failed Systemd services:
$ sudo systemctl list-units --state=failed 'openstack*' 'neutron*' 'httpd' 'docker'
Check the undercloud free space:
$ df -h
Use the "Undercloud Requirements" as a basis to determine if you have adequate free space.
If you have NTP installed on the undercloud, check the clock is synchronized:
$ sudo ntpstat
Check the undercloud network services:
$ openstack network agent list
All agents should be Alive and their state should be UP.
Check the undercloud compute services:
$ openstack compute service list
All agents' status should be enabled and their state should be up.
Related Information
- The following solution article shows how to remove deleted stack entries in your OpenStack Orchestration (heat) database: https://access.redhat.com/solutions/2215131
2.12. Validating an OpenStack Platform 10 overcloud
The following is a set of steps to check the functionality of your Red Hat OpenStack Platform 10 overcloud before an upgrade.
Procedure
Source the undercloud access details:
$ source ~/stackrc
Check the status of your bare metal nodes:
$ openstack baremetal node list
All nodes should have a valid power state (on) and maintenance mode should be false.
Check for failed Systemd services:
$ for NODE in $(openstack server list -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo systemctl list-units --state=failed 'openstack*' 'neutron*' 'httpd' 'docker' 'ceph*'" ; done
Check the HAProxy connection to all services. Obtain the Control Plane VIP address and authentication information for the haproxy.stats service:
$ NODE=$(openstack server list --name controller-0 -f value -c Networks | cut -d= -f2); ssh heat-admin@$NODE sudo 'grep "listen haproxy.stats" -A 6 /etc/haproxy/haproxy.cfg'
Use the connection and authentication information obtained from the previous step to check the connection status of RHOSP services.
If SSL is not enabled, use these details in the following cURL request:
$ curl -s -u admin:<PASSWORD> "http://<IP ADDRESS>:1993/;csv" | egrep -vi "(frontend|backend)" | awk -F',' '{ print $1" "$2" "$18 }'
If SSL is enabled, use these details in the following cURL request:
$ curl -s -u admin:<PASSWORD> "https://<HOSTNAME>:1993/;csv" | egrep -vi "(frontend|backend)" | awk -F',' '{ print $1" "$2" "$18 }'
Replace the <PASSWORD> and <IP ADDRESS> or <HOSTNAME> values with the respective information from the haproxy.stats service. The resulting list shows the OpenStack Platform services on each node and their connection status.
Check overcloud database replication health:
$ for NODE in $(openstack server list --name controller -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo clustercheck" ; done
Check RabbitMQ cluster health:
$ for NODE in $(openstack server list --name controller -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo rabbitmqctl node_health_check" ; done
Check Pacemaker resource health:
$ NODE=$(openstack server list --name controller-0 -f value -c Networks | cut -d= -f2); ssh heat-admin@$NODE "sudo pcs status"
Look for:
- All cluster nodes online.
- No resources stopped on any cluster nodes.
- No failed pacemaker actions.
Check the disk space on each overcloud node:
$ for NODE in $(openstack server list -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo df -h --output=source,fstype,avail -x overlay -x tmpfs -x devtmpfs" ; done
Check overcloud Ceph Storage cluster health. The following command runs the ceph tool on a Controller node to check the cluster:
$ NODE=$(openstack server list --name controller-0 -f value -c Networks | cut -d= -f2); ssh heat-admin@$NODE "sudo ceph -s"
Check Ceph Storage OSD for free space. The following command runs the ceph tool on a Controller node to check the free space:
$ NODE=$(openstack server list --name controller-0 -f value -c Networks | cut -d= -f2); ssh heat-admin@$NODE "sudo ceph df"
Important: The number of placement groups (PGs) for each Ceph object storage daemon (OSD) must not exceed 250 by default. Upgrading Ceph nodes with more PGs per OSD results in a warning state and might fail the upgrade process. You can increase the number of PGs per OSD before you start the upgrade process. For more information about diagnosing and troubleshooting this issue, see the article OpenStack FFU from 10 to 13 times out because Ceph PGs allocated in one or more OSDs is higher than 250.
Check that clocks are synchronized on overcloud nodes:
$ for NODE in $(openstack server list -f value -c Networks | cut -d= -f2); do echo "=== $NODE ===" ; ssh heat-admin@$NODE "sudo ntpstat" ; done
Source the overcloud access details:
$ source ~/overcloudrc
Check the overcloud network services:
$ openstack network agent list
All agents should be Alive and their state should be UP.
Check the overcloud compute services:
$ openstack compute service list
All agents' status should be enabled and their state should be up.
Check the overcloud volume services:
$ openstack volume service list
All agents' status should be enabled and their state should be up.
Related Information
- Review the article "How can I verify my OpenStack environment is deployed with Red Hat recommended configurations?". This article provides some information on how to check your Red Hat OpenStack Platform environment and tune the configuration to Red Hat’s recommendations.
- Review the article "Database Size Management for Red Hat Enterprise Linux OpenStack Platform" to check and clean unused database records for OpenStack Platform services on the overcloud.
2.13. Finalizing updates for NFV-enabled environments
If your environment has network function virtualization (NFV) enabled, you need to follow these steps after updating your undercloud and overcloud.
Procedure
You need to migrate your existing OVS-DPDK instances to ensure that the vhost socket mode changes from dpdkvhostuser
to dpdkvhostuserclient
mode in the OVS ports. We recommend that you snapshot existing instances and rebuild a new instance based on that snapshot image. See Manage Instance Snapshots for complete details on instance snapshots.
To snapshot an instance and boot a new instance from the snapshot:
Source the overcloud access details:
$ source ~/overcloudrc
Find the server ID for the instance you want to take a snapshot of:
$ openstack server list
Shut down the source instance before you take the snapshot to ensure that all data is flushed to disk:
$ openstack server stop SERVER_ID
Create the snapshot image of the instance:
$ openstack image create --id SERVER_ID SNAPSHOT_NAME
Boot a new instance with this snapshot image:
$ openstack server create --flavor DPDK_FLAVOR --nic net-id=DPDK_NET_ID --image SNAPSHOT_NAME INSTANCE_NAME
Optionally, verify that the new instance status is ACTIVE:
$ openstack server list
Repeat this procedure for all instances that you need to snapshot and relaunch.
2.14. Retaining YUM history
After completing a minor update of the overcloud, retain the yum history. This information is useful in case you need to undo yum transactions for any possible rollback operations.
Procedure
On each node, run the following command to save the entire yum history of the node in a file:
$ sudo yum history list all > /home/heat-admin/$(hostname)-yum-history-all
On each node, run the following command to save the ID of the last yum history item:
$ sudo yum history list all | head -n 5 | tail -n 1 | awk '{print $1}' > /home/heat-admin/$(hostname)-yum-history-all-last-id
- Copy these files to a secure location.
2.15. Next Steps
With the preparation stage complete, you can now perform an upgrade of the undercloud from 10 to 13 using the steps in Chapter 3, Upgrading the undercloud.