Chapter 4. Troubleshooting Director-Based Upgrades
This section provides advice for troubleshooting issues with both the undercloud and overcloud.
4.1. Undercloud Upgrades
In situations where the Undercloud upgrade command (openstack undercloud upgrade) fails, use the following advice to locate the issue blocking upgrade progress:
The openstack undercloud upgrade command prints a progress log while it runs and saves it to .instack/install-undercloud.log. If an error occurs at any point in the upgrade process, the command halts at the point of error. Use this information to identify any issues impeding upgrade progress.
The openstack undercloud upgrade command runs Puppet to configure Undercloud services. This generates useful Puppet reports in the following directories:
/var/lib/puppet/state/last_run_report.yaml - The last Puppet report generated for the Undercloud. This file shows the causes of any failed Puppet actions.
/var/lib/puppet/state/last_run_summary.yaml - A summary of the last Puppet run for the Undercloud.
/var/lib/puppet/reports - All Puppet reports for the Undercloud.
Use this information to identify any issues impeding upgrade progress.
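As a quick first check, you can extract the failed resource count from the summary file with a short shell command. The sample summary below is fabricated for illustration; on a real Undercloud, point the command at /var/lib/puppet/state/last_run_summary.yaml instead:

```shell
# Illustrative sample only: a trimmed-down Puppet last_run_summary.yaml
cat > /tmp/last_run_summary.yaml <<'EOF'
resources:
  failed: 2
  changed: 10
  total: 150
EOF

# Extract the failed resource count; anything other than 0 means
# Puppet actions did not complete and last_run_report.yaml has details
failed=$(awk '/^  failed:/ {print $2}' /tmp/last_run_summary.yaml)
echo "Failed Puppet resources: ${failed}"
```

A non-zero count tells you to inspect last_run_report.yaml for the specific resources that failed.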
Check for any failed services:
$ sudo systemctl --failed -t service
If any services have failed, check their corresponding logs. For example, if openstack-ironic-api failed, use the following commands to check the logs for that service:
$ sudo journalctl -xe -u openstack-ironic-api
$ sudo tail -n 50 /var/log/ironic/ironic-api.log
After correcting the issue impeding the Undercloud upgrade, rerun the upgrade command:
$ openstack undercloud upgrade
The upgrade command begins again and configures the Undercloud.
4.2. Overcloud Upgrades
In situations where an Overcloud upgrade process fails, use the following advice to locate the issue blocking upgrade progress:
Check the stack listing and identify any stacks that have an UPDATE_FAILED status. The following command identifies failed stacks:
$ openstack stack failures list overcloud
View each failed stack and its template to identify how the stack failed:
$ openstack stack show overcloud-Controller-qyoy54dyhrll-1-gtwy5bgta3np
$ openstack stack template show overcloud-Controller-qyoy54dyhrll-1-gtwy5bgta3np
Check that Pacemaker is running correctly on all Controller nodes. If necessary, log into a Controller node and restart the Controller cluster:
$ sudo pcs cluster start
Check the configuration log files for any failures. The /var/run/heat-config/deployed/ directory on each node contains these logs. The files are named in date order and are separated into standard output (*-stdout.log) and standard error (*-stderr.log) logs.
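To find failed steps quickly, you can list only the error logs that actually contain output. The directory and file names below are fabricated for illustration; on a real node, run the find command against /var/run/heat-config/deployed/ instead:

```shell
# Illustrative only: simulate a heat-config log directory
logdir=/tmp/heat-config-demo
mkdir -p "$logdir"
printf 'ok\n' > "$logdir/1-stdout.log"
printf 'Error: step failed\n' > "$logdir/1-stderr.log"
: > "$logdir/2-stderr.log"   # empty error log: this step produced no errors

# Non-empty *-stderr.log files point at the steps that failed
find "$logdir" -name '*-stderr.log' -size +0c | sort
```

Empty error logs are skipped, so the listing contains only the deployment steps worth investigating.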
The director performs a set of validation checks before the upgrade process to make sure the overcloud is in a good state. If the upgrade has failed and you want to retry, you might need to disable these validation checks. To disable these checks, temporarily add the SkipUpgradeConfigTags: [validation] parameter to the parameter_defaults section of an environment file included with your overcloud.
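For example, an environment file containing only this skip setting might look as follows (the filename skip-validation.yaml is only a suggestion; remove the parameter after the upgrade completes so that validations run on future stack updates):

```yaml
parameter_defaults:
  SkipUpgradeConfigTags: [validation]
```

Include the file in your deployment with an additional -e option on the openstack overcloud deploy command.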
After correcting the issue impeding the Overcloud upgrade, check that no resources have an IN_PROGRESS status:
$ openstack stack resource list overcloud -n5 --filter status='*IN_PROGRESS'
If any resources have an IN_PROGRESS status, wait until they either complete or fail.
Rerun the openstack overcloud deploy command for the failed upgrade step you attempted. The following is an example of the first openstack overcloud deploy command in the upgrade process, which includes the major-upgrade-composable-steps.yaml environment file:
$ openstack overcloud deploy --templates \
  --control-scale 3 \
  --compute-scale 3 \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
  -e network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps.yaml \
  --ntp-server pool.ntp.org
The openstack overcloud deploy command retries the Overcloud stack update.