Chapter 4. Troubleshooting Director-Based Upgrades

This section provides advice for troubleshooting issues with both the undercloud and overcloud.

4.1. Undercloud Upgrades

If the Undercloud upgrade command (openstack undercloud upgrade) fails, use the following advice to locate the issue blocking upgrade progress:

  • The openstack undercloud upgrade command prints out a progress log while it runs and saves it to .instack/install-undercloud.log. If an error occurs at any point in the upgrade process, the command halts at the point of error. Use this information to identify any issues impeding upgrade progress.
  • The openstack undercloud upgrade command runs Puppet to configure Undercloud services. This generates useful Puppet reports in the following locations:

    • /var/lib/puppet/state/last_run_report.yaml - The last Puppet report generated for the Undercloud. This file shows the causes of any failed Puppet actions.
    • /var/lib/puppet/state/last_run_summary.yaml - A summary of the last_run_report.yaml file.
    • /var/lib/puppet/reports - All Puppet reports for the Undercloud.

      Use this information to identify any issues impeding upgrade progress.

  • Check for any failed services:

    $ sudo systemctl -t service

    If any services have failed, check their corresponding logs. For example, if openstack-ironic-api failed, use the following commands to check the logs for that service:

    $ sudo journalctl -xe -u openstack-ironic-api
    $ sudo tail -n 50 /var/log/ironic/ironic-api.log
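The log and report checks above can be collected into small helper functions. The following is a minimal sketch, not part of the director tooling: the function names are hypothetical, the grep pattern is only a rough guess at what an error line looks like, and the awk filter assumes the summary file has a top-level resources: key with an indented failed: count beneath it.

```shell
#!/bin/bash
# Hypothetical helpers for reviewing a failed undercloud upgrade.

# Print the last COUNT log lines that mention an error or failure.
# Usage: show_log_errors LOGFILE [COUNT]
show_log_errors() {
    local logfile="$1" count="${2:-20}"
    grep -iE 'error|fail' "$logfile" | tail -n "$count"
}

# Print the "failed" count from the resources section of a Puppet run summary.
# Assumes a top-level "resources:" key followed by an indented "failed: N" line.
# Usage: puppet_failed_count SUMMARY_FILE
puppet_failed_count() {
    awk '
        # Any unindented line starts a new top-level section.
        /^[^ ]/ { in_resources = ($1 == "resources:") }
        in_resources && $1 == "failed:" { print $2 }
    ' "$1"
}
```

For example, show_log_errors .instack/install-undercloud.log prints the most recent matching lines, and puppet_failed_count /var/lib/puppet/state/last_run_summary.yaml prints the number of failed resources (reading that file might require root privileges). To narrow the service listing to failed units only, systemctl also accepts a state filter: sudo systemctl list-units --type=service --state=failed.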

After correcting the issue impeding the Undercloud upgrade, rerun the upgrade command:

$ openstack undercloud upgrade

The upgrade command begins again and configures the Undercloud.

4.2. Overcloud Upgrades

If the Overcloud upgrade process fails, use the following advice to locate the issue blocking upgrade progress:

  • Check the stack listing and identify any stacks that have an UPDATE_FAILED status. The following command identifies failed stacks:

    $ openstack stack failures list overcloud

    View the failed stack and its template to identify how the stack failed:

    $ openstack stack show overcloud-Controller-qyoy54dyhrll-1-gtwy5bgta3np
    $ openstack stack template show overcloud-Controller-qyoy54dyhrll-1-gtwy5bgta3np
  • Check that Pacemaker is running correctly on all Controller nodes. If necessary, log in to a Controller node and restart the Controller cluster:

    $ sudo pcs cluster start
  • Check the configuration log files for any failures. The /var/run/heat-config/deployed/ directory on each node contains these logs. These files are named in date order and are separated into standard output (*-stdout.log) and error output (*-stderr.log).
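When the configuration log directory holds many files, it can help to list only the error logs that actually contain output. The following is a minimal sketch under that assumption; the function name is hypothetical, and a non-empty stderr log does not necessarily indicate a failure, because some scripts write warnings there.

```shell
#!/bin/bash
# List non-empty *-stderr.log files in a heat-config directory, sorted by name.
# Usage: list_stderr_logs [DIR]
list_stderr_logs() {
    local dir="${1:-/var/run/heat-config/deployed}"
    # -size +0 skips empty files, so only logs with content are printed.
    find "$dir" -maxdepth 1 -type f -name '*-stderr.log' -size +0 | sort
}
```

Run list_stderr_logs on each node (with root privileges if the directory is not readable otherwise), then inspect the listed files and their matching *-stdout.log files.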
Note

The director performs a set of validation checks before the upgrade process to make sure the overcloud is in a good state. If the upgrade has failed and you want to retry, you might need to disable these validation checks. To disable them, temporarily add SkipUpgradeConfigTags: [validation] to the parameter_defaults section of an environment file included with your overcloud deployment.
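One way to keep this change temporary is to put the parameter in its own environment file. The following is a minimal sketch; the function name and the default file name skip-upgrade-validation.yaml are hypothetical, and any path works as long as you pass it to the deploy command with -e.

```shell
#!/bin/bash
# Write a throwaway environment file that skips the pre-upgrade validations.
# Usage: write_skip_validation_env [TARGET_FILE]
write_skip_validation_env() {
    local target="${1:-skip-upgrade-validation.yaml}"
    cat > "$target" <<'EOF'
parameter_defaults:
  SkipUpgradeConfigTags: [validation]
EOF
}
```

After running write_skip_validation_env, add -e skip-upgrade-validation.yaml to the openstack overcloud deploy command for the retry, and drop the file from the command again once the upgrade step succeeds so that the validation checks run on later steps.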

After correcting the issue impeding the Overcloud upgrade, check that no resources have an IN_PROGRESS status:

$ openstack stack resource list overcloud -n5 --filter status='*IN_PROGRESS'

If any resources have an IN_PROGRESS status, wait until they either complete or fail.
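Waiting can be scripted as a polling loop. The following is a minimal sketch with hypothetical function names. It assumes that the listing prints nothing when no resource matches the filter; to make that more likely, the wrapper adds the output options -f value -c resource_name, which are not part of the command shown above. The interval and retry limit are arbitrary.

```shell
#!/bin/bash
# Poll until a listing command prints nothing, or give up after MAX_TRIES.
# Usage: wait_until_idle INTERVAL_SECONDS MAX_TRIES COMMAND [ARGS...]
wait_until_idle() {
    local interval="$1" tries="$2"
    shift 2
    while [ "$tries" -gt 0 ]; do
        # Empty output means no resource is still IN_PROGRESS.
        if [ -z "$("$@")" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Wrapper around the listing command from this section. The value formatter
# and single column are an assumption made to avoid table borders in the output.
list_in_progress() {
    openstack stack resource list overcloud -n5 \
        --filter status='*IN_PROGRESS' -f value -c resource_name
}
```

For example, wait_until_idle 30 40 list_in_progress checks every 30 seconds for up to 40 attempts and returns a non-zero status if resources are still in progress at the end.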

Rerun the openstack overcloud deploy command for the failed upgrade step you attempted. The following is an example of the first openstack overcloud deploy command in the upgrade process, which includes the major-upgrade-composable-steps.yaml environment file:

$ openstack overcloud deploy --templates \
  --control-scale 3 \
  --compute-scale 3 \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
  -e network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps.yaml \
  --ntp-server pool.ntp.org

The openstack overcloud deploy command retries the Overcloud stack update.