Chapter 7. Troubleshooting Director-Based Upgrades

This chapter provides advice for troubleshooting issues with both Undercloud and Overcloud upgrades.

7.1. Undercloud Upgrades

If the Undercloud upgrade command (openstack undercloud upgrade) fails, use the following advice to locate the issue blocking upgrade progress:

  • The openstack undercloud upgrade command prints out a progress log while it runs. If an error occurs at any point in the upgrade process, the command halts at the point of error. Use this information to identify any issues impeding upgrade progress.
  • The openstack undercloud upgrade command runs Puppet to configure Undercloud services. This generates useful Puppet reports in the following locations:

    • /var/lib/puppet/state/last_run_report.yaml - The last Puppet report generated for the Undercloud. This file shows any causes of failed Puppet actions.
    • /var/lib/puppet/state/last_run_summary.yaml - A summary of the last_run_report.yaml file.
    • /var/lib/puppet/reports - All Puppet reports for the Undercloud.

      Use this information to identify any issues impeding upgrade progress.

  • List the services and check for any that have failed:

    $ sudo systemctl -t service

    If any services have failed, check their corresponding logs. For example, if openstack-ironic-api failed, use the following commands to check the logs for that service:

    $ sudo journalctl -xe -u openstack-ironic-api
    $ sudo tail -n 50 /var/log/ironic/ironic-api.log
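
The Puppet summary and the service check above can be narrowed to just the failures. The following is a minimal sketch, not part of the director tooling: the function names are hypothetical, the first helper assumes that last_run_summary.yaml records its counters as indented "failed: N" and "failure: N" lines, and the second assumes a systemd version that supports the --state option.

```shell
# Hypothetical helper: print the failure counters from a Puppet run summary
# read on standard input. Assumes the counters appear as indented
# "failed: N" (resources) and "failure: N" (events) lines. The exact match
# on the key skips similar keys such as "failed_to_restart:".
puppet_failure_counts() {
    awk '$1 == "failed:" || $1 == "failure:" { print $1, $2 }'
}

# Hypothetical helper: list only the service units in the failed state.
# Assumes a systemd version that supports --state.
list_failed_services() {
    sudo systemctl list-units --type=service --state=failed
}

# Example usage (commented out so that sourcing this file runs nothing):
# sudo cat /var/lib/puppet/state/last_run_summary.yaml | puppet_failure_counts
# list_failed_services
```

A non-zero counter in the summary is a prompt to open last_run_report.yaml for the details of the failed resource.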

After correcting the issue impeding the Undercloud upgrade, rerun the upgrade command:

$ openstack undercloud upgrade

The upgrade command begins again and configures the Undercloud.
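
Because the command halts at the point of error, it can help to keep a copy of the progress log from each run. The following is a minimal sketch under stated assumptions: run_logged and the log file name are hypothetical and not part of the director tooling, and the wrapper only relies on standard tee and mktemp behavior.

```shell
# Hypothetical wrapper: run a command, show its output live, and keep a copy
# in a log file. Returns the command's own exit status rather than tee's.
run_logged() {
    log_file="$1"
    shift
    status_file="$(mktemp)" || return 1
    # Record the command's exit status inside the pipeline, because the
    # pipeline itself reports only the status of tee.
    {
        status=0
        "$@" 2>&1 || status=$?
        echo "$status" > "$status_file"
    } | tee "$log_file"
    status="$(cat "$status_file")"
    rm -f "$status_file"
    return "$status"
}

# Example usage (commented out so that sourcing this file runs nothing):
# run_logged undercloud_upgrade.log openstack undercloud upgrade
```

If the rerun fails again, the last lines of the saved log show where the command halted.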

7.2. Overcloud Upgrades

If an Overcloud upgrade process fails, use the following advice to locate the issue blocking upgrade progress:

  • Check the Heat stack listing and identify any stacks that have an UPDATE_FAILED status. The following command identifies these stacks:

    $ heat stack-list --show-nested | awk -F "|" '{ print $3,$4 }' | grep "UPDATE_FAILED" | column -t

    View the failed stack and its template to identify how the stack failed:

    $ heat stack-show overcloud-Controller-qyoy54dyhrll-1-gtwy5bgta3np
    $ heat template-show overcloud-Controller-qyoy54dyhrll-1-gtwy5bgta3np
  • Check that Pacemaker is running correctly on all Controller nodes. If necessary, log in to a Controller node and start the Controller cluster:

    $ sudo pcs cluster start
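
When several nested stacks fail, the stack names can be extracted from the listing and inspected one at a time. The following is a minimal sketch: failed_stack_names is a hypothetical helper, and it assumes the same table layout as the awk command above, with the stack name in the third "|"-separated field and the status in the fourth.

```shell
# Hypothetical helper: read "heat stack-list" table output on standard input
# and print the name of each stack whose status is UPDATE_FAILED.
# Assumes field 3 is the stack name and field 4 is the stack status.
failed_stack_names() {
    awk -F '|' '$4 ~ /UPDATE_FAILED/ { gsub(/ /, "", $3); print $3 }'
}

# Example usage (commented out so that sourcing this file runs nothing):
# heat stack-list --show-nested | failed_stack_names | while read -r stack; do
#     heat stack-show "$stack"
# done
```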

After correcting the issue impeding the Overcloud upgrade, rerun the openstack overcloud deploy command for the upgrade step that failed. The following is an example of the first openstack overcloud deploy command in the upgrade process, which includes the major-upgrade-pacemaker-init.yaml environment file:

$ openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
  -e network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-init.yaml

The openstack overcloud deploy command retries the Overcloud stack update.