Additional log details to troubleshoot a failed overcloud stack
Environment
- Red Hat OpenStack Platform (RHOSP) 13, 14
Issue
- How to collect the additional information required to troubleshoot an RHOSP 13 deployment?
Resolution
Note: For RHOSP 14 specific logs, refer to the following article: [RHOSP14] Additional log details to troubleshoot the failure overcloud stack
Refer to Director Deployment Guide - Troubleshooting Director Issues to troubleshoot the RHOSP 13 deployment.
-
To review the overcloud nodes and their associated resources, create the script below. It collects the required details in /tmp/OSP13 and archives them as xxx_xxx_Overcloud_details.tar.gz (host name and timestamp) in the current directory. Upload the tar file to the support case that has been opened for Red Hat OpenStack:
cat <<'EOF' > hypervisor.sh
#!/bin/bash
IFS=$'\n'
dir_name=/tmp/OSP13
rm -rf ${dir_name}/*
mkdir -p ${dir_name}
source /home/stack/stackrc
stackname=`openstack stack list -c "Stack Name" -f value`

# Run an undercloud command and append the command and its output to the environment log
director() {
    source /home/stack/stackrc
    echo -e "\n\n[stack@`hostname`~]$ $1 " | tee -a ${dir_name}/`hostname`_env.log
    eval "$1" | tee -a ${dir_name}/`hostname`_env.log
}

# Run a stack command and append the command and its output to the overcloud log
stack() {
    echo -e "\n\n[stack@`hostname`~]$ $1 " | tee -a ${dir_name}/`hostname`_OC.log
    eval "$1" | tee -a ${dir_name}/`hostname`_OC.log
}

# Collect os-collect-config logs and /etc/puppet from every overcloud node
os-collect() {
    source /home/stack/stackrc
    for i in `openstack server list -c Networks -f value | cut -d \= -f 2`
    do
        host_name=`eval 'ssh -l heat-admin '$i' sudo hostname -f'`
        echo -e "\n\n[heat-admin@'$host_name'~]$ sudo journalctl -u os-collect-config" | tee -a ${dir_name}/$host_name-journal.log
        eval 'ssh -l heat-admin '$i' sudo journalctl -u os-collect-config' | tee -a ${dir_name}/$host_name-journal.log
        echo -e "\n\n[heat-admin@'$host_name'~]$ sudo os-collect-config --debug --print" | tee -a ${dir_name}/$host_name-os-collect-config.log
        eval 'ssh -l heat-admin '$i' sudo os-collect-config --debug --print' | tee -a ${dir_name}/$host_name-os-collect-config.log
        eval 'ssh -l heat-admin '$i' sudo tar -czf - /etc/puppet/' > ${dir_name}/$host_name-puppet.tar.gz
    done
}

# Collect the stack environment, file list, and outputs
oc-list() {
    source /home/stack/stackrc
    echo -e "\n\n[stack@`hostname`~]$ openstack stack environment show $stackname"
    eval 'openstack stack environment show $stackname' | tee -a ${dir_name}/`hostname`_oc_env.log
    echo -e "\n\n[stack@`hostname`~]$ openstack stack file list $stackname --fit-width"
    eval 'openstack stack file list $stackname --fit-width' | tee -a ${dir_name}/`hostname`_oc_file_list.log
    echo -e "\n\n[stack@`hostname`~]$ openstack stack output list $stackname" | tee -a ${dir_name}/`hostname`_oc_output_list.log
    eval 'openstack stack output list $stackname' | tee -a ${dir_name}/`hostname`_oc_output_list.log
    openstack stack output list $stackname -c output_key -f value | while read line; do
        echo -e "\n\n[stack@`hostname`~]$ openstack stack output show $stackname $line --fit-width" | tee -a ${dir_name}/`hostname`_oc_output_list.log
        eval 'openstack stack output show $stackname $line --fit-width' | tee -a ${dir_name}/`hostname`_oc_output_list.log
    done
}

os-collect
oc-list

director "openstack stack list"
director "openstack server list"
director "openstack hypervisor list"
director "openstack baremetal node list"
director "openstack flavor list --long"
director "openstack image list --long"
director "openstack overcloud profiles list"

for i in `openstack hypervisor list -c "Hypervisor Hostname" -f value`; do
    director "openstack hypervisor show $i"
done

for i in `openstack baremetal node list -c UUID -f value`; do
    director "openstack baremetal introspection status $i"
    director "openstack baremetal node show $i"
    director "openstack baremetal introspection interface list $i"
    director "openstack baremetal introspection data save $i --file ${dir_name}/$i.log"
    director "openstack baremetal port list --node $i"
    director "openstack baremetal port list --node $i -c UUID -f value | xargs -I {} openstack baremetal port show {}"
done

stack "openstack stack event list --nested-depth 5 $stackname"
stack "openstack stack failures list $stackname --long"
stack "openstack stack resource list -n 5 --filter status=CREATE_FAILED $stackname"
stack "openstack stack resource list -n 5 --filter status=IN_PROGRESS $stackname"
stack "openstack stack resource list --filter status=CREATE_COMPLETE $stackname"
stack "openstack stack list --nested --fit-width"

# Show every resource that failed to create
openstack stack resource list -n 5 --filter status=CREATE_FAILED $stackname -c resource_name -f value | grep '[a-zA-Z]' | sort --unique | while read line; do
    stack "openstack stack resource show --fit-width $stackname $line"
done

# Show every software deployment and its output
openstack stack resource list -n 5 $stackname | grep -i deployment | awk '{print $4}' | grep -v $stackname | while read line; do
    stack "openstack software deployment show $line"
    stack "openstack software deployment output show --all --long $line"
done

stack "openstack stack hook poll --nested-depth 5 --fit-width $stackname"

# %M is minutes (%m would repeat the month)
archive_name="`hostname`_`date '+%F_%H%M%S'`_Overcloud_details.tar.gz"
tar -czf $archive_name ${dir_name}
echo "Archived all data in archive ${archive_name}"
EOF
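The helper functions in the script above share one pattern: print the command in a prompt-style header, run it through eval, and append both to a log with tee. The following is a minimal sketch of that pattern on its own, not a replacement for the script. The function name run_logged and the temporary log file are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical log location for this demo only
log_file=$(mktemp)

# run_logged is an illustrative name; it mirrors the director()/stack() pattern:
# write a prompt-style header, then the command output, to the same log.
run_logged() {
    # $1 is the complete command line passed as a single string
    printf '\n\n[%s@%s ~]$ %s\n' "$(id -un)" "$(uname -n)" "$1" | tee -a "$log_file"
    # 2>&1 also captures error messages, which the original helpers do not do
    eval "$1" 2>&1 | tee -a "$log_file"
}

run_logged "echo hello from the demo"
# %M is minutes; %m is the month
run_logged "date '+%F_%H%M%S'"
```

Capturing stderr as well is a design choice of this sketch: failed commands then leave their error text in the log next to the command that produced it.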
-
Execute the script:
$ chmod +x hypervisor.sh
$ bash hypervisor.sh
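Before uploading, it may be worth confirming that the archive actually contains the collected logs. The sketch below lists the members of a gzip-compressed tar archive. The helper name list_archive and the throwaway demo archive are assumptions made for illustration; in practice the argument would be the archive name printed by hypervisor.sh.

```shell
#!/bin/bash
# list_archive is a hypothetical helper, not part of hypervisor.sh
list_archive() {
    local archive="$1"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # -t lists members, -z reads gzip compression, -f names the archive file
    tar -tzf "$archive"
}

# Demo with a throwaway archive so the sketch runs without an undercloud
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/OSP13"
echo "sample" > "$demo_dir/OSP13/director_env.log"
tar -czf "$demo_dir/demo_Overcloud_details.tar.gz" -C "$demo_dir" OSP13

list_archive "$demo_dir/demo_Overcloud_details.tar.gz"
```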
Diagnostic Steps
- The OpenStack Workflow (mistral) service groups multiple OpenStack tasks into workflows. It uses a set of these workflows to perform common functions across the CLI and web UI.
- This includes bare metal node control, validations, plan management, and overcloud deployment.
-
OpenStack Workflow and action executions also provide robust logging. To diagnose the workflow and action details, create the script below, execute it, and review the details in /tmp/xxx_workflow_xxx.log (host name and timestamp):
cat <<'EOF' > mistral_action_troubleshoot.sh
#!/bin/bash
IFS=$'\n'
# %M is minutes (%m would repeat the month)
file=/tmp/`hostname`_workflow_`date '+%F_%H%M%S'`.log
source /home/stack/stackrc
stackname=`openstack stack list -c "Stack Name" -f value`

# Print a header line and append it to the log
echofun() {
    echo -e "\n\n$1" | tee -a $file
}

# Run a command and append the command and its output to the log
workflow() {
    echofun "[stack@`hostname`~]$ $1 "
    eval "$1" | tee -a $file
}

workflow "openstack workflow execution list --fit-width"

# Show details and output of every workflow execution in ERROR state
for i in `openstack workflow execution list | grep "ERROR" | awk '{print $2}'`; do
    workflow "openstack workflow execution show --fit-width $i"
    workflow "openstack workflow execution output show $i"
done

workflow "openstack action execution list --fit-width"

# Show details and output of every action execution
for i in `openstack action execution list -c ID -f value`
do
    workflow "openstack action execution show $i --fit-width"
    workflow "openstack action execution output show $i"
done
EOF
-
Execute the script:
$ chmod +x mistral_action_troubleshoot.sh
$ bash mistral_action_troubleshoot.sh
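The script selects failed workflow executions by piping the table output through grep "ERROR" and taking the second whitespace-separated field as the ID. The sketch below illustrates how that filter behaves, and shows a stricter column-based variant. The table is an invented placeholder, not real CLI output: its IDs, its three columns, and the position of the State column are assumptions for the demo, and the real command prints more columns.

```shell
#!/bin/bash
# Placeholder table: NOT real CLI output; IDs and columns are invented for the demo
sample_table() {
cat <<'TABLE'
+--------+---------+------------------+
| ID     | State   | State info       |
+--------+---------+------------------+
| id-001 | SUCCESS | None             |
| id-002 | ERROR   | task failed      |
| id-003 | SUCCESS | retried on ERROR |
+--------+---------+------------------+
TABLE
}

# Same approach as the script: any row containing ERROR; field 2 is the ID
loose_ids() {
    sample_table | grep "ERROR" | awk '{print $2}'
}

# Stricter variant: only rows whose State column (third '|'-separated field,
# an assumption of this placeholder table) is exactly ERROR
strict_ids() {
    sample_table | awk -F'|' '$3 ~ /^ *ERROR *$/ { gsub(/ /, "", $2); print $2 }'
}

echo "loose:  $(loose_ids | tr '\n' ' ')"
echo "strict: $(strict_ids | tr '\n' ' ')"
```

The loose filter also picks up rows where ERROR appears in another column. For log collection that is usually harmless, since it only adds extra detail to the log.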
-
If this is for a case that was opened with Red Hat support, attach the resulting tar archive to the case.
-
Generate and provide a sosreport of the impacted node:
sosreport -a --batch --all-logs