Chapter 4. Debugging recommendations and known issues
Review the following section for debugging suggestions that can help you troubleshoot your deployment.
4.1. Known issues
The following list outlines current limitations.
- BZ#1857451 - Ansible forks value should have an upper limit and Current Calculation needs to change
- By default, the Ansible playbooks in mistral are configured to use 10*CPU_COUNT forks in the ansible.cfg file. If you run Ansible against all of the existing nodes without using the --limit option to restrict the execution to a specific node or set of nodes, Ansible consumes almost 100% of available memory.
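As a rough illustration of this limitation, the following sketch compares the default 10*CPU_COUNT calculation with a capped value. The capped_forks function and the ceiling of 100 are hypothetical and chosen only for illustration; they are not part of director or of the change tracked in BZ#1857451.

```shell
#!/bin/sh
# Sketch only: compare the default fork calculation with a capped one.
# The ceiling passed to capped_forks is a hypothetical value, not a
# documented recommendation.

# capped_forks CPU_COUNT MAX_FORKS
# Prints the smaller of 10*CPU_COUNT and MAX_FORKS.
capped_forks() {
    cf_forks=$(( 10 * $1 ))
    if [ "$cf_forks" -gt "$2" ]; then
        cf_forks=$2
    fi
    echo "$cf_forks"
}

# Number of online CPUs on the host that runs the playbooks.
cpu_count=$(getconf _NPROCESSORS_ONLN)

echo "default forks: $(( 10 * cpu_count ))"
echo "capped forks:  $(capped_forks "$cpu_count" 100)"
```

On a host with many CPUs the default value grows without bound, while the capped value stops at the chosen ceiling. A value such as this could then be set as forks in ansible.cfg, assuming that a fixed ceiling suits your environment.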
4.2. Introspection debugging
Review the following list of recommendations when you debug introspection.
- Check your introspection DHCP range and NICs in your undercloud.conf file
- If any of these values are incorrect, fix them, and rerun the openstack undercloud install command.
- Ensure that you do not try to introspect more nodes than your DHCP range allows
- The DHCP lease for each node continues to be active for approximately two minutes after introspection finishes.
- Ensure that target nodes are responsive
- If all nodes fail introspection, ensure that you can ping target nodes over the native VLAN by using the configured NIC and that the out-of-band interface credentials and addresses are correct.
- Check the introspection commands in the console
- For debugging specific nodes, watch the console when the node boots and observe introspection commands to the node. If the node stops before it completes the PXE process, check the connectivity, IP allocation, and the network load. When a node exits the BIOS and boots the introspection image, failures are rare and almost exclusively related to connectivity issues. Ensure that the heartbeat from the introspection image is not interrupted on its way to the undercloud.
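The DHCP range recommendation above can be checked with simple arithmetic. The following sketch counts the addresses in an inclusive IPv4 range and compares the result with the number of nodes that you plan to introspect at once. The function names, the example range, and the node count are placeholders for illustration and are not taken from a real undercloud.conf file.

```shell
#!/bin/sh
# Sketch only: check whether an introspection DHCP range can cover a
# given number of nodes. All addresses and counts are placeholders.

# ip_to_int A.B.C.D
# Prints the IPv4 address as a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# range_size START END
# Prints the number of addresses in the inclusive range.
range_size() {
    echo $(( $(ip_to_int "$2") - $(ip_to_int "$1") + 1 ))
}

# fits_range START END NODE_COUNT
# Succeeds if the range has at least NODE_COUNT addresses.
fits_range() {
    [ "$3" -le "$(range_size "$1" "$2")" ]
}

start=192.168.24.100
end=192.168.24.120
nodes=30

if fits_range "$start" "$end" "$nodes"; then
    echo "range is large enough for $nodes nodes"
else
    echo "range too small: $(range_size "$start" "$end") addresses for $nodes nodes"
fi
```

Because each lease stays active for approximately two minutes after introspection finishes, consider leaving some headroom in the range when you introspect nodes in consecutive batches.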
4.3. Deployment debugging
Use the following recommendations when you debug a deployment.
- Inspect the DHCP servers that provide addresses on the provisioning network
Any additional DHCP servers that supply addresses on the provisioning network can prevent Red Hat OpenStack Platform director from inspecting and provisioning machines.
For DHCP or PXE introspection issues, enter the following command:
$ sudo tcpdump -i any port 67 or port 68 or port 69
For DHCP or PXE deployment issues, enter the following command:
$ sudo ip netns exec qdhcp tcpdump -i <interface> port 67 or port 68 or port 69
- Check the state of your failed or foreign disks
- For failed or foreign disks, check the state of your disks to ensure that, according to the out-of-band management of the machine, the state of the failed or foreign disks is set to Up. Disks can exit the Up state during a deployment cycle and change the order that your disks appear in the base operating system.
- Use the following commands to debug failed overcloud deployments
- openstack stack failures list overcloud
- heat resource-list -n5 overcloud | grep -i fail
- less /var/lib/mistral/config-download-latest/ansible.log
To review the output of the commands, log in to the node where the failure occurs and review the log files in /var/log/ and /var/log/containers/.
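To decide which node to log in to, it can help to extract the failing hosts from ansible.log rather than paging through the whole file. The following sketch lists each host that appears in a "fatal: [host]" line, which is how Ansible reports failed and unreachable hosts. The failed_hosts function is a hypothetical helper, and the sketch assumes that host names in the log match your node names.

```shell
#!/bin/sh
# Sketch only: list the hosts that Ansible reported as failed or
# unreachable in a config-download log.

# failed_hosts LOGFILE
# Prints each host named in a "fatal: [host]" line once, sorted.
failed_hosts() {
    sed -n 's/.*fatal: \[\([^]]*\)].*/\1/p' "$1" | sort -u
}

# Default to the log path used in this section; pass another path as
# the first argument to override it.
log=${1:-/var/lib/mistral/config-download-latest/ansible.log}

if [ -r "$log" ]; then
    failed_hosts "$log"
else
    echo "cannot read $log" >&2
fi
```

Each host that the sketch prints is a candidate for the log review in /var/log/ and /var/log/containers/.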