11.2. Troubleshooting Hardware Introspection

The discovery and introspection process must run to completion. However, ironic's discovery daemon (ironic-inspector) times out after a default period of one hour if the discovery ramdisk does not respond. A timeout sometimes indicates a bug in the discovery ramdisk, but it is usually caused by an environment misconfiguration, particularly in the BIOS boot settings.
Here are some common scenarios where environment misconfiguration occurs and advice on how to diagnose and resolve them.

Errors with Starting Node Introspection

Normally the introspection process uses the baremetal introspection command, which acts as an umbrella command for ironic's services. However, if you run the introspection directly with ironic-inspector, it might fail to discover nodes in the AVAILABLE state, which is meant for deployment and not for discovery. Change the node status to the MANAGEABLE state before discovery:
$ ironic node-set-provision-state [NODE UUID] manage
Then, when discovery completes, change the node back to the AVAILABLE state before provisioning:
$ ironic node-set-provision-state [NODE UUID] provide
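When several nodes need the same state change, the command above can be wrapped in a small loop. The following is only a sketch: set_nodes_manageable is a hypothetical helper name, not part of the ironic client, and it assumes the ironic command is already on the PATH with credentials loaded.

```shell
# Sketch: move each given node to the MANAGEABLE state before discovery.
# set_nodes_manageable is a hypothetical helper, not an ironic command.
set_nodes_manageable() {
    local uuid
    for uuid in "$@"; do
        # Same command as in the text, applied to one UUID at a time.
        # Stop at the first failure so the problem node is easy to spot.
        ironic node-set-provision-state "$uuid" manage || return 1
    done
}
```

Call it with one or more node UUIDs as arguments, for example set_nodes_manageable [NODE UUID] [NODE UUID].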

Introspected Node Is Not Booting in PXE

Before a node reboots, ironic-inspector adds the MAC address of the node to the ironic-inspector chain of the Undercloud firewall. This allows the node to boot over PXE. To verify that the configuration is correct, run the following command:
$ sudo iptables -L
The output should display the following chain table with the MAC address:
Chain ironic-inspector (1 references)
target     prot opt source               destination
DROP       all  --  anywhere             anywhere             MAC xx:xx:xx:xx:xx:xx
ACCEPT     all  --  anywhere             anywhere
If the MAC address is missing, the most common cause is corruption in the ironic-inspector cache, which is stored in an SQLite database. To fix it, delete the SQLite file:
$ sudo rm /var/lib/ironic-inspector/inspector.sqlite
Then recreate it and restart the service:
$ sudo ironic-inspector-dbsync --config-file /etc/ironic-inspector/inspector.conf upgrade
$ sudo systemctl restart openstack-ironic-inspector
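The manual check of the chain table can be scripted as a simple text match. This is a minimal sketch: mac_in_chain is a hypothetical helper name, and it only looks for the MAC address anywhere in the output it reads from standard input.

```shell
# Sketch: succeed if the given MAC address appears in the iptables
# output read from standard input. mac_in_chain is a hypothetical helper.
mac_in_chain() {
    # -q: print nothing, report the result through the exit status.
    # -i: ignore case, since hex digits may appear in either case.
    # -F: treat the MAC address as a literal string, not a pattern.
    grep -qiF "$1"
}
```

For example, pipe the output of sudo iptables -L ironic-inspector into mac_in_chain followed by the MAC address of the node. An exit status of 0 means the address was found.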

Stopping the Discovery Process

Currently, ironic-inspector does not provide a direct means of stopping discovery. The recommended approach is to wait until the process times out. If necessary, change the timeout setting in /etc/ironic-inspector/inspector.conf to set a different timeout period in seconds; the default value of 3600 corresponds to the one-hour period.
In the worst case, you can stop discovery for all nodes using the following process:

Procedure 11.3. Stopping the Discovery Process

  1. Change the power state of each node to off:
    $ ironic node-set-power-state [NODE UUID] off
    
  2. Remove the ironic-inspector cache and restart the service:
    $ sudo rm /var/lib/ironic-inspector/inspector.sqlite
    $ sudo systemctl restart openstack-ironic-inspector
    
  3. Resynchronize the ironic-inspector cache:
    $ sudo ironic-inspector-dbsync --config-file /etc/ironic-inspector/inspector.conf upgrade
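Step 1 of the procedure above has to be repeated for every node, which can be sketched as a loop. power_off_nodes is a hypothetical helper name, and the sketch assumes the ironic command is available with credentials loaded.

```shell
# Sketch: power off every given node, continuing past failures so that
# one unreachable node does not leave the others running.
# power_off_nodes is a hypothetical helper, not an ironic command.
power_off_nodes() {
    local uuid status=0
    for uuid in "$@"; do
        if ! ironic node-set-power-state "$uuid" off; then
            echo "failed to power off $uuid" >&2
            status=1
        fi
    done
    # Non-zero if at least one node could not be powered off.
    return "$status"
}
```

Unlike a loop that stops at the first error, this version attempts every node and reports failures on standard error, which seems the safer choice when the goal is to halt discovery everywhere.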
    

Accessing the Introspection Ramdisk

The introspection ramdisk uses a dynamic login element. This means you can provide either a temporary password or an SSH key to access the node during introspection debugging. Use the following process to set up ramdisk access:
  1. Provide a temporary password to the openssl passwd -1 command to generate an MD5 hash. For example:
    $ openssl passwd -1 mytestpassword
    $1$enjRSyIw$/fYUpJwr6abFy/d.koRgQ/
    
  2. Edit the /httpboot/inspector.ipxe file, find the line starting with kernel, and append the rootpwd parameter and the MD5 hash. For example:
    kernel http://192.2.0.1:8088/agent.kernel ipa-inspection-callback-url=http://192.168.0.1:5050/v1/continue ipa-inspection-collectors=default,extra-hardware,logs systemd.journald.forward_to_console=yes BOOTIF=${mac} ipa-debug=1 ipa-inspection-benchmarks=cpu,mem,disk rootpwd="$1$enjRSyIw$/fYUpJwr6abFy/d.koRgQ/" selinux=0
    
    Alternatively, you can append the sshkey parameter with your public SSH key.

    Note

    Quotation marks are required for both the rootpwd and sshkey parameters.

  3. Start the introspection and find the IP address from either the arp command or the DHCP logs:
    $ arp
    $ sudo journalctl -u openstack-ironic-inspector-dnsmasq
    
  4. Connect over SSH as the root user with the temporary password or the SSH key:
    $ ssh root@192.0.2.105
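Steps 1 and 2 of the process above can be combined so that the hash is generated and quoted in one go. This is a sketch only: rootpwd_param is a hypothetical helper name, and the resulting text still has to be appended by hand to the kernel line in /httpboot/inspector.ipxe.

```shell
# Sketch: print a rootpwd kernel parameter for a temporary password.
# rootpwd_param is a hypothetical helper name.
rootpwd_param() {
    local hash
    # openssl passwd -1 prints an MD5-based hash, as in step 1.
    hash=$(openssl passwd -1 "$1") || return 1
    # The quotation marks around the hash are required.
    printf 'rootpwd="%s"\n' "$hash"
}
```

For example, rootpwd_param mytestpassword prints a rootpwd="..." string. The hash differs on each run because openssl chooses a random salt.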
    

Checking the Introspection Storage

The director uses OpenStack Object Storage (swift) to save the hardware data obtained during the introspection process. If this service is not running, the introspection can fail. Check all services related to OpenStack Object Storage to ensure that they are running:
$ sudo systemctl list-units 'openstack-swift*'
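To turn the listing into a pass or fail check, each unit can be tested with systemctl is-active. The following is a minimal sketch: units_all_active is a hypothetical helper name, and the unit names to pass in depend on the deployment, so take them from the list-units output instead of assuming them.

```shell
# Sketch: report every given systemd unit that is not active.
# units_all_active is a hypothetical helper name.
units_all_active() {
    local unit status=0
    for unit in "$@"; do
        # is-active --quiet exits 0 only when the unit is active.
        if ! systemctl is-active --quiet "$unit"; then
            echo "$unit is not active"
            status=1
        fi
    done
    # Non-zero if at least one unit was not active.
    return "$status"
}
```

Pass the swift unit names as arguments. No output and an exit status of 0 mean that every listed unit is active.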