Node provisioning fails when the iRMC driver is used for UEFI bare metal nodes

Solution In Progress - Updated -

Environment

  • Red Hat OpenStack Platform 17
  • iRMC driver is used
  • Baremetal nodes with UEFI boot mode

Issue

  • Node provisioning times out and fails for UEFI bare metal nodes that use the iRMC driver.

Resolution

The failure is currently tracked in a private Red Hat Bugzilla.

Workaround

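Add boot_mode:uefi to the capabilities property of each bare metal node. First retrieve the node's existing capabilities, then set the property again with boot_mode:uefi included, because the set command replaces the whole capabilities string: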

$ openstack baremetal node show <node> -f json -c properties | jq -r .properties.capabilities
$ openstack baremetal node set --property capabilities="boot_mode:uefi,<capability_1>,...,<capability_n>" <node>
  • Replace <node> with the ID of the bare metal node.
  • Replace <capability_1> through <capability_n> with the capabilities that the openstack baremetal node show command returned.
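
The two commands above can be combined so that the existing capabilities are carried over without retyping them. The following is a minimal sketch, not a tested procedure. The helper name merge_capabilities is hypothetical, and it assumes that the capabilities value is a comma-separated string as printed by the jq command above (jq -r prints null when no capabilities are set).

```shell
# Hypothetical helper: build a capabilities string that contains boot_mode:uefi
# plus every existing capability except a previous boot_mode entry.
merge_capabilities() {
    existing="$1"
    result="boot_mode:uefi"
    # jq -r prints "null" when the node has no capabilities set.
    if [ -z "$existing" ] || [ "$existing" = "null" ]; then
        printf '%s\n' "$result"
        return 0
    fi
    old_ifs="$IFS"
    IFS=','
    for cap in $existing; do
        case "$cap" in
            ''|boot_mode:*) ;;              # skip empty items and any old boot_mode
            *) result="$result,$cap" ;;
        esac
    done
    IFS="$old_ifs"
    printf '%s\n' "$result"
}

# Usage (requires the openstack client and jq; <node> is a placeholder):
#   caps=$(openstack baremetal node show <node> -f json -c properties | jq -r .properties.capabilities)
#   openstack baremetal node set --property capabilities="$(merge_capabilities "$caps")" <node>
```

The helper only builds the string, so its output can be reviewed before it is passed to openstack baremetal node set.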

Root Cause

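In Red Hat OpenStack Platform 17, the iRMC driver by default uses the BIOS form of the boot management command to manage servers. UEFI systems do not recognize this command, so provisioning times out while waiting for a callback from the node.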

Diagnostic Steps

  • Running openstack overcloud node provision fails with the error message "Timeout reached while waiting for callback for node":
$ openstack overcloud node provision xxx.yaml

(omitted)

2022-10-21 12:13:48.079059 | 5254003a-fc0e-e6be-d9e3-00000000001a |       TASK | Provision instances
2022-10-21 12:45:11.777115 | 5254003a-fc0e-e6be-d9e3-00000000001a |      FATAL | Provision instances | localhost | error={"changed": false, "logging": "Created port Controller-ctlplane (UUID 6dfa7fe6-535d-4e24-8975-7bad6f5a2b8b) for node Controller (UUID 893c099e-eb44-45ea-888f-2ce6e5de132b) with {'network_id': '7fbb7714-b819-4523-a479-03f53111a868', 'name': 'Controller-ctlplane'}\nCreated port Compute-ctlplane (UUID cc4ab661-33a0-4b04-a56c-6fcf50bad161) for node Compute (UUID 9065f9f3-583a-42a8-86e7-c4085afeb71f) with {'network_id': '7fbb7714-b819-4523-a479-03f53111a868', 'name': 'Compute-ctlplane'}\nAttached port Controller-ctlplane (UUID 6dfa7fe6-535d-4e24-8975-7bad6f5a2b8b) to node Controller (UUID 893c099e-eb44-45ea-888f-2ce6e5de132b)\nAttached port Compute-ctlplane (UUID cc4ab661-33a0-4b04-a56c-6fcf50bad161) to node Compute (UUID 9065f9f3-583a-42a8-86e7-c4085afeb71f)\nProvisioning started on node Controller (UUID 893c099e-eb44-45ea-888f-2ce6e5de132b)\nProvisioning started on node Compute (UUID 9065f9f3-583a-42a8-86e7-c4085afeb71f)\n", "msg": "Node 9065f9f3-583a-42a8-86e7-c4085afeb71f reached failure state \"deploy failed\"; the last error is Timeout reached while waiting for callback for node 9065f9f3-583a-42a8-86e7-c4085afeb71f"}
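  • The following messages are logged in the ironic-conductor.log file under /var/log/containers/ironic/ on the undercloud node. The raw bytes 0x00 0x08 0x05 0x80 0x04 0x00 0x00 0x00 are the IPMI command that sets the boot device for a legacy BIOS system: in the boot flags, 0x80 leaves the EFI boot type bit unset, and 0x04 selects PXE.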
2022-10-21 12:14:53.602 2 DEBUG ironic.common.utils [ ... ] Execution completed, command line is "ipmitool -I lanplus -H 192.xx.xx.xx -L ADMINISTRATOR -U admin -R 1 -N 5 -f /tmp/tmp8pijkvhl raw 0x00 0x08 0x05 0x80 0x04 0x00 0x00 0x00" execute /usr/lib/python3.9/site-packages/ironic/common/utils.py:86
2022-10-21 12:14:53.784 2 DEBUG ironic.common.utils [ ... ] Execution completed, command line is "ipmitool -I lanplus -H 192.xx.xx.xx -L ADMINISTRATOR -U admin -R 1 -N 5 -f /tmp/tmpcjwvjs67 raw 0x00 0x08 0x05 0x80 0x04 0x00 0x00 0x00" execute /usr/lib/python3.9/site-packages/ironic/common/utils.py:86
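
To check whether a conductor log contains this legacy BIOS command, the log can be searched for the raw byte sequence. The following is a minimal sketch. The function name count_legacy_boot_cmds is hypothetical, and the log path in the usage comment is assembled from the directory and file name given above.

```shell
# Hypothetical helper: count log lines that contain the legacy BIOS
# boot-device command shown in the messages above.
count_legacy_boot_cmds() {
    # -F matches the pattern as a fixed string, -c prints the number of matching lines.
    # "|| true" keeps the exit status zero when grep finds no matches.
    grep -F -c -- 'raw 0x00 0x08 0x05 0x80 0x04 0x00 0x00 0x00' "$1" || true
}

# Usage on the undercloud node:
#   count_legacy_boot_cmds /var/log/containers/ironic/ironic-conductor.log
```

A nonzero count on a UEFI node that has no boot_mode:uefi capability is consistent with the root cause described in this article.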

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
