When the custom roles file is missed in the update-plan-only step, the minor update fails and returns the nodes to the pool

Solution Verified - Updated -

Issue

We deployed the Overcloud normally on 3 Controller Nodes and 6 Compute Nodes, with the Compute Nodes split across 2 different Composable Roles.
After the initial deployment, we tried to update the overcloud with the openstack overcloud update command, as described in the product documentation.

The update completed on the first 2 Controller Nodes and then failed after the breakpoint on the 3rd was cleared:

[stack]$ ./overcloud-update.sh 
Started Mistral Workflow tripleo.package_update.v1.package_update_plan. Execution ID: 17716474-2106-4469-8102-a45c929961a5
Waiting for messages on queue 'ac38ee4a-d63a-4d34-ae10-0c77147056e4' with no timeout.
WAITING
not_started: [u'overcloud-computelenovo-4', u'overcloud-computelenovo-2', u'overcloud-computelenovo-3', u'overcloud-computelenovo-1', u'overcloud-computelenovo-0', u'overcloud-computehp_v1-0']
on_breakpoint: [u'overcloud-controller-1', u'overcloud-controller-2', u'overcloud-controller-0']
Breakpoint reached, continue? Regexp or Enter=proceed (will clear 6ce54680-07d3-4910-8745-921adc280095), C-c=quit interactive mode: 
WAITING
not_started: [u'overcloud-computelenovo-4', u'overcloud-computelenovo-2', u'overcloud-computelenovo-3', u'overcloud-computelenovo-1', u'overcloud-computelenovo-0', u'overcloud-computehp_v1-0']
completed: [u'overcloud-controller-0']
on_breakpoint: [u'overcloud-controller-1', u'overcloud-controller-2']
Breakpoint reached, continue? Regexp or Enter=proceed (will clear 46766fb0-7c53-45ec-8f66-2ccdf3faf42d), C-c=quit interactive mode: 
WAITING
not_started: [u'overcloud-computelenovo-4', u'overcloud-computelenovo-2', u'overcloud-computelenovo-3', u'overcloud-computelenovo-1', u'overcloud-computelenovo-0', u'overcloud-computehp_v1-0']
completed: [u'overcloud-controller-2', u'overcloud-controller-0']
on_breakpoint: [u'overcloud-controller-1']
Breakpoint reached, continue? Regexp or Enter=proceed (will clear 0923044e-a1be-48e5-9b5b-f42eacf34b74), C-c=quit interactive mode: 
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
ERROR: The specified reference "ComputeLenovoDeployment_Step5" (in ComputeExtraConfigPost) is incorrect.

After that, all Compute Nodes were immediately removed from the stack and appeared as available in "openstack baremetal node list". The complete stack then had to be redeployed.
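
The title attributes this failure to the custom roles file being left out of the update-plan-only step. As a minimal sketch of a guard against that, the Python snippet below checks whether a plan-update command line passes a roles file. It assumes the plan is refreshed with openstack overcloud deploy --update-plan-only and that the roles file is supplied with -r or --roles-file. The helper name passes_roles_file and all file paths are hypothetical placeholders, not taken from the affected environment.

```python
import shlex


def passes_roles_file(command: str) -> bool:
    """Return True if the command line supplies -r/--roles-file with a value."""
    args = shlex.split(command)
    for i, arg in enumerate(args):
        if arg in ("-r", "--roles-file"):
            # The flag must be followed by a value that is not another option.
            return i + 1 < len(args) and not args[i + 1].startswith("-")
        if arg.startswith("--roles-file="):
            # The --roles-file=PATH form must carry a non-empty path.
            return len(arg) > len("--roles-file=")
    return False


# Hypothetical plan-update commands; the paths are placeholders.
without_roles = (
    "openstack overcloud deploy --update-plan-only --templates "
    "-e /home/stack/templates/network.yaml"
)
with_roles = without_roles + " -r /home/stack/templates/roles_data.yaml"

for cmd in (without_roles, with_roles):
    status = "roles file passed" if passes_roles_file(cmd) else "roles file MISSING"
    print(status)
```

A wrapper script such as the overcloud-update.sh shown above could run a check of this kind before starting the update and stop if the roles file is missing. Whether this matches the verified resolution of this article is not confirmed here.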

Environment

  • Red Hat OpenStack Platform 11.0
