Deployments failing while updating nova ownership

Solution In Progress - Updated -

Issue

  • During an overcloud deployment, the deployment frequently fails with the error No such file or directory:
openstack-overcloud.AllNodesDeploySteps.Compute03Deployment_Step3.20:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: f8cb2e2c-54e5-4d9a-ad75-57cc0a9247f3
  status: UPDATE_FAILED
  status_reason: |
    Error: resources[20]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
            "  File \"/docker-config-scripts/nova_statedir_ownership.py\", line 126, in _walk",
            "    for f in os.listdir(top):",
            "OSError: [Errno 2] No such file or directory: '/var/lib/nova/instances/b8d64704-d120-4f92-99b3-7b954056d8d2'"
        ]
    }
        to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/0dba793d-7677-4127-9205-8a8b92081039_playbook.retry

    PLAY RECAP *********************************************************************
    localhost                  : ok=5    changed=2    unreachable=0    failed=1
  • This appears to be a race condition between the point at which nova_statedir_ownership.py gathers the list of instances and the point at which it changes their ownership. We expect the deployments to complete. Is there a way to skip this script, or to fix it so that it does not fail when a file is not found? These are very active fabrics, and the error occurs frequently during overcloud deployments.
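
The race described above can be illustrated with a minimal Python sketch of a directory walk that tolerates entries disappearing between discovery and processing. This is not the shipped nova_statedir_ownership.py and not a confirmed fix; the function names walk_tolerant and chown_tolerant are hypothetical and chosen only for illustration. The idea is to treat ENOENT as "already gone" rather than as a failure, and to re-raise every other error.

```python
import errno
import os


def walk_tolerant(top):
    """Yield every path under top, skipping directories that vanish mid-walk.

    Hypothetical helper: illustrates the technique only.
    """
    try:
        entries = os.listdir(top)
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            # The directory was removed after it was discovered
            # (for example, an instance directory that no longer exists).
            return
        raise
    for name in entries:
        path = os.path.join(top, name)
        yield path
        # Recurse into real directories only; do not follow symlinks.
        if os.path.isdir(path) and not os.path.islink(path):
            for sub in walk_tolerant(path):
                yield sub


def chown_tolerant(path, uid, gid):
    """Change ownership of path; return False if it disappeared first."""
    try:
        os.lchown(path, uid, gid)
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            return False
        raise
    return True
```

With this approach, a path such as the instance directory in the traceback would simply be skipped if it no longer exists when os.listdir is called, while permission errors and other failures would still stop the run.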

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
