VM unable to boot up due to database issue

Solution In Progress - Updated -

Issue

  • We cannot boot CSM on RHOSP 16.

  • There appears to be a database inconsistency.

  • After a compute node reboot, a VM failed to boot. We removed the VM, but openstack hypervisor show overcloud-compute-0.localdomain still reports 3 VMs when only 2 actually exist. We cannot boot a third VM because of the resulting resource shortage.

  • What we did:
    1) Shut down the VMs on overcloud-compute-0.
    2) Performed a planned reboot of overcloud-compute-0.
    3) After the reboot, one of the VMs failed to boot. Recreating the VM also failed with 'no valid host was found'.
    4) Found that the hypervisor stats show incorrect values; a ghost VM appears to be occupying resources.

  • Instance launch failed because CPUs could not be allocated. /var/log/containers/nova/nova-compute.log repeatedly logs the following error:

2024-02-16 05:37:13.206 8 ERROR nova.compute.manager Traceback (most recent call last):
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 8778, in _update_available_resource_for_node
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     startup=startup)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 927, in update_available_resource
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     self._update_available_resource(context, resources, startup=startup)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 328, in inner
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     return f(*args, **kwargs)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 969, in _update_available_resource
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     context, instances, nodename)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1546, in _update_usage_from_instances
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     self._update_usage_from_instance(context, instance, nodename)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1510, in _update_usage_from_instance
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     nodename, sign=sign)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1319, in _update_usage
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     host_numa_topology, instance_numa_topology, free)._to_json()
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/virt/hardware.py", line 2248, in numa_usage_from_instance_numa
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     new_cell.pin_cpus(pinned_cpus)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/objects/numa.py", line 87, in pin_cpus
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager     available=list(self.pcpuset))
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager nova.exception.CPUPinningUnknown: CPU set to pin [64, 65, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 55, 56, 57, 58, 59, 60, 61, 63] must be a subset of known CPU set [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87]
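
The final line of the traceback states the condition that failed: every CPU to be pinned must belong to the set of CPUs the host knows about. As a minimal sketch (not part of Nova; the helper names unknown_pinned_cpus and parse_cpu_list are hypothetical), the following Python snippet extracts both sets from that log line and lists the requested CPUs that fall outside the known set. It only assumes the message format shown in the log above.

```python
import re

# Message format taken from the CPUPinningUnknown line in the log above.
PIN_ERROR = re.compile(
    r"CPU set to pin \[(?P<pin>[\d, ]*)\] must be a subset of "
    r"known CPU set \[(?P<known>[\d, ]*)\]"
)


def parse_cpu_list(text):
    """Turn a string such as '64, 65, 11' into a set of integers."""
    return {int(token) for token in text.split(",") if token.strip()}


def unknown_pinned_cpus(log_line):
    """Return the sorted CPUs requested for pinning that the host does not know.

    Returns None when the line does not contain the expected message.
    """
    match = PIN_ERROR.search(log_line)
    if match is None:
        return None
    requested = parse_cpu_list(match.group("pin"))
    known = parse_cpu_list(match.group("known"))
    return sorted(requested - known)


# The error message exactly as it appears in nova-compute.log above.
line = (
    "nova.exception.CPUPinningUnknown: CPU set to pin "
    "[64, 65, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, "
    "55, 56, 57, 58, 59, 60, 61, 63] must be a subset of known CPU set "
    "[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, "
    "40, 41, 42, 43, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, "
    "80, 81, 82, 83, 84, 85, 86, 87]"
)

missing = unknown_pinned_cpus(line)
print(f"{len(missing)} requested CPUs are outside the known set: {missing}")
```

For the message above, none of the 20 requested CPUs appears in the known set. That shows the pinning recorded for the instance does not match the CPU set the host currently reports, but it does not by itself identify the root cause.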

Environment

  • Red Hat OpenStack Platform 16.1 (RHOSP)
