VM unable to boot up due to database issue
Issue
- We cannot boot CSM on RHOSP 16.
- There appears to be a database inconsistency.
- After a compute node reboot, a VM failed to boot. We removed the VM, but
  openstack hypervisor show overcloud-compute-0.localdomain
  still shows 3 VMs, although only 2 actually exist. We cannot boot a third VM because of the resulting resource shortage.
- What we did:
1) Shut down the VMs on overcloud-compute-0.
2) Performed a planned reboot of overcloud-compute-0.
3) After the reboot, one of the VMs failed to boot. We tried to recreate the VM, but that also failed with 'no valid host was found'.
4) Found that the hypervisor stats show incorrect values; a ghost VM appears to be occupying resources.
- Instance launch failed because CPUs could not be allocated.
/var/log/containers/nova/nova-compute.log
loops over the following error:
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager Traceback (most recent call last):
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 8778, in _update_available_resource_for_node
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager startup=startup)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 927, in update_available_resource
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager self._update_available_resource(context, resources, startup=startup)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 328, in inner
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager return f(*args, **kwargs)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 969, in _update_available_resource
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager context, instances, nodename)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1546, in _update_usage_from_instances
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager self._update_usage_from_instance(context, instance, nodename)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1510, in _update_usage_from_instance
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager nodename, sign=sign)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1319, in _update_usage
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager host_numa_topology, instance_numa_topology, free)._to_json()
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/virt/hardware.py", line 2248, in numa_usage_from_instance_numa
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager new_cell.pin_cpus(pinned_cpus)
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/objects/numa.py", line 87, in pin_cpus
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager available=list(self.pcpuset))
2024-02-16 05:37:13.206 8 ERROR nova.compute.manager nova.exception.CPUPinningUnknown: CPU set to pin [64, 65, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 55, 56, 57, 58, 59, 60, 61, 63] must be a subset of known CPU set [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87]
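The two CPU lists in the CPUPinningUnknown message can be compared mechanically to see how far they diverge. The following is a minimal sketch, not part of any Nova tooling: the function name parse_cpu_sets and the MESSAGE constant are hypothetical, and the message text is copied from the log line above. It only reports which requested CPUs fall outside the known set; it does not explain why the two sets differ.

```python
import re

# Message text copied from the CPUPinningUnknown line in nova-compute.log.
MESSAGE = (
    "CPU set to pin [64, 65, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, "
    "55, 56, 57, 58, 59, 60, 61, 63] must be a subset of known CPU set "
    "[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, "
    "40, 41, 42, 43, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, "
    "80, 81, 82, 83, 84, 85, 86, 87]"
)


def parse_cpu_sets(message):
    """Return (requested, known) CPU ID sets from the exception message.

    Hypothetical helper: it assumes the message contains exactly two
    bracketed, comma-separated integer lists, in that order.
    """
    lists = re.findall(r"\[([0-9,\s]+)\]", message)
    if len(lists) != 2:
        raise ValueError("expected exactly two bracketed CPU lists")
    requested, known = (
        {int(item) for item in body.split(",") if item.strip()}
        for body in lists
    )
    return requested, known


requested, known = parse_cpu_sets(MESSAGE)

# CPUs the instance is recorded as pinned to but that are absent
# from the CPU set the compute node currently reports.
unknown = sorted(requested - known)

print("requested CPUs:", len(requested))
print("known CPUs:    ", len(known))
print("not in known:  ", unknown)
print("overlap:       ", sorted(requested & known))
```

For the message above, the two sets do not overlap at all: every CPU in the set to pin is missing from the known CPU set.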
Environment
- Red Hat OpenStack Platform 16.1 (RHOSP)