Changed NUMA topology on a compute node, but OpenStack does not see the update
Issue
- We onboarded a compute node into our cloud with a different NUMA topology. The HPE "sub-NUMA clustering" BIOS setting was enabled on it, although we usually disable it, so the node had 4 NUMA nodes.
- We disabled the setting in the BIOS and rebooted. Although lscpu now shows only 2 NUMA nodes, the change does not appear to have been applied in OpenStack:
- A bad node reports the following PCI devices:
2020-08-19 06:37:11.698 1 INFO nova.compute.resource_tracker [req-1389df4c-80e3-4c52-b4f2-2338c8e83aa3 - - - - -] Final resource view: name=openstack-compute-0.localdomain phys_ram=589352MB used_ram=32768MB phys_disk=1787GB used_disk=244GB total_vcpus=68 used_vcpus=0 pci_stats=[PciDevicePool(count=1,numa_node=0,product_id='1572',tags={dev_type='type-PF',physical_network='sriov-0-1',trusted='true'},vendor_id='8086'), PciDevicePool(count=1,numa_node=0,product_id='1572',tags={dev_type='type-PF',physical_network='sriov-0-2',trusted='true'},vendor_id='8086'), PciDevicePool(count=20,numa_node=0,product_id='154c',tags={dev_type='type-VF',physical_network='sriov-0-1',trusted='true'},vendor_id='8086'), PciDevicePool(count=20,numa_node=0,product_id='154c',tags={dev_type='type-VF',physical_network='sriov-0-2',trusted='true'},vendor_id='8086'), PciDevicePool(count=1,numa_node=3,product_id='1572',tags={dev_type='type-PF',physical_network='sriov-1-1',trusted='true'},vendor_id='8086'), PciDevicePool(count=1,numa_node=3,product_id='1572',tags={dev_type='type-PF',physical_network='sriov-1-2',trusted='true'},vendor_id='8086'), PciDevicePool(count=20,numa_node=3,product_id='154c',tags={dev_type='type-VF',physical_network='sriov-1-1',trusted='true'},vendor_id='8086'), PciDevicePool(count=20,numa_node=3,product_id='154c',tags={dev_type='type-VF',physical_network='sriov-1-2',trusted='true'},vendor_id='8086')]
- A good node reports the following PCI devices:
2020-08-19 06:39:46.459 1 INFO nova.compute.resource_tracker [req-5c8ddade-2da2-4b60-8009-6b0df0885815 - - - - -] Final resource view: name=openstack-compute-1.localdomain phys_ram=589337MB used_ram=32768MB phys_disk=1787GB used_disk=244GB total_vcpus=68 used_vcpus=0 pci_stats=[PciDevicePool(count=1,numa_node=0,product_id='1572',tags={dev_type='type-PF',physical_network='sriov-0-1',trusted='true'},vendor_id='8086'), PciDevicePool(count=1,numa_node=0,product_id='1572',tags={dev_type='type-PF',physical_network='sriov-0-2',trusted='true'},vendor_id='8086'), PciDevicePool(count=20,numa_node=0,product_id='154c',tags={dev_type='type-VF',physical_network='sriov-0-1',trusted='true'},vendor_id='8086'), PciDevicePool(count=20,numa_node=0,product_id='154c',tags={dev_type='type-VF',physical_network='sriov-0-2',trusted='true'},vendor_id='8086'), PciDevicePool(count=1,numa_node=1,product_id='1572',tags={dev_type='type-PF',physical_network='sriov-1-1',trusted='true'},vendor_id='8086'), PciDevicePool(count=1,numa_node=1,product_id='1572',tags={dev_type='type-PF',physical_network='sriov-1-2',trusted='true'},vendor_id='8086'), PciDevicePool(count=20,numa_node=1,product_id='154c',tags={dev_type='type-VF',physical_network='sriov-1-1',trusted='true'},vendor_id='8086'), PciDevicePool(count=20,numa_node=1,product_id='154c',tags={dev_type='type-VF',physical_network='sriov-1-2',trusted='true'},vendor_id='8086')]
- This prevents tenants from launching VMs on this compute node because the NUMATopologyFilter is ignoring it.
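The difference between the two log lines above is easy to miss by eye: the bad node still reports numa_node=3 for the sriov-1-* devices, while the good node reports numa_node=1. The following is a minimal sketch, not part of the original report, that pulls the numa_node values out of a "Final resource view" line and flags any that cannot exist on a host with a given number of NUMA nodes. The helper names (`network_numa_map`, `stale_nodes`) are hypothetical, the sample strings are shortened excerpts of the log lines above, and the check assumes NUMA node IDs are numbered contiguously from 0.

```python
import re

# Captures numa_node and physical_network from each PciDevicePool(...) entry.
# Assumes the field order seen in the log lines above and a numeric numa_node.
POOL_RE = re.compile(
    r"PciDevicePool\(count=\d+,numa_node=(\d+),.*?physical_network='([^']+)'"
)

def network_numa_map(log_line):
    """Map each physical_network to the sorted NUMA node IDs reported for it."""
    result = {}
    for node, network in POOL_RE.findall(log_line):
        result.setdefault(network, set()).add(int(node))
    return {net: sorted(nodes) for net, nodes in result.items()}

def stale_nodes(log_line, host_numa_count):
    """Return reported NUMA node IDs that the host cannot have.

    Assumption: node IDs run from 0 to host_numa_count - 1.
    """
    reported = set()
    for nodes in network_numa_map(log_line).values():
        reported.update(nodes)
    return sorted(n for n in reported if n >= host_numa_count)

# Shortened excerpts of the pci_stats from the bad and good nodes above.
BAD = (
    "pci_stats=[PciDevicePool(count=1,numa_node=0,product_id='1572',"
    "tags={dev_type='type-PF',physical_network='sriov-0-1',trusted='true'},"
    "vendor_id='8086'), PciDevicePool(count=1,numa_node=3,product_id='1572',"
    "tags={dev_type='type-PF',physical_network='sriov-1-1',trusted='true'},"
    "vendor_id='8086')]"
)
GOOD = BAD.replace("numa_node=3", "numa_node=1")

# lscpu shows 2 NUMA nodes after the BIOS change, so valid IDs are 0 and 1.
print(stale_nodes(BAD, 2))   # [3]
print(stale_nodes(GOOD, 2))  # []
```

Running this against the full log lines from both nodes should show sriov-1-1 and sriov-1-2 mapped to node 3 on the bad node and to node 1 on the good node.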
Environment
- Red Hat OpenStack Platform 13.0 (RHOSP)