VM scheduling failure - NUMAAffinityFilter

Issue

The overcloud was redeployed successfully. In our "performance" test scenario (SR-IOV, CPU pinning, NUMA affinity), everything works until roughly 70% of the physical CPU cores (pCores) are allocated. Above that level, VM scheduling starts to fail in the NUMATopologyFilter, which reports that no more resources are available on the host. This is not the case: free physical resources are still available.

(overcloud) [stack@undercloud ~]$ openstack flavor show be6b1e46-ceb8-450d-be61-66b937ec2c2c
+----------------------------+-----------------------------------------------------------------------------------------------------------------------------+
| Field                      | Value                                                                                                                       |
+----------------------------+-----------------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                                       |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                           |
| access_project_ids         | None                                                                                                                        |
| disk                       | 40                                                                                                                          |
| id                         | be6b1e46-ceb8-450d-be61-66b937ec2c2c                                                                                        |
| name                       | m1.medium.corepin.numapin                                                                                                   |
| os-flavor-access:is_public | True                                                                                                                        |
| properties                 | hw:cpu_policy='dedicated', hw:emulator_threads_policy='isolate', hw:numa_nodes='4', hw:pci_numa_affinity_policy='preferred' |
| ram                        | 4096                                                                                                                        |
| rxtx_factor                | 1.0                                                                                                                         |
| swap                       |                                                                                                                             |
| vcpus                      | 2                                                                                                                           |
+----------------------------+-----------------------------------------------------------------------------------------------------------------------------+
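As a rough aid for reading the flavor above, the sketch below estimates how many host CPUs a single instance reserves. It relies on the documented Nova behavior that hw:emulator_threads_policy='isolate' reserves one additional dedicated host CPU per instance on top of its pinned vCPUs. The helper names parse_flavor_properties and pinned_cpus_per_instance are hypothetical and not part of any OpenStack API, and the result is an accounting check worth making, not a confirmed diagnosis of this failure.

```python
def parse_flavor_properties(raw):
    """Parse the 'properties' cell of `openstack flavor show` into a dict."""
    props = {}
    for item in raw.split(","):
        key, _, value = item.strip().partition("=")
        props[key] = value.strip("'")
    return props


def pinned_cpus_per_instance(vcpus, props):
    """Estimate the host CPUs reserved exclusively by one instance.

    Assumption: 'isolate' adds exactly one dedicated host CPU for the
    emulator threads, and only when the CPU policy is 'dedicated'.
    """
    if props.get("hw:cpu_policy") != "dedicated":
        return 0  # no exclusive pinning with a shared policy
    extra = 1 if props.get("hw:emulator_threads_policy") == "isolate" else 0
    return vcpus + extra


# Values copied from the flavor output above (vcpus = 2).
raw = ("hw:cpu_policy='dedicated', hw:emulator_threads_policy='isolate', "
       "hw:numa_nodes='4', hw:pci_numa_affinity_policy='preferred'")
props = parse_flavor_properties(raw)
per_instance = pinned_cpus_per_instance(2, props)

# Share of the reserved host CPUs that is visible as guest vCPUs.
vcpu_share = 2 / per_instance
print(per_instance, round(vcpu_share, 2))
```

If allocation is tracked by counting guest vCPUs only, a host whose dedicated CPUs are fully reserved by this flavor would appear only partly allocated, so it may be worth comparing the reserved count per instance against the free pinned CPUs on each host NUMA node.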

VMs now fail to be scheduled even though memory and CPU cores remain available:

| fault                               | {u'message': u'Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance b88d23bd-61bf-4bb4-b201-8e2ae0b139e4. Last exception: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the g', u'code': 500, u'details': u'Traceback (most recent call last):\n  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 604, in build_instances\n    filter_properties, instances[0].uuid)\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/utils.py", line 557, in populate_retry\n    raise exception.MaxRetriesExceeded(reason=msg)\nMaxRetriesExceeded: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance b88d23bd-61bf-4bb4-b201-8e2ae0b139e4. Last exception: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology.\n', u'created': u'2019-12-31T15:48:56Z'} |
| flavor                              | m1.medium.corepin.numapin

Environment

Red Hat OpenStack Platform 16.0 (RHOSP)
Red Hat OpenStack Platform 15.0 (RHOSP)
Red Hat OpenStack Platform 14.0 (RHOSP)
Red Hat OpenStack Platform 13.0 (RHOSP)
