CPU pinning overlap after live or offline migration in Red Hat OpenStack Platform 13
Issue
- Multiple instances on the same host are pinned to the same pCPUs even though hw:cpu_policy='dedicated' is set, and CPU steal time occurs as a result. In the output below, for example, instances 97, 99, and 105 are all pinned to pCPUs 9, 12, 26, and 35.
[root@overcloud-compute-0 ~]# ./get_instance_cpu_placement.sh
Instance 15 CPU placement:
8,59,68,78,0
<vcpupin vcpu='0' cpuset='8'/>, <vcpupin vcpu='1' cpuset='59'/>, <vcpupin vcpu='2' cpuset='68'/>, <vcpupin vcpu='3' cpuset='78'/>, <emulatorpin cpuset='0'/>
Instance 70 CPU placement:
44,54,29,76,11
<vcpupin vcpu='0' cpuset='44'/>, <vcpupin vcpu='1' cpuset='54'/>, <vcpupin vcpu='2' cpuset='29'/>, <vcpupin vcpu='3' cpuset='76'/>, <emulatorpin cpuset='11'/>
Instance 71 CPU placement:
6,17,24,39,14
<vcpupin vcpu='0' cpuset='6'/>, <vcpupin vcpu='1' cpuset='17'/>, <vcpupin vcpu='2' cpuset='24'/>, <vcpupin vcpu='3' cpuset='39'/>, <emulatorpin cpuset='14'/>
Instance 73 CPU placement:
2,13,67,32,18
<vcpupin vcpu='0' cpuset='2'/>, <vcpupin vcpu='1' cpuset='13'/>, <vcpupin vcpu='2' cpuset='67'/>, <vcpupin vcpu='3' cpuset='32'/>, <emulatorpin cpuset='18'/>
Instance 86 CPU placement:
1,41,46,6,2,42,47,7,55,15,17,57,13,53,16,56,19
<vcpupin vcpu='0' cpuset='1'/>, <vcpupin vcpu='1' cpuset='41'/>, <vcpupin vcpu='2' cpuset='46'/>, <vcpupin vcpu='3' cpuset='6'/>, <vcpupin vcpu='4' cpuset='2'/>, <vcpupin vcpu='5' cpuset='42'/>, <vcpupin vcpu='6' cpuset='47'/>, <vcpupin vcpu='7' cpuset='7'/>, <vcpupin vcpu='8' cpuset='55'/>, <vcpupin vcpu='9' cpuset='15'/>, <vcpupin vcpu='10' cpuset='17'/>, <vcpupin vcpu='11' cpuset='57'/>, <vcpupin vcpu='12' cpuset='13'/>, <vcpupin vcpu='13' cpuset='53'/>, <vcpupin vcpu='14' cpuset='16'/>, <vcpupin vcpu='15' cpuset='56'/>, <emulatorpin cpuset='19'/>
Instance 97 CPU placement:
9,12,26,35,20
<vcpupin vcpu='0' cpuset='9'/>, <vcpupin vcpu='1' cpuset='12'/>, <vcpupin vcpu='2' cpuset='26'/>, <vcpupin vcpu='3' cpuset='35'/>, <emulatorpin cpuset='20'/>
Instance 99 CPU placement:
9,12,26,35,21
<vcpupin vcpu='0' cpuset='9'/>, <vcpupin vcpu='1' cpuset='12'/>, <vcpupin vcpu='2' cpuset='26'/>, <vcpupin vcpu='3' cpuset='35'/>, <emulatorpin cpuset='21'/>
Instance 105 CPU placement:
9,12,26,35,22
<vcpupin vcpu='0' cpuset='9'/>, <vcpupin vcpu='1' cpuset='12'/>, <vcpupin vcpu='2' cpuset='26'/>, <vcpupin vcpu='3' cpuset='35'/>, <emulatorpin cpuset='22'/>
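The overlap in the output above can also be found programmatically. The following is a minimal sketch, not part of the original script: the function name `find_shared_pcpus` and the input layout (a dict keyed by instance number) are assumptions made for illustration, and it only handles `cpuset` values that contain a single pCPU number, as in the output shown.

```python
import re
from collections import defaultdict

# Matches single-CPU pins such as <vcpupin vcpu='0' cpuset='9'/>.
# Assumption: each cpuset holds exactly one pCPU number, as in the output
# above; ranges like '0-3' are not handled. <emulatorpin> is ignored.
VCPUPIN_RE = re.compile(r"<vcpupin vcpu='(\d+)' cpuset='(\d+)'/>")


def find_shared_pcpus(placements):
    """Return {pcpu: [instances]} for each pCPU pinned by more than one instance.

    placements maps an instance number to the string of <vcpupin> elements
    reported for that instance (hypothetical input layout).
    """
    owners = defaultdict(set)
    for instance, xml in placements.items():
        for _vcpu, pcpu in VCPUPIN_RE.findall(xml):
            owners[int(pcpu)].add(instance)
    return {
        pcpu: sorted(instances)
        for pcpu, instances in sorted(owners.items())
        if len(instances) > 1
    }


if __name__ == "__main__":
    # Pinning strings copied from the output for instances 15, 97, and 99.
    placements = {
        15: "<vcpupin vcpu='0' cpuset='8'/>, <vcpupin vcpu='1' cpuset='59'/>, "
            "<vcpupin vcpu='2' cpuset='68'/>, <vcpupin vcpu='3' cpuset='78'/>, "
            "<emulatorpin cpuset='0'/>",
        97: "<vcpupin vcpu='0' cpuset='9'/>, <vcpupin vcpu='1' cpuset='12'/>, "
            "<vcpupin vcpu='2' cpuset='26'/>, <vcpupin vcpu='3' cpuset='35'/>, "
            "<emulatorpin cpuset='20'/>",
        99: "<vcpupin vcpu='0' cpuset='9'/>, <vcpupin vcpu='1' cpuset='12'/>, "
            "<vcpupin vcpu='2' cpuset='26'/>, <vcpupin vcpu='3' cpuset='35'/>, "
            "<emulatorpin cpuset='21'/>",
    }
    for pcpu, instances in find_shared_pcpus(placements).items():
        print(f"pCPU {pcpu} is pinned by instances {instances}")
```

Any pCPU that appears in the result is pinned by more than one instance, which should not happen when every instance on the host uses a dedicated CPU policy.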
- Here are the flavors being used:
(overcloud) [stack@undercloud ~]$ openstack flavor show flavor_one
+----------------------------+---------------------------------------------------------------------------------------+
| Field | Value |
+----------------------------+---------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| disk | 80 |
| id | 96c2631c-f54c-4018-9d93-4eff9bd2c232 |
| name | flavor_one |
| os-flavor-access:is_public | True |
| properties | hw:cpu_policy='dedicated', hw:numa_nodes='4', hw:pci_numa_affinity_policy='preferred' |
| ram | 8192 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 4 |
+----------------------------+---------------------------------------------------------------------------------------+
(overcloud) [stack@undercloud ~]$ openstack flavor show flavor_two
+----------------------------+---------------------------------------------------------------------------------------+
| Field | Value |
+----------------------------+---------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| disk | 80 |
| id | 18a35106-500d-4939-b0c3-5c922eb40234 |
| name | flavor_two |
| os-flavor-access:is_public | True |
| properties | hw:cpu_policy='dedicated', hw:numa_nodes='4', hw:pci_numa_affinity_policy='preferred' |
| ram | 16384 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 4 |
+----------------------------+---------------------------------------------------------------------------------------+
- /var/log/containers/nova/nova-compute.log contains errors similar to the following:
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager [req-c7bef2a3-8f36-4a0b-a922-5e17428459bb - - - - -] Error updating resources for node overcloud-compute-0: CPUPinningInvalid: CPU set to pin [6] must be a subset of free CPU set [3, 4, 5, 9, 43, 45, 48, 49]
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager Traceback (most recent call last):
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7426, in update_available_resource_for_node
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager rt.update_available_resource(context, nodename)
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 689, in update_available_resource
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager self._update_available_resource(context, resources)
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager return f(*args, **kwargs)
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 730, in _update_available_resource
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager self._update_usage_from_instances(context, instances, nodename)
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 1233, in _update_usage_from_instances
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager require_allocation_refresh=require_allocation_refresh)
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 1126, in _update_usage_from_instance
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager sign=sign)
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 937, in _update_usage
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager cn, usage, free)
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 1868, in get_host_numa_usage_from_instance
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager host_numa_topology, instance_numa_topology, free=free))
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 1724, in numa_usage_from_instances
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager newcell.pin_cpus(pinned_cpus)
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/objects/numa.py", line 91, in pin_cpus
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager self.pinned_cpus))
2019-10-15 00:00:41.720 1 ERROR nova.compute.manager CPUPinningInvalid: CPU set to pin [6] must be a subset of free CPU set [3, 4, 5, 9, 43, 45, 48, 49]
Environment
- Red Hat OpenStack Platform 13.0 (RHOSP)