Instances fail to start when the SR-IOV card is on a different NUMA node

Solution In Progress - Updated -

Issue

  • Instances are pinned to CPUs on NUMA node 1 and use an SR-IOV card as a vNIC, but the card is attached to NUMA node 0. The instances fail to start.
# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
node 0 size: 32722 MB
node 0 free: 18606 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
node 1 size: 32768 MB
node 1 free: 30436 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10 

[root@dell-per720-06 ~(keystone_admin)]# cat /sys/class/net/p6p1/device/numa_node
0

[root@dell-per720-06 ~(keystone_admin)]# cat /sys/class/net/p6p2/device/numa_node
0

[root@dell-per720-06 ~(keystone_admin)]# egrep -i vcpu_pin /etc/nova/nova.conf  |egrep -v '#'
vcpu_pin_set=1,3,5,7,9,11,13,15,17,19,21,23

[root@dell-per720-06 ~(keystone_admin)]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-327.22.2.el7.x86_64 root=/dev/mapper/rhel_dell--per720--06-root ro crashkernel=auto rd.lvm.lv=rhel_dell-per720-06/root rd.lvm.lv=rhel_dell-per720-06/swap console=ttyS0,115200n81 LANG=en_US.UTF-8 intel_iommu=on isolcpus=1,3,5,7,9,11,13,15,17,19,21,23
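
The outputs above can be cross-checked mechanically. The following is a minimal sketch, not part of Nova or any Red Hat tooling: it takes the `numactl -H` CPU lines, the `vcpu_pin_set` value, and the NIC's `numa_node` value shown above, and reports whether the NIC is local to the NUMA node(s) that hold the pinned CPUs. The helper names (`parse_cpu_list`, `parse_numactl`, `pinned_nodes`) are hypothetical, and the parser handles only plain lists and ranges, not the `^` exclusion syntax that `vcpu_pin_set` also accepts.

```python
import re


def parse_cpu_list(spec):
    """Expand a CPU list such as '1,3,5' or '0-3,8' into a set of ints.

    Assumption: no '^' exclusions are present in the list.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_numactl(output):
    """Map NUMA node id -> set of CPU ids from `numactl -H` output."""
    nodes = {}
    for line in output.splitlines():
        match = re.match(r"\s*node (\d+) cpus:(.*)", line)
        if match:
            nodes[int(match.group(1))] = {int(c) for c in match.group(2).split()}
    return nodes


def pinned_nodes(nodes, pin_set):
    """Return the NUMA nodes that contain at least one pinned CPU."""
    return {node for node, cpus in nodes.items() if cpus & pin_set}


# Values copied from the command output in this article.
NUMACTL = """\
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
"""
nodes = parse_numactl(NUMACTL)
pin = parse_cpu_list("1,3,5,7,9,11,13,15,17,19,21,23")
nic_node = 0  # contents of /sys/class/net/p6p1/device/numa_node

guest_nodes = pinned_nodes(nodes, pin)
print("pinned CPUs live on NUMA node(s):", sorted(guest_nodes))
print("NIC is local to the pinned CPUs:", nic_node in guest_nodes)
```

With the values from this host, the pinned CPUs all sit on node 1 while both p6p1 and p6p2 report node 0, so the check reports that the NIC is not local to the pinned CPUs.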

###### With CPU pinning, with sriov port

[root@dell-per720-06 ~(keystone_admin)]# nova boot --flavor 7 --image d6ef04ca-d6a4-4b4c-bb9c-34982a433f0d   --nic port-id=6461de7b-4b90-433b-a91c-f53d4250643e  pbandark_sriov_port_with_cpu_pin_sriov

| 4c1c657c-57b5-45fb-8beb-d25424d6d88b | pbandark_sriov_port_with_cpu_pin_sriov    | ERROR   | -          | NOSTATE     |                 |


I don't see any logs from nova-scheduler, but nova-conductor logs the following:

Failed to compute_task_build_instances: No valid host was found. There are not enough hosts available.
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 86, in select_destinations
    filter_properties)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 80, in select_destinations
    raise exception.NoValidHost(reason=reason)

NoValidHost: No valid host was found. There are not enough hosts available.

Environment

  • Red Hat OpenStack Platform
