Instance with high CPU Steal due to most instances being scheduled on the same pCPU in Red Hat OpenStack Platform
Issue
Very high CPU steal on several instances.
Symptoms:
- instances show more than 50% CPU steal regularly.
- most instances run their vCPUs on pCPU 0 of the hypervisors (run the following command on every hypervisor to list the pCPUs with the most vCPUs scheduled on them at a given moment):
virsh list | awk '{print $2}' | xargs -I {} virsh vcpuinfo {} | egrep '^CPU\:' | awk '{print $NF}' | sort | uniq -c | sort -nr
Sample output from several hypervisors; each line is a count of vCPUs followed by the pCPU number they were running on:
nova4 | SUCCESS | rc=0 >>
39 0
9 8
9 6
(...)
nova7 | SUCCESS | rc=0 >>
55 0
4 21
4 13
3 22
3 18
(...)
nova8 | SUCCESS | rc=0 >>
44 0
5 9
4 5
4 4
(...)
nova10 | SUCCESS | rc=0 >>
43 0
7 9
6 8
6 7
6 5
(...)
nova14 | SUCCESS | rc=0 >>
21 0
3 21
2 9
2 7
(...)
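The steal symptom can be checked from inside an instance by comparing two samples of the aggregate "cpu" line in /proc/stat. The following is a minimal sketch, not part of the original report: the function name steal_pct is hypothetical, and it assumes the standard /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# Compute the steal percentage between two samples of the aggregate "cpu"
# line from /proc/stat. Fields after the "cpu" label are, in order:
# user nice system idle iowait irq softirq steal guest guest_nice.
# steal_pct is a hypothetical helper name used for illustration only.
steal_pct() {
    # $1 = first sample line, $2 = second sample line
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Sum user..steal (fields 2-9); guest time is already part of user/nice.
            for (i = 2; i <= 9; i++) total += $i
            t[NR] = total
            s[NR] = $9
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (s[2] - s[1]) / dt
        }'
}

# Example: sample /proc/stat twice, five seconds apart, inside the instance.
# a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
# steal_pct "$a" "$b"
```

A value that stays above 50 across repeated samples would match the symptom described above.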
Other details about the environment:
- isolcpus is configured in grub for pCPUs 0 to 3
- most vCPUs appear to be scheduled on CPU 0 of the hypervisors
- these hypervisors were configured with the isolcpus kernel command line parameter.
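To confirm which CPUs a hypervisor was actually booted with as isolated, the isolcpus value can be read from the running kernel's command line. This is a small sketch under the assumption that the parameter appears as a single isolcpus=... token; the function name isolcpus_from_cmdline is hypothetical.

```shell
# Print the value of the isolcpus= parameter from a kernel command line,
# or nothing if the parameter is absent.
# isolcpus_from_cmdline is a hypothetical helper name used for illustration.
isolcpus_from_cmdline() {
    # $1 = kernel command line string (for example the content of /proc/cmdline)
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^isolcpus=//p'
}

# Example, on a hypervisor:
# isolcpus_from_cmdline "$(cat /proc/cmdline)"
```

An empty result would mean the running kernel was not booted with isolcpus, regardless of what the grub configuration file contains.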
Theory:
- a scheduler bug (possibly triggered by isolcpus) places most vCPUs on CPU 0, creating heavy contention for that CPU and high steal values inside the VMs
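One way to gather evidence for or against this theory is to compare, for each vCPU, the pCPU it is currently running on with the number of pCPUs its affinity mask allows. The sketch below is an illustration only: it assumes the usual `virsh vcpuinfo` layout (VCPU:, CPU:, and CPU Affinity: lines, with one y per allowed pCPU in the mask), and the function name vcpu_placement is hypothetical.

```shell
# Summarise `virsh vcpuinfo` output read from stdin: for each vCPU print
# "<vcpu> <current pCPU> <number of pCPUs allowed by the affinity mask>".
# vcpu_placement is a hypothetical helper name used for illustration.
vcpu_placement() {
    awk '
        /^VCPU:/         { vcpu = $2 }
        /^CPU:/          { cpu = $2 }
        # gsub returns the number of "y" characters, i.e. allowed pCPUs.
        /^CPU Affinity:/ { allowed = gsub(/y/, "y", $3); print vcpu, cpu, allowed }
    '
}

# Example, on a hypervisor ("instance-00000001" is a placeholder domain name):
# virsh vcpuinfo instance-00000001 | vcpu_placement
```

A vCPU whose mask allows several pCPUs yet repeatedly reports pCPU 0 would be consistent with the theory; a mask that allows only pCPU 0 would instead point to pinning configuration.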
Environment
Red Hat Enterprise Linux OpenStack Platform 7.0