1 x 4 vCPU/24GB RHEL 6.5 guest crashing RHEV hypervisors in series
Issue
A 1-socket, 4-core, 24GB RHEL 6.5 guest has apparently crashed four hypervisors in our cluster. After the first hypervisor crashed, the guest was rescheduled onto another hypervisor, which crashed soon afterwards. The pattern repeated until the problem guest was isolated. Running the 4 vCPU/24GB guest on a RHEL 6.5 hypervisor alongside 9 test guests was sufficient to trigger a crash. The other guests were defined as:
- 2 x [1 socket/2 cores/5GB]
- 7 x [1 socket/2 cores/24GB]
Together with the problem guest, this gives a total of 202GB of guest memory. The hypervisor is a Dell PE-R620 with 2 x E5-2670 CPUs and 384GB of RAM, running RHEL 6.5 with kernel 2.6.32-431.20.3.el6.x86_64. All other hypervisors have the same configuration.
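The 202GB figure can be checked with a short calculation. This is only a sketch of the arithmetic: the tuple layout and variable names are illustrative, and it assumes each core of a guest counts as one vCPU.

```python
# Guest definitions from the report: (count, vCPUs per guest, memory per guest in GB).
guests = [
    (1, 4, 24),  # the problem guest
    (2, 2, 5),   # small test guests
    (7, 2, 24),  # large test guests
]

host_mem_gb = 384  # physical memory of the hypervisor

total_guests = sum(count for count, _, _ in guests)
total_vcpus = sum(count * vcpus for count, vcpus, _ in guests)
total_mem_gb = sum(count * mem for count, _, mem in guests)

print(f"guests: {total_guests}, vCPUs: {total_vcpus}, memory: {total_mem_gb}GB")
print(f"guest memory is {total_mem_gb / host_mem_gb:.0%} of host memory")
```

The total only reaches 202GB when the problem guest is counted along with the 9 test guests, and it stays well below the 384GB installed in the host.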
Each hypervisor crash appears to occur in a different function.
- We have a vmcore with the following backtrace:
#0 [ffff885f83bbb7d0] machine_kexec at ffffffff81038f3b
#1 [ffff885f83bbb830] crash_kexec at ffffffff810c5b62
#2 [ffff885f83bbb900] oops_end at ffffffff8152c430
#3 [ffff885f83bbb930] die at ffffffff81010e0b
#4 [ffff885f83bbb960] do_trap at ffffffff8152bc94
#5 [ffff885f83bbb9c0] do_invalid_op at ffffffff8100cf95
#6 [ffff885f83bbba60] invalid_op at ffffffff8100bf9b
[exception RIP: rmap_remove+432]
RIP: ffffffffa02f17f0 RSP: ffff885f83bbbb18 RFLAGS: 00010292
RAX: 0000000000000034 RBX: ffff885e064fe030 RCX: 0000000000002b59
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff885f83bbbb38 R8: ffffffff81c068c0 R9: 0000000000000000
R10: 0000000000000007 R11: 000000000000000a R12: ffff8830108d8000
R13: 00000000000a1cb0 R14: ffff885fccd3eef8 R15: 0000000000000006
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff885f83bbbb10] rmap_remove at ffffffffa02f17f0 [kvm]
#8 [ffff885f83bbbb40] kvm_mmu_zap_page at ffffffffa02f1f14 [kvm]
#9 [ffff885f83bbbb80] kvm_mmu_zap_all at ffffffffa02f2428 [kvm]
#10 [ffff885f83bbbbb0] kvm_arch_flush_shadow at ffffffffa02e1c26 [kvm]
#11 [ffff885f83bbbbd0] kvm_mmu_notifier_release at ffffffffa02d6388 [kvm]
#12 [ffff885f83bbbc00] __mmu_notifier_release at ffffffff8116aa6d
#13 [ffff885f83bbbc30] exit_mmap at ffffffff8114e315
#14 [ffff885f83bbbc80] mmput at ffffffff8106ef3c
#15 [ffff885f83bbbca0] exit_mm at ffffffff8107687b
#16 [ffff885f83bbbce0] do_exit at ffffffff81076c2f
#17 [ffff885f83bbbd60] do_group_exit at ffffffff81077398
#18 [ffff885f83bbbd90] get_signal_to_deliver at ffffffff8108cd46
#19 [ffff885f83bbbe30] do_signal at ffffffff8100a265
#20 [ffff885f83bbbf30] do_notify_resume at ffffffff8100aa80
#21 [ffff885f83bbbf50] int_signal at ffffffff8100b341
RIP: 00007f18e321298e RSP: 00007f18d2762b50 RFLAGS: 00000206
RAX: fffffffffffffdfc RBX: 0000000000000000 RCX: ffffffffffffffff
RDX: 00000000000029ee RSI: 0000000000000189 RDI: 00007f18e3e0c624
RBP: 00007f18d2762be0 R8: 00007f18e3e0c5e0 R9: 00000000ffffffff
R10: 00007f18d2762be0 R11: 0000000000000206 R12: 00000000000029ee
R13: 00007f18d2762be0 R14: ffffffffffffff92 R15: 0000000000000000
ORIG_RAX: 00000000000000ca CS: 0033 SS: 002b
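Because each crash appears to land in a different function, it may help to reduce every backtrace to its faulting function and its module frames before comparing vmcores. The following is a minimal sketch that assumes the text layout of the trace above; the function name `summarize_bt` and both regular expressions are hypothetical helpers, not part of any crash tooling.

```python
import re

# Matches frame lines such as:
#   #7 [ffff885f83bbbb10] rmap_remove at ffffffffa02f17f0 [kvm]
FRAME_RE = re.compile(
    r"^#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+(?:\s+\[(\w+)\])?"
)
# Matches the exception marker: [exception RIP: rmap_remove+432]
EXC_RE = re.compile(r"\[exception RIP: (\S+?)\+(\d+)\]")


def summarize_bt(text):
    """Return ((function, offset) or None, [(frame_no, function, module)])."""
    exception = None
    frames = []
    for line in text.splitlines():
        line = line.strip()
        frame = FRAME_RE.match(line)
        if frame:
            frames.append((int(frame.group(1)), frame.group(2), frame.group(3)))
            continue
        exc = EXC_RE.search(line)
        if exc:
            exception = (exc.group(1), int(exc.group(2)))
    return exception, frames


# A few lines copied from the trace above.
sample = """\
#6 [ffff885f83bbba60] invalid_op at ffffffff8100bf9b
[exception RIP: rmap_remove+432]
#7 [ffff885f83bbbb10] rmap_remove at ffffffffa02f17f0 [kvm]
#8 [ffff885f83bbbb40] kvm_mmu_zap_page at ffffffffa02f1f14 [kvm]
"""

exception, frames = summarize_bt(sample)
kvm_frames = [name for _, name, module in frames if module == "kvm"]
print(exception)
print(kvm_frames)
```

Running the same summary over the backtrace from each crashed hypervisor would show at a glance whether the faulting functions share a module or call path.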
Environment
- Red Hat Enterprise Virtualization (RHEV) 3.3
- RHEL 6.5, kernel 2.6.32-431.20.3.el6.x86_64