1x4vCPU/24GB RHEL 6.5 guest crashing hypervisors one after another on RHEV

Solution Unverified - Updated -

Issue

A RHEL 6.5 guest with 1 socket, 4 cores, and 24GB of memory has apparently crashed four hypervisors in our cluster in succession. After the first hypervisor crashed, the guest was rescheduled on another hypervisor, which soon crashed as well. The pattern repeated until the problem guest was isolated. Running the 4vCPU/24GB guest on a RHEL 6.5 hypervisor alongside 9 test guests was sufficient to trigger a crash. The test guests were defined as:

  • 2 x [1 socket/2 cores/5GB]
  • 7 x [1 socket/2 cores/24GB]

Together with the problem guest, this totals 202GB of guest memory. The hypervisor is a Dell PowerEdge R620 with 2 x E5-2670 CPUs and 384GB of RAM, running RHEL 6.5 with kernel 2.6.32-431.20.3.el6.x86_64. All other hypervisors have the same configuration.
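The 202GB figure can be checked from the guest definitions above; it only adds up if the 24GB problem guest is counted together with the 9 test guests. The following is a minimal sketch of that arithmetic. The dictionary labels and variable names are illustrative, not taken from any RHEV tooling.

```python
# Guest definitions from the report: label -> (number of guests, memory per guest in GB).
guests = {
    "problem guest, 1 socket/4 cores": (1, 24),
    "test guests, 1 socket/2 cores/5GB": (2, 5),
    "test guests, 1 socket/2 cores/24GB": (7, 24),
}

# Physical memory of the hypervisor, as stated in the report.
host_gb = 384

# Sum count * memory across every guest definition.
total_guest_gb = sum(count * gb for count, gb in guests.values())

print(f"Total guest memory: {total_guest_gb}GB")
print(f"Host memory not assigned to guests: {host_gb - total_guest_gb}GB")
```

Guest memory stays well below the 384GB installed in the host, so on these numbers the hypervisor was not overcommitted on memory when it crashed.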

Each hypervisor crash appears to be in a different function.

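  • We have a vmcore from one of the crashes. The backtrace shows an invalid opcode trap in rmap_remove [kvm], reached through kvm_mmu_zap_all while the crashing task was exiting: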
 #0 [ffff885f83bbb7d0] machine_kexec at ffffffff81038f3b
 #1 [ffff885f83bbb830] crash_kexec at ffffffff810c5b62
 #2 [ffff885f83bbb900] oops_end at ffffffff8152c430
 #3 [ffff885f83bbb930] die at ffffffff81010e0b
 #4 [ffff885f83bbb960] do_trap at ffffffff8152bc94
 #5 [ffff885f83bbb9c0] do_invalid_op at ffffffff8100cf95
 #6 [ffff885f83bbba60] invalid_op at ffffffff8100bf9b
    [exception RIP: rmap_remove+432]
    RIP: ffffffffa02f17f0  RSP: ffff885f83bbbb18  RFLAGS: 00010292
    RAX: 0000000000000034  RBX: ffff885e064fe030  RCX: 0000000000002b59
    RDX: 0000000000000000  RSI: 0000000000000046  RDI: 0000000000000246
    RBP: ffff885f83bbbb38   R8: ffffffff81c068c0   R9: 0000000000000000
    R10: 0000000000000007  R11: 000000000000000a  R12: ffff8830108d8000
    R13: 00000000000a1cb0  R14: ffff885fccd3eef8  R15: 0000000000000006
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff885f83bbbb10] rmap_remove at ffffffffa02f17f0 [kvm]
 #8 [ffff885f83bbbb40] kvm_mmu_zap_page at ffffffffa02f1f14 [kvm]
 #9 [ffff885f83bbbb80] kvm_mmu_zap_all at ffffffffa02f2428 [kvm]
#10 [ffff885f83bbbbb0] kvm_arch_flush_shadow at ffffffffa02e1c26 [kvm]
#11 [ffff885f83bbbbd0] kvm_mmu_notifier_release at ffffffffa02d6388 [kvm]
#12 [ffff885f83bbbc00] __mmu_notifier_release at ffffffff8116aa6d
#13 [ffff885f83bbbc30] exit_mmap at ffffffff8114e315
#14 [ffff885f83bbbc80] mmput at ffffffff8106ef3c
#15 [ffff885f83bbbca0] exit_mm at ffffffff8107687b
#16 [ffff885f83bbbce0] do_exit at ffffffff81076c2f
#17 [ffff885f83bbbd60] do_group_exit at ffffffff81077398
#18 [ffff885f83bbbd90] get_signal_to_deliver at ffffffff8108cd46
#19 [ffff885f83bbbe30] do_signal at ffffffff8100a265
#20 [ffff885f83bbbf30] do_notify_resume at ffffffff8100aa80
#21 [ffff885f83bbbf50] int_signal at ffffffff8100b341
    RIP: 00007f18e321298e  RSP: 00007f18d2762b50  RFLAGS: 00000206
    RAX: fffffffffffffdfc  RBX: 0000000000000000  RCX: ffffffffffffffff
    RDX: 00000000000029ee  RSI: 0000000000000189  RDI: 00007f18e3e0c624
    RBP: 00007f18d2762be0   R8: 00007f18e3e0c5e0   R9: 00000000ffffffff
    R10: 00007f18d2762be0  R11: 0000000000000206  R12: 00000000000029ee
    R13: 00007f18d2762be0  R14: ffffffffffffff92  R15: 0000000000000000
    ORIG_RAX: 00000000000000ca  CS: 0033  SS: 002b
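Because each hypervisor crash appears to be in a different function, it can help to reduce every backtrace to its faulting function and the modules involved before comparing them. The following is a rough sketch that assumes the backtrace text has been saved in the format shown above. The function `parse_backtrace` and both regular expressions are hypothetical helpers written for this illustration; they are not part of the crash utility.

```python
import re

# Matches frame lines such as:
#   #7 [ffff885f83bbbb10] rmap_remove at ffffffffa02f17f0 [kvm]
FRAME_RE = re.compile(
    r"^\s*#(?P<num>\d+)\s+\[(?P<sp>[0-9a-f]+)\]\s+(?P<func>\S+)"
    r"\s+at\s+(?P<addr>[0-9a-f]+)(?:\s+\[(?P<module>[^\]]+)\])?\s*$"
)

# Matches the exception marker, e.g. "[exception RIP: rmap_remove+432]".
EXC_RE = re.compile(r"\[exception RIP:\s*(?P<func>[\w.]+)\+(?P<offset>\d+)\]")


def parse_backtrace(text):
    """Return (frames, exception) parsed from backtrace text.

    frames is a list of (frame number, function, module or None);
    exception is (function, offset) for the first exception RIP, or None.
    """
    frames = []
    exception = None
    for line in text.splitlines():
        frame = FRAME_RE.match(line)
        if frame:
            frames.append(
                (int(frame.group("num")), frame.group("func"), frame.group("module"))
            )
            continue
        exc = EXC_RE.search(line)
        if exc and exception is None:
            exception = (exc.group("func"), int(exc.group("offset")))
    return frames, exception


# A few lines copied from the backtrace above.
sample = """\
 #6 [ffff885f83bbba60] invalid_op at ffffffff8100bf9b
    [exception RIP: rmap_remove+432]
 #7 [ffff885f83bbbb10] rmap_remove at ffffffffa02f17f0 [kvm]
 #8 [ffff885f83bbbb40] kvm_mmu_zap_page at ffffffffa02f1f14 [kvm]
"""

frames, exception = parse_backtrace(sample)
print("Faulting function:", exception)
print("Frames in kvm module:", [func for _, func, mod in frames if mod == "kvm"])
```

Running the same parser over the backtrace from each crashed hypervisor gives a compact list of faulting functions that can be compared side by side.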

Environment

  • Red Hat Enterprise Virtualization - 3.3
  • RHEL 6.5 hypervisor, kernel 2.6.32-431.20.3.el6.x86_64
