Linux guests under RHEV 3.0 hang or go to 'Down' status unexpectedly

Solution Unverified - Updated -

Issue

  • Guests under RHEV 3.0 unexpectedly hang (stop responding) or crash and reboot. This can happen under a number of circumstances:

    • Crash may occur during live migration, just after the "setting migration downtime to 300" message appears
    • Crash may occur when attempting to halt the guest
    • Crash may occur at other times
  • The guest crash may produce a backtrace such as:

--- <NMI exception stack> ---
 #6 [ffff81031fcf6fd8] iret_label at ffffffff8005d67c
    [ NMI exception stack recursion: prior stack location overwritten ]

Environment

  • Red Hat Enterprise Virtualization (RHEV-M) 3.0
    • RHEV 3.1 and 3.2 may also be affected, but it appears to be much harder to trigger this bug on those versions
  • Hypervisor (RHEL or RHEV-H):
    • kernel-2.6.32-358.el6.x86_64
    • qemu-kvm-0.12.1.2-2.295.el6_3.5
  • Linux guest (RHEL 5, RHEL 6, possibly other Linux distros)
  • NMI watchdog is enabled in guest
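
The last condition above, an NMI watchdog enabled in the guest, can be checked from inside the guest. The following is a minimal sketch and not part of the original solution. It assumes the guest kernel exposes /proc/cmdline and, on kernels that provide it, /proc/sys/kernel/nmi_watchdog. The helper names are hypothetical, and a missing file is reported as None rather than treated as "disabled".

```python
import re


def nmi_watchdog_from_cmdline(cmdline):
    """Return the value of nmi_watchdog= on a kernel command line, or None."""
    match = re.search(r"(?:^|\s)nmi_watchdog=(\S+)", cmdline)
    return match.group(1) if match else None


def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except (IOError, OSError):
        return None


def nmi_watchdog_status():
    """Collect NMI watchdog hints from the running guest (assumed paths)."""
    cmdline = read_text("/proc/cmdline") or ""
    return {
        # Boot-time setting, if one was passed to the kernel
        "cmdline": nmi_watchdog_from_cmdline(cmdline),
        # Runtime setting; this file may not exist on every kernel
        "sysctl": read_text("/proc/sys/kernel/nmi_watchdog"),
    }


if __name__ == "__main__":
    status = nmi_watchdog_status()
    print("nmi_watchdog on kernel command line: %s" % status["cmdline"])
    print("kernel.nmi_watchdog sysctl: %s" % status["sysctl"])
```

A non-zero value from either source suggests the watchdog is active in that guest. If both are None, the result is inconclusive, since the kernel may use a built-in default.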
