Soft lockups occur in RHEL KVM guests on a Red Hat Enterprise Linux 6 host under disk I/O intensive, CPU non-intensive workloads.

Solution Verified - Updated -

Issue

  • Under a disk I/O intensive, CPU non-intensive workload (for example, when a file is read with the dd command in a KVM guest), a soft lockup occurs about one time in ten, and the read takes longer than usual.
    • On the disk I/O performance meter, disk I/O in the guest OS appears to stop while the problem is occurring.
    • Programs running on RHEL KVM guests do not perform reliably.
    • The long delays caused by this problem also affect services running on RHEL KVM guests.
  • Although the problem is easiest to detect on RHEL5.5, it occurs on RHEL4, RHEL5 and RHEL6 KVM guests, not only on RHEL5.
  • On RHEL5.5:
    • The problem is easy to detect from the soft lockup message, because the hung task detector was introduced in this release. The message looks as follows:
    • kernel: BUG: soft lockup - CPU#1 stuck for 16s! [swapper:0] 
      kernel: 
      kernel: Pid: 0, comm: swapper 
      kernel: EIP: 0060:[<c0403be1>] CPU: 1 
      kernel: EIP is at default_idle+0x31/0x59 
      kernel: EFLAGS: 00000246 Not tainted (2.6.18-194.el5 #1) 
      kernel: EAX: 00000000 EBX: 00000001 ECX: c0403bb0 EDX: f7d4a000 
      kernel: ESI: 000084f4 EDI: 00000000 EBP: 00000000 DS: 007b ES: 007b 
      kernel: CR0: 8005003b CR2: bfec2ee4 CR3: 01bb3000 CR4: 00000690 
      kernel: [<c0403ca8>] cpu_idle+0x9f/0xb9 
      kernel: =======================
      
  • On RHEL4 and RHEL5 up to RHEL5.4:
    • The problem occurs, but no soft lockup message is logged.
  • On RHEL5.6 and later (including RHEL6):
    • These releases are also affected.
    • The problem is a little harder to reproduce here.
    • Using native AIO on the host, which was introduced in RHEL6.1, makes the problem reproduce more quickly.
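
The symptom described above (roughly one read in ten taking much longer than the rest) can be checked by timing repeated sequential reads inside the guest. The following is a minimal sketch, not the procedure used in this solution: the function names, the 1 MiB chunk size, and the "3 times the median" threshold are all assumptions chosen for illustration. It also assumes the test file is larger than guest memory, or that the page cache is dropped between runs, so that each read actually reaches the disk.

```python
import statistics
import time

# Assumed chunk size (1 MiB), similar in spirit to running dd with bs=1M.
CHUNK_SIZE = 1024 * 1024


def timed_read(path, chunk_size=CHUNK_SIZE):
    """Read the whole file sequentially and return the elapsed seconds."""
    start = time.monotonic()
    # buffering=0 disables Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while f.read(chunk_size):
            pass
    return time.monotonic() - start


def flag_slow_runs(durations, factor=3.0):
    """Return the indices of runs slower than `factor` times the median.

    The factor is a hypothetical threshold, not a value from this solution.
    """
    if not durations:
        return []
    median = statistics.median(durations)
    return [i for i, d in enumerate(durations) if d > factor * median]


def run_trials(path, runs=10):
    """Time `runs` sequential reads of `path` and flag the slow ones."""
    durations = [timed_read(path) for _ in range(runs)]
    return durations, flag_slow_runs(durations)
```

Calling `run_trials("/path/to/large/file", runs=10)` returns the per-run durations and the indices of the outliers. A flagged run only indicates a stall; it should be checked against the guest kernel log and the disk I/O performance meter before being attributed to this problem.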

Environment

  • Host : Red Hat Enterprise Linux 6
    • kernel version: 2.6.32-71.el6.x86_64 or later
    • qemu-kvm-0.12.1.2-2.113.el6.x86_64 or later
  • Guest: Red Hat Enterprise Linux 4 / 5 / 6
    • with/without virtio.
  • This problem has only been observed on Intel Nehalem CPUs.
