RHEL 5 virtual machine hangs when doing NFS file copy over virtio-net network interface

Solution Verified - Updated -

Environment

  • RHEV hypervisor
  • RHEL 5.11 32-bit virtual machine
  • kernels 2.6.18-398.el5.i686 and 2.6.18-400.1.1.el5.i686
  • virtio-net network interface
  • RHEL 5 VM as NFS Client

Issue

  • When I try to write files on a remote NFS filesystem using the VirtIO device, the system hangs up. Doing the same procedure using e1000 driver, the problem doesn't occur.
  • New installation of RHEL 5.11 32-bit i686 virtual machine hangs when doing NFS file copy over virtio-net network interface

Resolution

Workaround

Use the e1000 virtual interface type, instead of the virtio-net interface type.

Diagnostic Steps

Time source is kvm-clock:

$ grep -i time var/log/dmesg 
ACPI: PM-Timer IO Port: 0xb008
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
Using local APIC timer interrupts.
WARNING calibrate_APIC_clock: the APIC timer calibration may be wrong.
Real Time Clock Driver v1.12ac
Time: kvm-clock clocksource has been installed.

Tried different clocksources but no difference:

clocksource=acpi_pm
clocksource=pit
clocksource=tsc

Captured vmcore. Logs show both IDE hard drive and network appear to have hung:

hdc: drive_cmd: status=0x41 { DriveReady Error }
hdc: drive_cmd: error=0x04 { AbortedCommand }
ide: failed opcode was: 0xec
nfs: server 10.0.0.144 not responding, still trying
nfs: server 10.0.0.144 not responding, still trying
nfs: server 10.0.0.144 not responding, still trying
INFO: task cp:3652 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cp            D 0000008F  1716  3652   3613                     (NOTLB)
       f79e7dc4 00200086 a636f013 0000008f c041d52b a9124215 0000008f 00000009 
       f7c6d190 a6428707 0000008f 000b96f4 00000000 f7c6d29c ca00c580 f7c39200 
       00000184 00000000 f79556cc f79e7e00 0006b0b2 c042e4c3 f79556cc ffffffff 
Call Trace:
 [<c041d52b>] pvclock_clocksource_read+0x101/0x117
 [<c042e4c3>] getnstimeofday+0x30/0xb6
 [<c0628148>] io_schedule+0x36/0x59
 [<c045bdfa>] sync_page+0x38/0x3b
 [<c062831f>] __wait_on_bit+0x33/0x58
 [<c045bdc2>] sync_page+0x0/0x3b
 [<c045be82>] wait_on_page_bit+0x5b/0x62
 [<c0438448>] wake_bit_function+0x0/0x3c
 [<c045c816>] wait_on_page_writeback_range+0x4d/0xf6
 [<c045cb90>] filemap_write_and_wait+0x1f/0x29
 [<f8ded274>] nfs_setattr+0xac/0x111 [nfs]
 [<c045ba2f>] file_read_actor+0x0/0xd5
 [<c045cfe3>] generic_file_aio_read+0x3d/0x44
 [<c047928f>] do_sync_read+0xb6/0xf1
 [<c042b2c6>] current_fs_time+0x4a/0x54
 [<c04923b5>] notify_change+0x117/0x2ae
 [<c047902a>] sys_fchmod+0xa4/0xc2
 [<c0404f9b>] syscall_call+0x7/0xb
 =======================
Kernel panic - not syncing: hung_task: blocked tasks

IDE Disk is virtual CD-ROm drive, but removing this shows no change.

Using NFS rsize=32768,wsize=32768 shows no change.

vmcore analysis:

  • panic context is khungtaskd, the other CPU core is idle:
crash> bt -a
PID: 191    TASK: f7f11450  CPU: 0   COMMAND: "khungtaskd"
 #0 [f7c76f50] crash_kexec at c0447179
 #1 [f7c76f94] panic at c0426641
 #2 [f7c76fb0] watchdog at c0451a75
 #3 [f7c76fcc] kthread at c0438358
 #4 [f7c76fe4] kernel_thread_helper at c0405cd5
PID: 0      TASK: f7c01050  CPU: 1   COMMAND: "swapper"
 #0 [f7c9ff08] crash_nmi_callback at c041b3f0
 #1 [f7c9ff54] do_nmi at c0406895
 #2 [f7c9ff7c] nmi at c0405bad
    EAX: 00000000  EBX: 00000001  ECX: c0403c24  EDX: f7c9f000  EBP: 00000000 
    DS:  007b      ESI: 0000e020  ES:  007b      EDI: 00000000
    CS:  0060      EIP: c0403c55  ERR: 00000000  EFLAGS: 00000246 
 #3 [f7c9ffb0] default_idle at c0403c55
 #4 [f7c9ffb4] cpu_idle at c0403d1a

Change context to the hung task:

crash> ps | grep cp
   2538   2462   0  ca3e8690  UN   0.0    4308    700  cp

crash> set 2538
    PID: 2538
COMMAND: "cp"
   TASK: ca3e8690  [THREAD_INFO: f7a36000]
    CPU: 0
  STATE: TASK_UNINTERRUPTIBLE 

However the backtrace shows the process is now at a different point of execution:

crash> bt
PID: 2538   TASK: ca3e8690  CPU: 0   COMMAND: "cp"
 #0 [f7a36e60] schedule at c0627b52
 #1 [f7a36ed8] io_schedule at c06281f3
 #2 [f7a36ee4] nfs_wait_bit_uninterruptible at f8df08a4 [nfs]
 #3 [f7a36ee8] __wait_on_bit at c06283cc
 #4 [f7a36f00] out_of_line_wait_on_bit at c0628451
 #5 [f7a36f38] nfs_wait_on_request at f8df0898 [nfs]
 #6 [f7a36f44] nfs_sync_inode_wait at f8df4c94 [nfs]
 #7 [f7a36f88] nfs_do_fsync at f8deb4d3 [nfs]
 #8 [f7a36f98] filp_close at c0477c92
 #9 [f7a36fa8] sys_close at c0478f07
#10 [f7a36fb8] system_call at c0404f94
    EAX: ffffffda  EBX: 00000004  ECX: bfbd7858  EDX: 080593a4 
    DS:  007b      ESI: 00000003  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bfbd4388  EBP: bfbd7508
    CS:  0073      EIP: 00651402  ERR: 00000006  EFLAGS: 00000246 

The hung task timeout may simply be too short.

The log and core contain io_schedule executing at different points, though the core is executing at a point earlier than the log, so that function may be looping.

Increasing the hung task timeout may relieve the issue:

sysctl -w kernel.hung_task_timeout_secs=240

If not, using the e1000 interface provides a workaround, use this.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

Comments