RHEL6.2: NFS stops responding for up to 20 seconds and then keeps going when copying large files.

Solution Verified

Issue

  • On RHEL 6.2 (kernel version 2.6.32-220.23.1.el6.x86_64), NFS stops responding for up to 20 seconds and then resumes while large files are being copied.
  • During this time the network responds to pings but no NFS traffic is seen.
  • The problem has been seen with a bond made up of tg3 interfaces only, with a bond of one tg3 and one e1000e interface, and with a standalone e1000e interface (no bonding).
  • The issue usually occurs once or twice a day. To provoke it faster, the customer runs the following commands in a loop (it is not clear whether these are the exact commands used or only an illustration of the approach):
   cp -f /NFS/largefile /NFS/largefile.1
   sleep 60
  • The hung-task backtrace, followed by the resulting kernel panic, is as follows:
INFO: task cp:24962 blocked for more than 10 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cp            D 0000000000000002     0 24962   2659 0x00000080
 ffff88081fc4dc78 0000000000000082 0000000000000000 ffff8800366128a8
 ffff88081fc4dbe8 ffffffff81012b59 ffff88081fc4dc28 ffffffff8109b949
 ffff8808328805f8 ffff88081fc4dfd8 000000000000f4e8 ffff8808328805f8
Call Trace:
 [<ffffffff81012b59>] ? read_tsc+0x9/0x20
 [<ffffffff8109b949>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff81110d60>] ? sync_page+0x0/0x50
 [<ffffffff814ed9e3>] io_schedule+0x73/0xc0
 [<ffffffff81110d9d>] sync_page+0x3d/0x50
 [<ffffffff814ee39f>] __wait_on_bit+0x5f/0x90
 [<ffffffff81110f53>] wait_on_page_bit+0x73/0x80
 [<ffffffff81090d70>] ? wake_bit_function+0x0/0x50
 [<ffffffff811273f5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff8111136b>] wait_on_page_writeback_range+0xfb/0x190
 [<ffffffff81111538>] filemap_write_and_wait_range+0x78/0x90
 [<ffffffff811a589e>] vfs_fsync_range+0x7e/0xe0
 [<ffffffff811a596d>] vfs_fsync+0x1d/0x20
 [<ffffffffa02f96d0>] nfs_file_flush+0x70/0xa0 [nfs]
 [<ffffffff81173e4c>] filp_close+0x3c/0x90
 [<ffffffff81173f45>] sys_close+0xa5/0x100
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Kernel panic - not syncing: hung_task: blocked tasks
Pid: 177, comm: khungtaskd Tainted: G        W  ----------------   2.6.32-220.23.1.el6.x86_64 #1
Call Trace:
 [<ffffffff814ecb34>] ? panic+0x78/0x143
 [<ffffffff810d8bd7>] ? watchdog+0x217/0x220
 [<ffffffff810d89c0>] ? watchdog+0x0/0x220
 [<ffffffff810909c6>] ? kthread+0x96/0xa0
 [<ffffffff8100c14a>] ? child_rip+0xa/0x20
 [<ffffffff81090930>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
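
The reproduction loop described above could be sketched roughly as follows. This is only an illustration and not the customer's actual script: the function names (copy_once, run_loop) and the variables (SRC, DST, INTERVAL, MAX_RUNS) are hypothetical, and the paths and the 60-second pause are simply taken from the commands quoted in the report. Each copy is timed with one-second resolution, so a stall of several seconds should show up as an unusually long copy.

```shell
#!/bin/bash
# Hypothetical reproduction loop; names and defaults are assumptions,
# with the paths and the 60-second pause taken from the report.
SRC="${SRC:-/NFS/largefile}"
DST="${DST:-/NFS/largefile.1}"
INTERVAL="${INTERVAL:-60}"   # seconds to sleep between copies
MAX_RUNS="${MAX_RUNS:-0}"    # 0 means loop until interrupted

# Copy the file once and print how long the copy took (whole seconds).
copy_once() {
    local start end
    start=$(date +%s)
    cp -f "$SRC" "$DST" || return 1
    end=$(date +%s)
    echo "$(date '+%F %T') copy took $((end - start))s"
}

# Repeat the copy, pausing INTERVAL seconds after each attempt.
run_loop() {
    local n=0
    while [ "$MAX_RUNS" -eq 0 ] || [ "$n" -lt "$MAX_RUNS" ]; do
        copy_once || echo "$(date '+%F %T') copy failed" >&2
        n=$((n + 1))
        sleep "$INTERVAL"
    done
}
```

After sourcing the file, calling run_loop starts the loop. Setting MAX_RUNS to a non-zero value bounds the number of copies, and redirecting the output to a file on local storage (not on the NFS mount) keeps a record of copy durations that can be compared with the timestamps of the hung-task messages.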

Environment

  • Red Hat Enterprise Linux 6.2
    • kernels from 6.2 (2.6.32-220.*el6); seen on 2.6.32-220.23.1.el6
    • NFS client
