Why do NFS server processes backed by an rbd device enter the uninterruptible 'D' state while waiting on I/O completion in RHCS?

Solution Verified

Issue

  • The nfsd processes are blocked for more than 120 seconds because I/O is stuck on the rbd device:

    [73437.433769] INFO: task nfsd:1230 blocked for more than 120 seconds.
    [73437.523742] INFO: task nfsd:1231 blocked for more than 120 seconds.
    [73437.663482] INFO: task nfsd:1232 blocked for more than 120 seconds.
    [73437.803338] INFO: task nfsd:1233 blocked for more than 120 seconds.
    [73437.942544] INFO: task nfsd:1234 blocked for more than 120 seconds.
    [73438.081642] INFO: task nfsd:1235 blocked for more than 120 seconds.
    [73438.221544] INFO: task nfsd:1236 blocked for more than 120 seconds.
    
  • /var/log/dmesg reports the nfsd processes stuck in io_schedule calls:

    kernel: INFO: task nfsd:14430 blocked for more than 120 seconds.
    kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    kernel: nfsd            D ffff905d45bde2a0     0 14430      2 0x00000080
    kernel: Call Trace:
    kernel:  [<ffffffffc05107a4>] ? ixgbe_xmit_frame+0x44/0x80 [ixgbe]
    kernel:  [<ffffffff8c97d1f0>] ? bit_wait+0x50/0x50
    kernel:  [<ffffffff8c97f229>] schedule+0x29/0x70
    kernel:  [<ffffffff8c97cbb1>] schedule_timeout+0x221/0x2d0
    kernel:  [<ffffffff8c87b026>] ? sch_direct_xmit+0x86/0x250
    kernel:  [<ffffffff8c3047e2>] ? ktime_get_ts64+0x52/0xf0
    kernel:  [<ffffffff8c97d1f0>] ? bit_wait+0x50/0x50
    kernel:  [<ffffffff8c97e79d>] io_schedule_timeout+0xad/0x130
    kernel:  [<ffffffff8c97e838>] io_schedule+0x18/0x20
    kernel:  [<ffffffff8c97d201>] bit_wait_io+0x11/0x50
    kernel:  [<ffffffff8c97cdb1>] __wait_on_bit_lock+0x61/0xc0
    kernel:  [<ffffffff8c3bb374>] __lock_page+0x74/0x90
    kernel:  [<ffffffff8c2c6280>] ? wake_bit_function+0x40/0x40
    kernel:  [<ffffffff8c47c062>] __generic_file_splice_read+0x5c2/0x5e0
    kernel:  [<ffffffff8c47a920>] ? page_cache_pipe_buf_release+0x20/0x20
    kernel:  [<ffffffff8c2a3667>] ? local_bh_enable+0x17/0x20
    kernel:  [<ffffffff8c89e584>] ? ip_finish_output+0x284/0x8d0
    kernel:  [<ffffffff8c586694>] ? __radix_tree_lookup+0x84/0xf0
    kernel:  [<ffffffff8c586694>] ? __radix_tree_lookup+0x84/0xf0
    kernel:  [<ffffffffc0a047c0>] ? nfsd_proc_create+0x5c0/0x5c0 [nfsd]
    kernel:  [<ffffffffc05aa083>] ? xfs_iget+0x513/0x860 [xfs]
    kernel:  [<ffffffffc069d20f>] ? cache_check+0xef/0x390 [sunrpc]
    kernel:  [<ffffffff8c47c464>] generic_file_splice_read+0x44/0x90
    kernel:  [<ffffffffc05a4d7e>] xfs_file_splice_read+0x16e/0x190 [xfs]
    kernel:  [<ffffffff8c47b285>] do_splice_to+0x75/0x90
    kernel:  [<ffffffff8c47b357>] splice_direct_to_actor+0xb7/0x200
    kernel:  [<ffffffffc0a05860>] ? fsid_source+0x60/0x60 [nfsd]
    kernel:  [<ffffffffc0a06d38>] nfsd_splice_read+0x68/0xa0 [nfsd]
    kernel:  [<ffffffffc0a17dbb>] nfsd4_encode_read+0x3ab/0x550 [nfsd]
    kernel:  [<ffffffffc0a21041>] nfsd4_encode_operation+0x81/0x1b0 [nfsd]
    kernel:  [<ffffffffc0a16670>] nfsd4_proc_compound+0x240/0x780 [nfsd]
    kernel:  [<ffffffffc0a01810>] nfsd_dispatch+0xe0/0x290 [nfsd]
    kernel:  [<ffffffffc0692323>] svc_process_common+0x3d3/0x7c0 [sunrpc]
    kernel:  [<ffffffffc0692813>] svc_process+0x103/0x190 [sunrpc]
    kernel:  [<ffffffffc0a0116f>] nfsd+0xdf/0x150 [nfsd]
    kernel:  [<ffffffffc0a01090>] ? nfsd_destroy+0x80/0x80 [nfsd]
    kernel:  [<ffffffff8c2c50d1>] kthread+0xd1/0xe0
    kernel:  [<ffffffff8c2c5000>] ? insert_kthread_work+0x40/0x40
    kernel:  [<ffffffff8c98cd24>] ret_from_fork_nospec_begin+0xe/0x2
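
When many nfsd threads hang at once, it can help to summarize which tasks the kernel has reported as blocked. The following is a minimal sketch in Python, assuming the kernel log has been saved as text and that the hung-task messages look like the ones shown above. The helper name find_hung_tasks is hypothetical and is not part of any Red Hat or Ceph tool.

```python
import re
from collections import defaultdict

# Matches both raw dmesg lines ("[73437.433769] INFO: task ...") and
# syslog-style lines ("kernel: INFO: task ...").
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) blocked for more than \d+ seconds"
)


def find_hung_tasks(log_text):
    """Return {task name: sorted list of PIDs} for every hung-task report.

    Hypothetical helper for illustration only.
    """
    tasks = defaultdict(set)
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            tasks[match.group("comm")].add(int(match.group("pid")))
    return {comm: sorted(pids) for comm, pids in tasks.items()}


if __name__ == "__main__":
    # Sample lines copied from the excerpts in this article.
    sample = """\
[73437.433769] INFO: task nfsd:1230 blocked for more than 120 seconds.
[73437.523742] INFO: task nfsd:1231 blocked for more than 120 seconds.
kernel: INFO: task nfsd:14430 blocked for more than 120 seconds.
"""
    for comm, pids in find_hung_tasks(sample).items():
        print(f"{comm}: {len(pids)} blocked task(s), PIDs {pids}")
```

The sketch only counts the reports. It does not explain why the I/O on the rbd device is stuck.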
    

Environment

  • Red Hat Enterprise Linux 7.x
  • Red Hat Ceph Storage 3.x
