NFS v4.1 mount stuck, which freezes qemu-kvm

Issue

Check the kernel log and search for "task ... blocked for more than 120 seconds" using dmesg -T or journalctl -b -t kernel.

Here we can see that the qemu process was hung, and the stack trace reveals the NFS nfs_getattr operation:

[Tue Dec  8 04:53:01 2020] INFO: task qemu-kvm:85258 blocked for more than 120 seconds.
[Tue Dec  8 04:53:01 2020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Dec  8 04:53:01 2020] qemu-kvm        D ffff9f6936cec1c0     0 85258      1 0x000001a0
[Tue Dec  8 04:53:01 2020] Call Trace:
[Tue Dec  8 04:53:01 2020]  [<ffffffff94b0598c>] ? avc_has_perm_flags+0xdc/0x1c0
[Tue Dec  8 04:53:01 2020]  [<ffffffff94f7eb09>] schedule+0x29/0x70
[Tue Dec  8 04:53:01 2020]  [<ffffffff94a66139>] inode_dio_wait+0xd9/0x100
[Tue Dec  8 04:53:01 2020]  [<ffffffff948c6360>] ? wake_bit_function+0x40/0x40
[Tue Dec  8 04:53:01 2020]  [<ffffffffc0784e46>] nfs_getattr+0x1b6/0x250 [nfs]
[Tue Dec  8 04:53:01 2020]  [<ffffffff94a4e959>] vfs_getattr+0x49/0x80
[Tue Dec  8 04:53:01 2020]  [<ffffffff94a4e9d5>] vfs_fstat+0x45/0x80
[Tue Dec  8 04:53:01 2020]  [<ffffffff94a4ef44>] SYSC_newfstat+0x24/0x60
[Tue Dec  8 04:53:01 2020]  [<ffffffff9493c8e4>] ? __audit_syscall_entry+0xb4/0x110
[Tue Dec  8 04:53:01 2020]  [<ffffffff9483c07b>] ? syscall_trace_enter+0x16b/0x220
[Tue Dec  8 04:53:01 2020]  [<ffffffff94a4f31e>] SyS_newfstat+0xe/0x10
[Tue Dec  8 04:53:01 2020]  [<ffffffff94f8c172>] tracesys+0xa6/0xcc

The VM VGA console is not working, however on the serial console we can see the I/O errors:

[stack@undercloud.. ~]$openstack console log show 541728a1-1d6d-4878-8e84-b963decdf96f
[...]
vm login: [11179774.749141] blk_update_request: I/O error, dev vda, sector 139675920
[11179774.773597] blk_update_request: I/O error, dev vda, sector 139675920
[11179774.779591] blk_update_request: I/O error, dev vda, sector 85440272
[stack@undercloud.. ~]$

The mountpoint it is a NFS v4.1 file system:

[root@compute2 ~]# mount |grep nfs
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
192.168.1.241:/svm01_nfs_cinder_vol01 on /var/lib/nova/mnt/b673de1fa9b34054adb83e620007e491 type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.57,local_lock=none,addr=192.168.1.241)

The VM is still stuck in newfstat(2) syscall, that is waiting for NFS subsystem. The numver 85258 is the PID of the qemu-kvm process.

[root@compute2 ~]# cat /proc/85258/stack
[<ffffffff94a66139>] inode_dio_wait+0xd9/0x100
[<ffffffffc0784e46>] nfs_getattr+0x1b6/0x250 [nfs]
[<ffffffff94a4e959>] vfs_getattr+0x49/0x80
[<ffffffff94a4e9d5>] vfs_fstat+0x45/0x80
[<ffffffff94a4ef44>] SYSC_newfstat+0x24/0x60
[<ffffffff94a4f31e>] SyS_newfstat+0xe/0x10
[<ffffffff94f8c172>] tracesys+0xa6/0xcc
[<ffffffffffffffff>] 0xffffffffffffffff

If you capture the network traffic towards the NFS server, you can see NFS v4.1 specific COMPOUND operations like the one bellow

Frame 4: 4402 bytes on wire (35216 bits), 4402 bytes captured (35216 bits)
Network File System, Ops(3): SEQUENCE, PUTFH, WRITE
    [Program Version: 4]
    [V4 Procedure: COMPOUND (1)]
    Tag: <EMPTY>
    minorversion: 1
    Operations (count: 3): SEQUENCE, PUTFH, WRITE
        Opcode: SEQUENCE (53)
            sessionid: 0000003765bb82000000000000000403
            seqid: 0x00441ad2
            slot id: 5
            high slot id: 11
            cache this?: Yes
        Opcode: PUTFH (22)
        Opcode: WRITE (38)
    [Main Opcode: WRITE (38)]

The COMPOUND operation fails with NFS4ERR_SEQ_MISORDERED error:

Frame 7: 114 bytes on wire (912 bits), 114 bytes captured (912 bits)
Network File System, Ops(1): SEQUENCE(NFS4ERR_SEQ_MISORDERED)
    [Program Version: 4]
    [V4 Procedure: COMPOUND (1)]
    Status: NFS4ERR_SEQ_MISORDERED (10063)
    Tag: <EMPTY>
    Operations (count: 1)
        Opcode: SEQUENCE (53)
            Status: NFS4ERR_SEQ_MISORDERED (10063)
    [Main Opcode: SEQUENCE (53)]

Environment

Red Hat OpenStack Platform 10
Red Hat Enterprise Linux 7

Subscriber exclusive content

A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.

Select Your Language

NFS v4.1 mount stuck, which freezes qemu-kvm

Issue

Environment

Subscriber exclusive content

Current Customers and Partners

New to Red Hat?

Using a Red Hat product through a public cloud?

Quick Links

Help

Site Info

Related Sites

About

Red Hat legal and privacy links

Red Hat legal and privacy links

Issue

Environment

Subscriber exclusive content

Current Customers and Partners

New to Red Hat?

Using a Red Hat product through a public cloud?

Quick Links

Help

Site Info

Related Sites

Systems Status

About

Red Hat legal and privacy links

Red Hat legal and privacy links