System crash during IO errors on nbd devices

Issue

  • During connectivity problems with nbd devices, I/O requests began failing with the following errors. Shortly after these errors, the system crashed:

    block nbd1: Connection timed out, retrying (3/3 alive)
    block nbd1: Connection timed out, retrying (2/3 alive)
    block nbd1: Receive control failed (result -32)
    block nbd1: Connection timed out, retrying (2/3 alive)
    block nbd1: Connection timed out, retrying (1/3 alive)
    block nbd1: Receive control failed (result -32)
    block nbd1: NBD_DISCONNECT
    block nbd1: Send disconnect failed -32
    block nbd1: Send disconnect failed -32
    block nbd1: Disconnected due to user request.
    block nbd1: shutting down sockets
    block nbd1: Connection timed out, retrying (0/3 alive)
    block nbd1: Connection timed out, retrying (0/3 alive)
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000118
    PGD 8000000107920067 
    blk_update_request: I/O error, dev nbd1, sector 7632 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
    [...]
    block nbd1: NBD_DISCONNECT
    Oops: 0000 [#1] SMP PTI
    CPU: 0 PID: 108135 Comm: kworker/0:0H Kdump: loaded Tainted: G        W         -------- -  - 4.18.0-553.8.1.el8_10.x86_64 #1
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
    Workqueue: kblockd blk_mq_requeue_work
    block nbd1: Send disconnect failed -32        
    RIP: 0010:dd_insert_requests+0x30/0x330
    block nbd1: Send disconnect failed -32
    Code: 57 41 56 41 55 41 54 55 53 48 83 ec 40 48 89 7c 24 10 48 89 34 24 88 54 24 1f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 <48> 8b 87 18 01 00 00 48 8b 40 18 48 8b 40 08 48 05 20 01 00 00 48
    RSP: 0018:ffffad702d3f7d80 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffffa02f65844990 RCX: ffffa02f658449d8
    block nbd1: Send disconnect failed -32
    RDX: 0000000000000001 RSI: ffffad702d3f7e00 RDI: 0000000000000000
    XFS (dm-15): log I/O error -5
    RBP: 0000000000000001 R08: ffffad702d3f7e00 R09: 00646b636f6c626b
    R10: 8080808080808080 R11: ffffffffa685c048 R12: ffffa02e9679d400
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:ffffa0355fc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000118 CR3: 0000000107924000 CR4: 00000000000006f0
    Call Trace:
     ? __die_body+0x1a/0x60
     ? no_context+0x1ba/0x3f0
     ? __bad_area_nosemaphore+0x157/0x180
     ? insert_work+0x65/0xc0
     ? do_page_fault+0x37/0x12d
     ? page_fault+0x1e/0x30
     ? dd_insert_requests+0x30/0x330
     blk_mq_sched_insert_request+0xc2/0x140
     blk_mq_requeue_work+0x10d/0x180
     process_one_work+0x1d3/0x390
     ? process_one_work+0x390/0x390
     worker_thread+0x30/0x390
     ? process_one_work+0x390/0x390
     kthread+0x134/0x150
     ? set_kthread_struct+0x50/0x50
     ret_from_fork+0x35/0x40
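
The register dump above can be cross-checked against the faulting instruction. The following is a minimal sketch, not a general disassembler: it handles only the single encoding seen at the `<48>` marker (`REX.W 8B /r` with a 32-bit displacement), and the helper names `parse_registers`, `faulting_bytes`, and `decode_mov_load` are hypothetical, introduced for illustration. It uses an excerpt of the oops with the `Code:` line trimmed to the bytes around the fault. Under those assumptions it suggests that the load at `dd_insert_requests+0x30` reads from offset 0x118 of RDI, that RDI was 0, and that the resulting address matches CR2. For real analysis, the `dis` command of the crash utility on the vmcore is the more reliable tool.

```python
import re

# Excerpt of the oops above: only the lines needed for the check, with the
# Code: line trimmed to the bytes around the faulting instruction.
OOPS = """\
RIP: 0010:dd_insert_requests+0x30/0x330
Code: 48 89 44 24 38 31 c0 <48> 8b 87 18 01 00 00 48 8b 40 18
RDX: 0000000000000001 RSI: ffffad702d3f7e00 RDI: 0000000000000000
CR2: 0000000000000118 CR3: 0000000107924000 CR4: 00000000000006f0
"""

# 64-bit register names indexed by their 3-bit ModRM encoding (no REX.R/B).
REGS = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"]


def parse_registers(text):
    """Return {name: value} for every 'NAME: <16 hex digits>' pair."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(text):
    """Return the Code: bytes starting at the one marked <xx>."""
    code_line = next(l for l in text.splitlines() if l.startswith("Code:"))
    tokens = code_line.split()[1:]
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return bytes(int(t.strip("<>"), 16) for t in tokens[start:])


def decode_mov_load(insn):
    """Decode 'REX.W 8B /r' with a 32-bit displacement: mov disp32(base), dst."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod == 2 selects base + disp32; rm == 4 would mean a SIB byte follows.
    if rex != 0x48 or opcode != 0x8B or mod != 2 or rm == 4:
        raise ValueError("encoding not handled by this sketch")
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return REGS[reg], REGS[rm], disp


regs = parse_registers(OOPS)
dst, base, disp = decode_mov_load(faulting_bytes(OOPS))
fault_addr = regs[base] + disp
print(f"mov {disp:#x}(%{base.lower()}), %{dst.lower()}")
print(f"{base}={regs[base]:#x} -> access at {fault_addr:#x}, CR2={regs['CR2']:#x}")
```

On x86-64, RDI carries a function's first argument on entry, so a zero RDI this early in `dd_insert_requests` is consistent with a NULL first argument being passed down from `blk_mq_sched_insert_request`. That reading is an inference from the register dump alone and would need to be confirmed against the vmcore.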
    

Environment

  • Red Hat Enterprise Linux 8.10
    • kernel-4.18.0-553.8.1.el8_10
  • Red Hat Enterprise Linux 9
    • All Red Hat Enterprise Linux 9 kernel versions older than 9.6 GA (5.14.0-570.12.1.el9_6)
  • Network Block Device (NBD)
