I/O operations to NVMe devices are failing

Solution Verified - Updated -

Issue

  • I/O operations to NVMe devices are failing.
  • A volume group is missing after an NVMe disk failure.
  • Commands to NVMe devices are getting blocked.

The kernel log shows messages similar to the following:

Jun 29 16:46:26 hostname kernel: nvme nvme0: I/O 256 QID 89 timeout, reset controller
Jun 29 16:47:31 hostname kernel: nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
Jun 29 16:48:12 hostname kernel: INFO: task jbd2/nvme0n1p1-:14377 blocked for more than 120 seconds.
Jun 29 16:48:12 hostname kernel:      Not tainted 4.18.0-305.3.1.el8_4.ppc64le #1
Jun 29 16:48:12 hostname kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 29 16:48:12 hostname kernel: jbd2/nvme0n1p1- D    0 14377      2 0x00000888
Jun 29 16:48:12 hostname kernel: Call Trace:
Jun 29 16:48:12 hostname kernel: [c0002072991837b0] [c0002072a2855c60] 0xc0002072a2855c60 (unreliable)
Jun 29 16:48:12 hostname kernel: [c000207299183990] [c000000000018400] __switch_to+0x2e0/0x500
Jun 29 16:48:12 hostname kernel: [c0002072991839f0] [c000000000ed9198] __schedule+0x2f8/0x9c0
Jun 29 16:48:12 hostname kernel: [c000207299183ac0] [c000000000ed98c8] schedule+0x68/0x130
Jun 29 16:48:12 hostname kernel: [c000207299183af0] [c00800001bd85f18] jbd2_journal_commit_transaction+0x270/0x2220 [jbd2]
Jun 29 16:48:12 hostname kernel: [c000207299183d20] [c00800001bd900dc] kjournald2+0x104/0x380 [jbd2]
Jun 29 16:48:12 hostname kernel: [c000207299183db0] [c0000000001a2f30] kthread+0x1b0/0x1c0
Jun 29 16:48:12 hostname kernel: [c000207299183e20] [c00000000000b7d8] ret_from_kernel_thread+0x5c/0x64
Jun 29 16:48:13 hostname kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x3
Jun 29 16:48:56 hostname kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x3
Jun 29 16:48:56 hostname kernel: nvme nvme0: Removing after probe failure status: -19
Jun 29 16:48:56 hostname kernel: blk_update_request: I/O error, dev nvme0n1, sector 11872 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
Jun 29 16:48:56 hostname kernel: blk_update_request: I/O error, dev nvme0n1, sector 12605455 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
Jun 29 16:48:56 hostname kernel: EXT4-fs error (device nvme0n1p1): ext4_read_inode_bitmap:200: comm dd: Cannot read inode bitmap - block_group = 0, inode_bitmap = 1228
Jun 29 16:48:56 hostname kernel: blk_update_request: I/O error, dev nvme0n1, sector 12605198 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Jun 29 16:48:56 hostname kernel: blk_update_request: I/O error, dev nvme0n1, sector 12604941 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0

Jun 29 16:48:56 hostname kernel: Aborting journal on device nvme0n1p1-8.
Jun 29 16:48:56 hostname kernel: EXT4-fs error (device nvme0n1p1) in __ext4_new_inode:950: Journal has aborted
Jun 29 16:48:56 hostname kernel: Buffer I/O error on dev nvme0n1p1, logical block 195067904, lost sync page write
Jun 29 16:48:56 hostname kernel: Buffer I/O error on dev nvme0n1p1, logical block 0, lost sync page write
Jun 29 16:48:56 hostname kernel: JBD2: Error -5 detected when updating journal superblock for nvme0n1p1-8.
Jun 29 16:48:56 hostname kernel: EXT4-fs (nvme0n1p1): I/O error while writing superblock
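
The values in the messages above can be decoded mechanically. The following is a minimal sketch, not part of the original solution, that extracts the CSTS register value and the probe failure status from log lines of the shape shown above. The function names `decode_csts` and `summarize` are hypothetical. The bit positions follow the Controller Status (CSTS) register layout in the NVMe base specification, and the regular expressions are assumed to match only the message formats seen in this log. Under those assumptions, CSTS=0x3 has both the ready bit and the controller fatal status bit set, and status -19 corresponds to ENODEV.

```python
import errno
import re

# CSTS (Controller Status) register bits, per the NVMe base specification.
CSTS_RDY = 1 << 0        # bit 0: controller ready
CSTS_CFS = 1 << 1        # bit 1: controller fatal status
CSTS_SHST_SHIFT = 2      # bits 3:2: shutdown status
CSTS_SHST_MASK = 0x3

SHST_NAMES = {
    0: "normal operation",
    1: "shutdown in progress",
    2: "shutdown complete",
    3: "reserved",
}

# Assumption: these patterns only cover the message shapes in the log above.
CSTS_RE = re.compile(r"nvme (nvme\d+): .*CSTS=(0x[0-9a-fA-F]+)")
STATUS_RE = re.compile(
    r"nvme (nvme\d+): Removing after probe failure status: (-?\d+)"
)


def decode_csts(value):
    """Split a raw CSTS value into its ready, fatal, and shutdown fields."""
    return {
        "ready": bool(value & CSTS_RDY),
        "fatal": bool(value & CSTS_CFS),
        "shutdown": SHST_NAMES[(value >> CSTS_SHST_SHIFT) & CSTS_SHST_MASK],
    }


def summarize(lines):
    """Return (controller, kind, detail) tuples for recognized nvme messages."""
    findings = []
    for line in lines:
        match = CSTS_RE.search(line)
        if match:
            csts = decode_csts(int(match.group(2), 16))
            findings.append((match.group(1), "csts", csts))
            continue
        match = STATUS_RE.search(line)
        if match:
            # The kernel prints a negative errno; map it to its symbolic name.
            code = -int(match.group(2))
            name = errno.errorcode.get(code, "unknown")
            findings.append((match.group(1), "removed", name))
    return findings
```

Feeding the saved kernel log to `summarize` line by line gives a compact view of which controller reported a fatal status and why it was removed. A result like this narrows the investigation but does not by itself identify the root cause.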

Environment

  • Red Hat Enterprise Linux (RHEL) 8
  • NVMe devices
