I/O operations to NVMe devices are failing
Issue
- I/O operations to NVMe devices are failing.
- A volume group is missing after an NVMe disk failure.
- Commands issued to NVMe devices are blocked.
- The kernel logs messages similar to the following:
Jun 29 16:46:26 hostname kernel: nvme nvme0: I/O 256 QID 89 timeout, reset controller
Jun 29 16:47:31 hostname kernel: nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
Jun 29 16:48:12 hostname kernel: INFO: task jbd2/nvme0n1p1-:14377 blocked for more than 120 seconds.
Jun 29 16:48:12 hostname kernel: Not tainted 4.18.0-305.3.1.el8_4.ppc64le #1
Jun 29 16:48:12 hostname kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 29 16:48:12 hostname kernel: jbd2/nvme0n1p1- D 0 14377 2 0x00000888
Jun 29 16:48:12 hostname kernel: Call Trace:
Jun 29 16:48:12 hostname kernel: [c0002072991837b0] [c0002072a2855c60] 0xc0002072a2855c60 (unreliable)
Jun 29 16:48:12 hostname kernel: [c000207299183990] [c000000000018400] __switch_to+0x2e0/0x500
Jun 29 16:48:12 hostname kernel: [c0002072991839f0] [c000000000ed9198] __schedule+0x2f8/0x9c0
Jun 29 16:48:12 hostname kernel: [c000207299183ac0] [c000000000ed98c8] schedule+0x68/0x130
Jun 29 16:48:12 hostname kernel: [c000207299183af0] [c00800001bd85f18] jbd2_journal_commit_transaction+0x270/0x2220 [jbd2]
Jun 29 16:48:12 hostname kernel: [c000207299183d20] [c00800001bd900dc] kjournald2+0x104/0x380 [jbd2]
Jun 29 16:48:12 hostname kernel: [c000207299183db0] [c0000000001a2f30] kthread+0x1b0/0x1c0
Jun 29 16:48:12 hostname kernel: [c000207299183e20] [c00000000000b7d8] ret_from_kernel_thread+0x5c/0x64
Jun 29 16:48:13 hostname kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x3
Jun 29 16:48:56 hostname kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x3
Jun 29 16:48:56 hostname kernel: nvme nvme0: Removing after probe failure status: -19
Jun 29 16:48:56 hostname kernel: blk_update_request: I/O error, dev nvme0n1, sector 11872 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
Jun 29 16:48:56 hostname kernel: blk_update_request: I/O error, dev nvme0n1, sector 12605455 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
Jun 29 16:48:56 hostname kernel: EXT4-fs error (device nvme0n1p1): ext4_read_inode_bitmap:200: comm dd: Cannot read inode bitmap - block_group = 0, inode_bitmap = 1228
Jun 29 16:48:56 hostname kernel: blk_update_request: I/O error, dev nvme0n1, sector 12605198 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Jun 29 16:48:56 hostname kernel: blk_update_request: I/O error, dev nvme0n1, sector 12604941 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Jun 29 16:48:56 hostname kernel: Aborting journal on device nvme0n1p1-8.
Jun 29 16:48:56 hostname kernel: EXT4-fs error (device nvme0n1p1) in __ext4_new_inode:950: Journal has aborted
Jun 29 16:48:56 hostname kernel: Buffer I/O error on dev nvme0n1p1, logical block 195067904, lost sync page write
Jun 29 16:48:56 hostname kernel: Buffer I/O error on dev nvme0n1p1, logical block 0, lost sync page write
Jun 29 16:48:56 hostname kernel: JBD2: Error -5 detected when updating journal superblock for nvme0n1p1-8.
Jun 29 16:48:56 hostname kernel: EXT4-fs (nvme0n1p1): I/O error while writing superblock
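The status values in the messages above can be decoded mechanically. The following Python sketch is illustrative only and is not part of the original report. It assumes the Controller Status (CSTS) bit layout from the NVMe base specification (bit 0 RDY, bit 1 CFS, bits 3:2 SHST) and Linux errno numbering. The helper names `decode_csts` and `errno_name` are hypothetical.

```python
import errno

# CSTS bit layout as assumed from the NVMe base specification.
CSTS_RDY = 1 << 0        # bit 0: controller ready
CSTS_CFS = 1 << 1        # bit 1: controller fatal status
CSTS_SHST_SHIFT = 2      # bits 3:2: shutdown status
CSTS_SHST_MASK = 0x3

SHST_NAMES = {
    0: "normal operation",
    1: "shutdown in progress",
    2: "shutdown complete",
    3: "reserved",
}


def decode_csts(value):
    """Split a raw CSTS register value into its main status fields."""
    return {
        "ready": bool(value & CSTS_RDY),
        "fatal": bool(value & CSTS_CFS),
        "shutdown": SHST_NAMES[(value >> CSTS_SHST_SHIFT) & CSTS_SHST_MASK],
    }


def errno_name(code):
    """Map a kernel-style negative status (for example -19) to its errno symbol."""
    return errno.errorcode.get(abs(code), "UNKNOWN")


if __name__ == "__main__":
    print(decode_csts(0x3))   # value from "CSTS=0x3" in the log
    print(errno_name(-19))    # value from "probe failure status: -19"
    print(errno_name(-5))     # value from "JBD2: Error -5"
```

Applied to the log above, CSTS=0x3 decodes to a controller that reports ready with the Controller Fatal Status bit set, status -19 maps to ENODEV, and the JBD2 error -5 maps to EIO.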
Environment
- Red Hat Enterprise Linux (RHEL) 8
- NVMe devices