NVMe volumes go offline/reset due to nvme io_timeout - "QID 1 timeout, aborting"
Issue
- Systems using NVMe volumes can lose access to them when a relatively short io_timeout expires and triggers I/O errors.
- When such a volume holds the root filesystem and timeouts occur, the instance can become unresponsive.
- How to set the io_timeout value to prevent timeouts with NVMe storage.
Environment
- Red Hat Enterprise Linux (RHEL) 7
- Red Hat Enterprise Linux 8
- Red Hat Enterprise Linux 9
- Non-Volatile Memory Express (NVMe)
- Virtual or emulated environments where NVMe volumes are in use
- nvme_core.io_timeout set to low values for emulated devices
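To check whether a system matches this environment, the timeout currently in effect can be read from sysfs. The following is a minimal sketch, assuming the kernel exposes the parameter at /sys/module/nvme_core/parameters/io_timeout (the usual location when the nvme_core module is loaded; older kernels may differ). The helper name show_io_timeout and the PARAM_FILE variable are illustrative and not part of any Red Hat tooling.

```shell
#!/bin/sh
# Assumed location of the nvme_core io_timeout parameter (value is in seconds).
# PARAM_FILE can be overridden from the environment; the name is illustrative.
PARAM_FILE="${PARAM_FILE:-/sys/module/nvme_core/parameters/io_timeout}"

# Hypothetical helper: print the timeout stored in the file given as $1,
# or report that it cannot be read and return a non-zero status.
show_io_timeout() {
    if [ -r "$1" ]; then
        printf 'nvme_core.io_timeout = %s seconds\n' "$(cat "$1")"
    else
        printf 'nvme_core.io_timeout not available (%s not readable)\n' "$1"
        return 1
    fi
}

# Report the current value; do not abort if the module is not loaded.
show_io_timeout "$PARAM_FILE" || true
```

A small value here, on a virtual or emulated NVMe device, would be consistent with the conditions listed above.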