NVMe volumes go offline/reset due to nvme io_timeout - "QID 1 timeout, aborting"

Solution Verified - Updated -

Issue

  • Systems using NVMe volumes can lose access to those volumes when a relatively tight io_timeout triggers I/O errors.
  • When such a volume is used for root and timeouts occur, the instance can become unresponsive.
  • How to set the io_timeout value to prevent timeouts with NVMe storage.
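
To confirm that a system is hitting this condition, the kernel log can be searched for the message quoted in the title. The following is a minimal sketch, not part of the original solution. It assumes the log text has already been collected (for example from dmesg or journalctl output) and that the message keeps the form "QID <number> timeout, aborting". The function name find_timeout_qids is hypothetical.

```python
import re
from typing import Iterable, List

# Pattern based on the message in the article title: "QID 1 timeout, aborting".
# Assumption: the queue ID is a decimal number and the wording is unchanged.
TIMEOUT_RE = re.compile(r"QID (\d+) timeout, aborting")


def find_timeout_qids(lines: Iterable[str]) -> List[int]:
    """Return the queue IDs of every NVMe timeout message found in `lines`."""
    qids = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            qids.append(int(match.group(1)))
    return qids
```

Passing the lines of a saved kernel log to find_timeout_qids returns one entry per matching message, so an empty list suggests the timeout was not logged in that capture.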

Environment

  • Red Hat Enterprise Linux 7 (RHEL)
  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 9
  • Non-Volatile Memory Express (NVMe)
  • Virtual or emulated environments where NVMe volumes are in use
  • nvme_core.io_timeout set to low values for emulated devices
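
To check whether a system matches the last item, the current value of the parameter can be read from sysfs. This is a minimal sketch under stated assumptions: it assumes the nvme_core module exposes the parameter at /sys/module/nvme_core/parameters/io_timeout and that the value is an integer number of seconds. The function name read_io_timeout is hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the nvme_core.io_timeout module parameter.
DEFAULT_PARAM = Path("/sys/module/nvme_core/parameters/io_timeout")


def read_io_timeout(param_path: Path = DEFAULT_PARAM) -> Optional[int]:
    """Return the configured io_timeout, or None if it cannot be read."""
    try:
        return int(param_path.read_text().strip())
    except (FileNotFoundError, PermissionError, ValueError):
        # Module not loaded, file not readable, or unexpected content.
        return None
```

Calling read_io_timeout() with no arguments reads the assumed default path. A None result means the parameter was not available on that system, not that the timeout is disabled.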
