24.21. Configuring Maximum Time for Error Recovery with eh_deadline


In most scenarios, you do not need to enable the eh_deadline parameter. It can be useful in specific cases, for example when a link loss occurs between a Fibre Channel switch and a target port and the Host Bus Adapter (HBA) does not receive Registered State Change Notifications (RSCNs). In that case, I/O requests and error recovery commands all time out rather than fail with an error. Setting eh_deadline in this environment puts an upper limit on the recovery time, so that multipath can retry the failed I/O on another available path.
However, if RSCNs are enabled, the HBA registers the link becoming unavailable, or both, eh_deadline provides no additional benefit, because the I/O and error recovery commands fail immediately, which allows multipath to retry.
The SCSI host object eh_deadline parameter enables you to configure the maximum amount of time that the SCSI error handling mechanism attempts to perform error recovery before stopping and resetting the entire HBA.
The value of eh_deadline is specified in seconds. The default setting is off, which disables the time limit and allows all of the error recovery to take place. In addition to setting the value for an individual host through sysfs, you can set a default value for all SCSI HBAs by using the scsi_mod.eh_deadline kernel parameter.
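As a minimal sketch of the sysfs route, the following shell functions read and set eh_deadline for every SCSI host. They assume that each host exposes the attribute as /sys/class/scsi_host/hostN/eh_deadline. The function names and the SCSI_HOST_DIR override are hypothetical and exist only for this illustration, and the value 10 in the usage note is an arbitrary example, not a recommendation.

```shell
#!/bin/sh
# Illustrative helpers, not part of any package.
# SCSI_HOST_DIR is a hypothetical override so the functions can be tried
# against a scratch directory; it defaults to the standard sysfs location.
SCSI_HOST_DIR="${SCSI_HOST_DIR:-/sys/class/scsi_host}"

# Print the current eh_deadline of each SCSI host ("off" or a number of seconds).
show_eh_deadline() {
    for host in "$SCSI_HOST_DIR"/host*; do
        [ -e "$host/eh_deadline" ] || continue
        printf '%s: %s\n' "${host##*/}" "$(cat "$host/eh_deadline")"
    done
}

# Set eh_deadline on each SCSI host.
# $1: deadline in seconds, or "off" to disable the time limit.
set_eh_deadline() {
    value="$1"
    for host in "$SCSI_HOST_DIR"/host*; do
        [ -e "$host/eh_deadline" ] || continue
        echo "$value" > "$host/eh_deadline"
    done
}
```

Run as root, `set_eh_deadline 10` followed by `show_eh_deadline` would apply and then display a 10-second limit on every host, and `set_eh_deadline off` would restore the default. Values written through sysfs do not persist across a reboot. To apply a default to all SCSI HBAs at boot, add an entry such as scsi_mod.eh_deadline=10 to the kernel command line instead.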
Note that when eh_deadline expires, the HBA is reset, which affects all target paths on that HBA, not only the failing one. As a consequence, I/O errors can occur if some of the redundant paths are not available for other reasons. Enable eh_deadline only if you have a fully redundant multipath configuration on all targets.