Amazon AWS "ena" driver hangs and resets with "TX hasn't completed"
Issue
- AWS instances intermittently perform a network interface hang and reset with logs:
ena: TX hasn't completed, qid X, index XXX. XXXXXXXX usecs from last napi execution, napi scheduled: 1
...
ena: NETDEV WATCHDOG: CPU: X: transmit queue X timed out 5208 ms
ena: Free uncompleted tx skb qid X idx 0xXXX
ena: ENA device version: 0.10
ena: ENA controller version: 0.0.1 implementation version 1
ena: Device reset completed successfully
- This looks like known resolved issue ena driver TX timeouts leading to softirq hangs but the error message is different.
Environment
- Red Hat Enterprise Linux 9.6
kernel-5.14.0-570.16.1.el9_6.x86_64
- Amazon AWS EC2 instance
ena
Elastic Network Adapter
- Script checking for kernel module configuration with:
modprobe -ac --show-exports
Subscriber exclusive content
A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.