QLogic qla2xxx firmware errors causing spinlock contention and deadlock among CPUs
Issue
- The system crashed and rebooted after the NMI watchdog detected a lockup on one of the CPUs. A root cause analysis is required.
- The kernel ring buffer logged numerous qla2xxx HBA firmware dump errors before the crash.
- A high number of outstanding in-flight I/O requests was seen on a couple of SCSI devices on the system.
qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
NMI Watchdog detected LOCKUP on CPU 5 << lockup on CPU 5
CPU 5
Modules linked in:<6>qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
seos(PU)<4>qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
twnotify(U) mptctl mptbase vxodm(PFU) vxfen(PU) gab(PU) llt(PU)<6>qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
autofs4<4>qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
hidp nfs fscache nfs_acl rfcomm l2cap bluetooth<6>qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
dmpaa(PU)<4>qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
vxspec(PFU) vxio(PFU) vxdmp(PU) lockd sunrpc be2iscsi(U)<6>qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
ib_iser<4>qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp<6>qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
bnx2i<4>qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3<6>qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
libiscsi_tcp<4>qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi vxportal(PFU) fdd(PFU) vxfs(PU)<6>qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
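Because the firmware messages are interleaved with the module list in the output above, counting them by eye is error-prone. The following is a minimal sketch, not part of the original analysis, that tallies the "RISC paused" events per HBA and extracts the CPU reported by the NMI watchdog. The function name `summarize` and the `sample` text are illustrative assumptions; the two patterns are taken only from the message formats shown above.

```python
import re
from collections import Counter

# Patterns derived from the message formats quoted in this article.
# They are unanchored so that messages interleaved with other console
# output (such as the "Modules linked in:" list) are still matched.
RISC_PAUSED = re.compile(r"qla2xxx (\S+): RISC paused -- HCCR=([0-9a-fA-F]+)")
NMI_LOCKUP = re.compile(r"NMI Watchdog detected LOCKUP on CPU (\d+)")


def summarize(log_text):
    """Return (Counter keyed by (PCI address, HCCR), list of locked-up CPUs)."""
    paused = Counter(RISC_PAUSED.findall(log_text))
    lockups = [int(cpu) for cpu in NMI_LOCKUP.findall(log_text)]
    return paused, lockups


# Hypothetical sample assembled from lines shown in this article.
sample = """\
qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
qla2xxx 0000:06:00.0: Firmware has been previously dumped (ffffc20010094000) -- ignoring request...
qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
NMI Watchdog detected LOCKUP on CPU 5
Modules linked in:<6>qla2xxx 0000:06:00.0: RISC paused -- HCCR=8040, Dumping firmware!
"""

paused, lockups = summarize(sample)
for (pci_addr, hccr), count in paused.items():
    print(f"{pci_addr} HCCR={hccr}: {count} RISC paused events")
print("NMI lockup reported on CPUs:", lockups)
```

In practice the text passed to `summarize` would come from a saved console log or the kernel log extracted from a vmcore. A large count against a single PCI address points at one adapter rather than a system-wide condition.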
Environment
- Red Hat Enterprise Linux 5.6
- QLogic HBA using the qla2xxx driver module
- Veritas Cluster
