System page fault in queue_process() and recursive lock in netpoll_queue()
Issue
- The system dereferenced a NULL pointer in the queue_process() routine, which caused a page fault. The fault-handling code then called netconsole to report the error. Because queue_lock was already held when the page fault occurred, the attempt to acquire it again in netpoll_queue() was a recursive lock attempt. As queue_lock is a spin lock, the CPU spun in a loop that it could never exit, and the NMI watchdog eventually issued an NMI that caused an outage.
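The recursive lock attempt described above can be illustrated with a small user-space sketch. This is not the kernel code: the `_sim` functions, the `spin_lock_bounded()` helper, and the spin limit are all hypothetical names and values chosen for illustration, and the lock is modeled with a C11 `atomic_flag`. The bounded spin stands in for the kernel's unbounded spin_lock() so that the example terminates instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for queue_lock: a non-recursive spin lock. */
static atomic_flag queue_lock = ATOMIC_FLAG_INIT;

/* Assumed spin limit so the sketch terminates; a real spin lock has none. */
#define MAX_SPINS 1000000UL

/* Try to take the lock, giving up after max_spins failed attempts. */
static bool spin_lock_bounded(atomic_flag *lock, unsigned long max_spins)
{
    assert(lock != NULL);
    for (unsigned long i = 0; i < max_spins; i++) {
        if (!atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
            return true;    /* lock was free and is now held by the caller */
    }
    return false;           /* lock never became free */
}

static void spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Models netpoll_queue(): the error-reporting path also needs queue_lock. */
bool netpoll_queue_sim(void)
{
    if (!spin_lock_bounded(&queue_lock, MAX_SPINS))
        return false;       /* in the kernel this would spin forever */
    spin_unlock(&queue_lock);
    return true;
}

/*
 * Models queue_process(): takes queue_lock, then optionally "faults"
 * while still holding it, which re-enters the reporting path.
 * Returns whether the reporting path managed to take the lock.
 */
bool queue_process_sim(bool fault_while_locked)
{
    bool reported;

    if (!spin_lock_bounded(&queue_lock, MAX_SPINS))
        return false;

    if (fault_while_locked) {
        /* Same context asks for a lock it already holds. */
        reported = netpoll_queue_sim();
        spin_unlock(&queue_lock);
    } else {
        spin_unlock(&queue_lock);
        /* Lock released first, so reporting can proceed. */
        reported = netpoll_queue_sim();
    }
    return reported;
}
```

In this sketch, queue_process_sim(true) returns false because the nested call can never obtain a lock its own caller holds, while queue_process_sim(false) returns true. In the kernel there is no spin limit, so the equivalent nested call never returns, which is the condition the NMI watchdog detects.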
Environment
- Red Hat Enterprise Linux 5.8
- 2.6.18-308.11.1.el5
- Netconsole