System page fault in queue_process() and recursive lock in netpoll_queue()

Solution In Progress - Updated -

Issue

  • The system dereferenced a NULL pointer in the queue_process() routine, which caused a page fault. The fault-handling code then called netconsole to report the error. Because queue_lock was already held when the page fault occurred, the attempt to acquire it again in netpoll_queue() was a recursive lock attempt. As queue_lock is a spin lock, the CPU spun in a loop it could never exit, and the NMI watchdog eventually issued an NMI that caused an outage.
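
The recursive acquisition described above can be illustrated with a minimal user-space sketch. This is not the kernel code: toy_spinlock, toy_spin_lock_bounded(), toy_netpoll_queue() and toy_queue_process_with_fault() are hypothetical names invented for illustration, and the spin is bounded so the example returns instead of hanging the way a real non-recursive spin lock would.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel spin lock; not the real spinlock_t. */
typedef struct {
    atomic_flag held;
} toy_spinlock;

static toy_spinlock queue_lock = { ATOMIC_FLAG_INIT };

/*
 * Try to take the lock, giving up after max_spins attempts.
 * A real spin lock has no such bound: it keeps spinning until the
 * holder releases the lock, which never happens if the holder is
 * the caller itself.
 */
static bool toy_spin_lock_bounded(toy_spinlock *l, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (!atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
            return true;   /* flag was clear, lock acquired */
    }
    return false;          /* still held: this would be an endless spin */
}

static void toy_spin_unlock(toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Models the error-reporting path that needs queue_lock to queue a message. */
static bool toy_netpoll_queue(void)
{
    if (!toy_spin_lock_bounded(&queue_lock, 100000UL))
        return false;      /* lock never became free */
    toy_spin_unlock(&queue_lock);
    return true;
}

/*
 * Models the faulting path: the lock is taken, a fault occurs while it
 * is held, and the fault report re-enters toy_netpoll_queue() on the
 * same execution context.
 */
static bool toy_queue_process_with_fault(void)
{
    bool got = toy_spin_lock_bounded(&queue_lock, 1);
    assert(got);                            /* first acquisition succeeds */
    bool reported = toy_netpoll_queue();    /* second acquisition cannot */
    toy_spin_unlock(&queue_lock);
    return reported;
}
```

Calling toy_queue_process_with_fault() from a main() returns false, because the inner acquisition can never succeed while the outer one is held. Calling toy_netpoll_queue() on its own, with the lock free, returns true. In the kernel there is no spin bound, so the equivalent situation spins until the NMI watchdog intervenes.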

Environment

  • Red Hat Enterprise Linux 5.8
    • 2.6.18-308.11.1.el5
  • Netconsole
