System hanged due to deadlock between CPUs.
Issue
- Our customer encountered a problem that NMI interrupt was received due to BMC watchdog timeout.
- Analyzing vmcore, it was seen that it was in deadlock status between CPU1and CPU2, and all CPUs were in hardware interrupt disabled status.
- From this situation, fork() on CPU0 also waited indefinitely for lock acquirement and then WDTThread running on CPU0 could not work and accordingly went into watchdog timeout.
Environment
- Red Hat Enterprise Linux 5.5
- kernel-2.6.18-194.el5
Subscriber exclusive content
A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.