OS rebooted due to os stall.
Issue
- Our customer encountered an unexpected reboot problem.
- From our investigation, we found a kernl panic occurred at that time and the panic was caused due to timeout of a timer monitoring by clusterpro software. The timeout occurs if cluserpro-related process, which runs the timer monitoring, stalls.
- According to our vmcore analysis, several processes were waiting I/O. Perhaps, the clusterpro-related process became unable to work and reached timeout of the timer monitoring. System processes such as kjounald and pdflush were waiting I/O?
- Process lists which ST is 'UN'
PID PPID CPU TASK ST %MEM VSZ RSS COMM
645 83 1 ffff81023dcff7e0 UN 0.0 0 0 [kjournald]
3297 83 1 ffff8101b6c8c7a0 UN 0.0 0 0 [pdflush]
3426 83 1 ffff81008875e100 UN 0.0 0 0 [pdflush]
5083 1 0 ffff81023bbe3080 UN 0.0 5904 696 syslogd
6219 6218 1 ffff810232cbe0c0 UN 0.0 23824 1780 clpevent
16845 16840 0 ffff8100850987e0 UN 0.0 3844 616 sadc
29187 29183 1 ffff810110d3a0c0 UN 0.0 29696 4628 actlog_cpuload
^^
Environment
- Red Hat Enterprise Linux 5.5
- kernel-2.6.18-194.32.1.el5
- CFQ I/O scheduler and usage of the IOPRIO_CLASS_IDLE priority
Subscriber exclusive content
A Red Hat subscription provides unlimited access to our knowledgebase of over 48,000 articles and solutions.
Welcome! Check out the Getting Started with Red Hat page for quick tours and guides for common tasks.
