OS rebooted due to an OS stall

Solution Unverified - Updated -

Issue

  • Our customer's system rebooted unexpectedly.
  • Our investigation found that a kernel panic occurred at that time. The panic was triggered by a timeout of the timer monitoring performed by the clusterpro software. The timeout occurs if the clusterpro-related process that runs the timer monitoring stalls.
  • According to our vmcore analysis, several processes were waiting for I/O. The clusterpro-related process probably became unable to run and hit the timer-monitoring timeout. Why were system processes such as kjournald and pdflush waiting for I/O?
  • Processes whose ST (state) is 'UN' (uninterruptible sleep):
     PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
      645     83   1  ffff81023dcff7e0  UN   0.0       0      0  [kjournald]
     3297     83   1  ffff8101b6c8c7a0  UN   0.0       0      0  [pdflush]
     3426     83   1  ffff81008875e100  UN   0.0       0      0  [pdflush]
     5083      1   0  ffff81023bbe3080  UN   0.0    5904    696  syslogd
     6219   6218   1  ffff810232cbe0c0  UN   0.0   23824   1780  clpevent
    16845  16840   0  ffff8100850987e0  UN   0.0    3844    616  sadc
    29187  29183   1  ffff810110d3a0c0  UN   0.0   29696   4628  actlog_cpuload
                                        ^^
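
A listing like the one above can be filtered mechanically when the vmcore holds many tasks. The following is only a minimal sketch. It assumes the text was saved from the crash utility's ps command with the nine columns shown above, and that an optional leading ">" (used for tasks active on a CPU) may precede the PID. The function name uninterruptible_tasks and the SAMPLE variable are hypothetical and do not come from the original analysis.

```python
def uninterruptible_tasks(ps_output):
    """Return (pid, command) pairs for rows whose ST column is 'UN'."""
    tasks = []
    for line in ps_output.splitlines():
        # Assumption: an optional '>' marker may precede the PID column.
        fields = line.lstrip("> ").split()
        # Skip the header, the '^^' marker line, and anything malformed.
        if len(fields) < 9 or not fields[0].isdigit():
            continue
        pid, state, comm = int(fields[0]), fields[4], " ".join(fields[8:])
        if state == "UN":
            tasks.append((pid, comm))
    return tasks


# Three rows copied from the listing above.
SAMPLE = """\
   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
   645     83   1  ffff81023dcff7e0  UN   0.0       0      0  [kjournald]
  3297     83   1  ffff8101b6c8c7a0  UN   0.0       0      0  [pdflush]
  6219   6218   1  ffff810232cbe0c0  UN   0.0   23824   1780  clpevent
"""

if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks(SAMPLE):
        print(pid, comm)
```

A filter like this only shortens the list. The cause of each stall still has to be read from the per-task backtraces in the vmcore.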

Environment

  • Red Hat Enterprise Linux 5.5
  • kernel-2.6.18-194.32.1.el5
  • CFQ I/O scheduler and usage of the IOPRIO_CLASS_IDLE priority
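
To check whether a system matches the I/O scheduler condition above, the active scheduler of a block device can be read from sysfs, where the kernel marks the scheduler in use with square brackets. The sketch below is illustrative only: the device name "sda" is a placeholder, and the helper names active_scheduler and read_active_scheduler are hypothetical. The I/O priority class of an individual process is a separate setting and can be inspected with ionice -p <pid>.

```python
import re


def active_scheduler(scheduler_text):
    """Return the bracketed scheduler name from the sysfs text, or None."""
    match = re.search(r"\[([^\]]+)\]", scheduler_text)
    return match.group(1) if match else None


def read_active_scheduler(device="sda"):
    """Read /sys/block/<device>/queue/scheduler; 'sda' is a placeholder."""
    path = "/sys/block/%s/queue/scheduler" % device
    with open(path) as handle:
        return active_scheduler(handle.read())


if __name__ == "__main__":
    # Typical file content on a kernel of this generation (assumed example).
    print(active_scheduler("noop anticipatory deadline [cfq]\n"))
```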
