Unexpected reboot

Hi there!

I am facing a critical issue with my most important customer and am seeking your help to investigate it.

My client is running RHEL 7 on Lenovo SR630 servers and is experiencing unexpected reboots (it has happened on all 3 of them).

I can't find much in /var/log/messages from the time of the reboots, except the logs from the boot itself.

I guess it could be related to:
1) HW issue
2) overheating
3) SW issue (I guess something in the kernel like DPDK)

Could you tell me how to investigate those? Which logs can I look at for each option?

I've attached the /var/log/messages from when this happened (see the yellow highlight) and the rsyslog conf. Maybe we can modify this conf to get more logs?



Hello Support Qosmos,

I hope you have already gone through similar articles that give a systematic approach to such cases. If not, please check this article: https://access.redhat.com/articles/206873
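
As a rough first pass, it can help to pull out the lines logged just before each boot in /var/log/messages. If the log shows shutdown messages there, the reboot was probably requested; if it simply stops, a crash, power loss or hardware reset is more likely. Below is a minimal Python sketch of that idea. It assumes each boot is marked by the kernel's "Linux version" banner; the helper name lines_before_boots and the context size are made up for illustration.

```python
from collections import deque

# Assumption: the kernel logs its "Linux version" banner once per boot.
BOOT_MARKER = "kernel: Linux version"


def lines_before_boots(lines, context=5, marker=BOOT_MARKER):
    """Return, for each boot marker found, the `context` lines logged before it."""
    recent = deque(maxlen=context)  # sliding window of the most recent lines
    results = []
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:
            # Snapshot what was logged just before this boot, then start over.
            results.append(list(recent))
            recent.clear()
        else:
            recent.append(line)
    return results
```

It takes any iterable of lines, so an open file handle on /var/log/messages (usually readable only by root) can be passed directly. Some early kernel lines may be logged before the banner, so the window can contain a few lines from the new boot as well.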

If you suspect a hardware issue, it is better to contact the hardware vendor for further help. Hardware vendors sometimes release firmware updates that address such problems.

It is good to configure kdump if it is not already configured, so that a core dump is captured if the kernel crashes, and then analyze that core dump. You can refer to this page for how to configure it: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/kernel_administration_guide/kernel_crash_dump_guide
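
To quickly see whether kdump has memory reserved and whether any dumps have already been captured, something like the following Python sketch may help. It assumes the default dump location /var/crash (the path can be changed in /etc/kdump.conf), and both function names are hypothetical helpers, not part of any Red Hat tool.

```python
import os


def crashkernel_setting(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


def list_crash_dumps(crash_dir="/var/crash"):
    """Return the sorted sub-directories of crash_dir (one per captured dump)."""
    # Assumption: dumps are stored under the default /var/crash location.
    if not os.path.isdir(crash_dir):
        return []
    return sorted(
        name
        for name in os.listdir(crash_dir)
        if os.path.isdir(os.path.join(crash_dir, name))
    )
```

Passing the contents of /proc/cmdline to crashkernel_setting shows what the running kernel was booted with. A result of None means no memory is reserved for the crash kernel, so kdump cannot capture a dump. An empty list from list_crash_dumps after an unexpected reboot suggests either that kdump was not active or that the reboot was not a kernel panic.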

Hope this helps!