Server Hung

The server is hung; we are not able to run df -h.

Responses

Hi Sarath,

Shut down the server, restart it, and check whether the problem is gone. If it is, good. If not, reboot into recovery mode and run xfs_repair. If that doesn't solve your issue either, boot from a GParted USB drive and check whether a partition is short of free space. If that's the case, remove some files and check again. As I don't know what happened before, these are only suggestions. :)

Regards,
Christian

It is an NFS file system on which we are not able to run df -h. A reboot will fix the issue, but we need to resolve it without one, as it is a clustered server.
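Since df -h queries every mounted file system, a single unresponsive NFS mount can block the whole command. A minimal sketch for probing one mount point at a time with a deadline, assuming Python 3 is available on the host; the function name mount_responds and the 5-second default are hypothetical choices, not anything from this thread:

```python
import subprocess
import sys


def mount_responds(path, timeout_s=5.0):
    """Return True if statvfs(path) succeeds within timeout_s seconds."""
    # Run the statvfs call in a child process so that a hang on a dead
    # network mount blocks the child, not this script.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.statvfs(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # A child stuck in uninterruptible sleep may ignore the kill,
        # so we send it and deliberately do not wait for the child.
        child.kill()
        return False


if __name__ == "__main__":
    for mount_point in sys.argv[1:]:
        state = "ok" if mount_responds(mount_point) else "NOT RESPONDING"
        print(mount_point, state)
```

Run it with the suspect mount points as arguments. It only identifies which mount is stuck; it does not recover the mount.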

Well Sarath, you have two options - either live with the problem, or plan a scheduled server downtime ... :)

Regards,
Christian

Hi Sarath,

Your one-line description does not give anyone enough to go on to suggest solutions to the issue. Christian has already provided some general hints and suggestions.

Please provide as much detail as possible, such as:

- When did the issue start?
- Were any changes made to the system before the issue started?
- Which release, kernel version, and NFS version are in use on the server side and on the client side?
- Is the hang being noticed on the NFS server side or the client side? (From your brief description, I presume it is the NFS server.)
- Have you checked /var/log/messages for errors, warnings, failures, or NFS-related notifications?
- Have you done any investigation in an effort to fix this issue? Please describe and elaborate.
- etc.

Those are some of the questions you would need to answer; please also include any other observations you've made. Also, keep in mind that this is community-based support. If this is a critical system and you have a contract with Red Hat, then log a case so that the issue gets addressed on priority.
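The kernel, release, and network mount details asked about above can be collected in one go. A minimal sketch, assuming Python 3 on a Linux host; the helper names network_mounts and read_text_or_note are hypothetical, and reading /proc/mounts is used here because it lists mounts without querying the remote server:

```python
import platform

# File system types treated as "network mounts" in this sketch (an assumption).
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs"}


def network_mounts(mounts_text):
    """Return (device, mount_point, fs_type) for NFS/CIFS lines in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NETWORK_FS_TYPES:
            found.append((fields[0], fields[1], fields[2]))
    return found


def read_text_or_note(path):
    """Read a small text file, or return a note if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return "(not readable: %s)" % path


if __name__ == "__main__":
    print("kernel:", platform.release())
    print("release:", read_text_or_note("/etc/redhat-release"))
    for device, mount_point, fs_type in network_mounts(read_text_or_note("/proc/mounts")):
        print(fs_type, device, "on", mount_point)
```

Pasting this output into the thread, together with the relevant lines from /var/log/messages, answers several of the questions at once.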

All the best!

Hi Sadashiva,

Yes, you are right ... providing more details would be "not a bad idea", Sarath ! :) You may want to read a related post from my friend and Red Hat Accelerators buddy, RJ Hinton: Posting tips for the Red Hat Discussion Area

Regards,
Christian

Feb 10 03:31:53 gv1hqpdb77i Server_Administrator: 7530 2095 - Storage Service Unexpected sense. SCSI sense data: Sense key: 3 Sense code: 11 Sense qualifier: 0: Physical Disk 0:1:24 Controller 0, Connector 0
Feb 10 17:35:01 gv1hqpdb77i kernel: CIFS VFS: Server gv1hqpfs02 has not responded in 120 seconds. Reconnecting...

This has caused a kernel panic in our servers.

Feb 10 17:35:01 gv1hqpdb77i kernel: CIFS VFS: Server gv1hqpfs02 has not responded in 120 seconds. Reconnecting...
Feb 10 17:38:12 gv1hqpdb77i kernel: INFO: task vertica:17580 blocked for more than 120 seconds.
Feb 10 17:38:12 gv1hqpdb77i kernel: Not tainted 2.6.32-754.27.1.el6.x86_64 #1
Feb 10 17:38:12 gv1hqpdb77i kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 10 17:38:12 gv1hqpdb77i kernel: vertica D 0000000000000009 0 17580 1 0x00000080

Feb 10 17:38:12: Due to the kernel panic, the system has put all the running processes in Dead state / Uninterruptible state, and that includes the Zabbix agent and the BES Client. This is why we couldn't stop the Zabbix agent when we tried to.

Feb 10 17:38:12 gv1hqpdb77i kernel: [] user_statfs+0x47/0xb0
Feb 10 17:38:12 gv1hqpdb77i kernel: [] sys_statfs+0x2a/0x50
Feb 10 17:48:12 gv1hqpdb77i kernel: INFO: task zabbix_agentd:391170 blocked for more than 120 seconds.
Feb 10 17:48:12 gv1hqpdb77i kernel: zabbix_agentd D 000000000000000e 0 391170 6945 0x00000084

When we checked the Feb 6 system logs, from when a similar issue occurred on some of the Vertica servers, the symptoms were the same.

Feb 6 03:39:01 gv1hqpdb77h freshclam[418844]: bytecode.cld is up to date (version: 331, sigs: 94, f-level: 63, builder: anvilleg)
Feb 6 06:23:32 gv1hqpdb77h kernel: CIFS VFS: Server gv1hqpcifs12 has not responded in 120 seconds. Reconnecting...
Feb 6 06:26:22 gv1hqpdb77h kernel: INFO: task vertica:30761 blocked for more than 120 seconds.
Feb 6 06:26:22 gv1hqpdb77h kernel: Not tainted 2.6.32-754.25.1.el6.x86_64 #1
Feb 6 06:26:22 gv1hqpdb77h kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
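To compare incidents across hosts, the two recurring message types in these logs (the CIFS server timeout and the hung-task warning) can be pulled out of /var/log/messages with a short script. This is only a sketch: the regular expressions are written against the message wording quoted in this thread, and the function name summarize is hypothetical.

```python
import re

# Patterns modelled on the kernel messages quoted in this thread.
HUNG_TASK = re.compile(
    r"^\w{3}\s+\d+ \d{2}:\d{2}:\d{2} (?P<host>\S+) kernel: "
    r"INFO: task (?P<task>.+?):(?P<pid>\d+) blocked for more than \d+ seconds"
)
CIFS_TIMEOUT = re.compile(
    r"CIFS VFS: Server (?P<server>\S+) has not responded in \d+ seconds"
)


def summarize(log_text):
    """Return (hung_tasks, silent_servers) found in syslog-style text."""
    hung_tasks = []       # (host, task name, pid), in order of appearance
    silent_servers = []   # CIFS servers reported as not responding, deduplicated
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = HUNG_TASK.match(line)
        if match:
            hung_tasks.append((match.group("host"), match.group("task"), int(match.group("pid"))))
            continue
        match = CIFS_TIMEOUT.search(line)
        if match and match.group("server") not in silent_servers:
            silent_servers.append(match.group("server"))
    return hung_tasks, silent_servers


if __name__ == "__main__":
    with open("/var/log/messages") as handle:
        tasks, servers = summarize(handle.read())
    print("unresponsive CIFS servers:", servers)
    for host, task, pid in tasks:
        print("hung task on %s: %s (pid %d)" % (host, task, pid))
```

If the same CIFS server shows up shortly before each batch of hung tasks, that points at the share rather than at the local disks.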

Hi Sarath,

Thanks for providing some further information ... I'm gonna stick with my recommendation to reboot the server. :)

Regards,
Christian

Hi Sarath,

Forgot to mention : You are still using RHEL 6 which will reach EOL soon, I suggest a migration to RHEL 7 or 8 ... :)

Regards,
Christian