SUMMARY and ENQUIRY: Best and simplest way to detect hung rsyslog
Hi,
Recently, while running some special tests, I found that rsyslog was hung. It appeared to be fully operational, but it was not:
# systemctl status rsyslog -l
rsyslog.service - System Logging Service
   Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2018-11-09 07:43:54 AEDT; 10 months 26 days ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 104448 (rsyslogd)
   CGroup: /system.slice/rsyslog.service
           104448 /usr/sbin/rsyslogd -n
No logs were being updated and the syslog file was empty:
# cat /var/log/messages
# ls -als /var/log/messages
0 -rw----r--. 1 root root 0 Nov 25 2018 /var/log/messages
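The stale timestamp above could be checked automatically. The following is only a minimal sketch: the function name log_is_stale, the default path and the 60-minute threshold are my own assumptions, and a quiet host may legitimately go longer without writing anything.

```shell
#!/bin/sh
# Sketch: report whether a log file has not been modified recently.
# log_is_stale, the default path and the default threshold are illustrative.

log_is_stale() {
    logfile=${1:-/var/log/messages}
    max_age_min=${2:-60}

    # A missing file is treated as stale.
    [ -e "$logfile" ] || return 0

    # find prints the path only when the last modification is more than
    # max_age_min minutes old, so non-empty output means "stale".
    [ -n "$(find "$logfile" -mmin "+$max_age_min" 2>/dev/null)" ]
}
```

It could be called as: log_is_stale /var/log/messages 60 && echo "syslog file looks stale"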
The log file was not held open by rsyslog:
# lsof /var/log/messages
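The same lsof check can be turned into a yes/no test. This is a sketch under assumptions: the function name rsyslog_has_open is made up for illustration, and it relies on lsof printing a header line followed by one line per open descriptor, with the command name in the first column.

```shell
#!/bin/sh
# Sketch: succeed only if a process named rsyslogd has the log file open.
# rsyslog_has_open and the default path are illustrative.

rsyslog_has_open() {
    logfile=${1:-/var/log/messages}

    # Skip the header line, then look for rsyslogd in the COMMAND column.
    # awk exits 0 when a match was seen and 1 otherwise.
    lsof "$logfile" 2>/dev/null |
        awk 'NR > 1 && $1 == "rsyslogd" { found = 1 } END { exit !found }'
}
```

It could be called as: rsyslog_has_open /var/log/messages || echo "rsyslogd does not have the log file open"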
An attempt to generate a test syslog message failed:
# logger Mytest
# ls -als /var/log/messages
0 -rw----r--. 1 root root 0 Nov 25 2018 /var/log/messages
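The logger test above can be scripted as a canary: send a unique marker and confirm that it reaches the file. This is only a sketch. The function name check_syslog_canary, the marker format and the 5-second wait are assumptions, and it presumes that messages sent by logger are routed to the file being checked.

```shell
#!/bin/sh
# Sketch: end-to-end canary test for the local syslog path.
# check_syslog_canary, the marker format and the defaults are illustrative.
# Returns 0 if the marker arrived, 1 if it did not, 2 if logger itself failed.

check_syslog_canary() {
    logfile=${1:-/var/log/messages}
    wait_secs=${2:-5}
    marker="rsyslog-canary-$(date +%s)-$$"

    # Send the marker through the normal syslog path.
    logger "$marker" || return 2

    # Poll the log file once per second until the marker shows up
    # or the wait time runs out.
    i=0
    while [ "$i" -lt "$wait_secs" ]; do
        if grep -qF "$marker" "$logfile" 2>/dev/null; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}
```

It could be run from cron or a monitoring agent as: check_syslog_canary /var/log/messages 5 || echo "rsyslog appears hung"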
I also ran tcpdump on the network interface to watch for any syslog (UDP/514) traffic:
# tcpdump -i eth0 -A udp and port 514
The resolution was to restart the rsyslog daemon:
# systemctl restart rsyslog
… and test again:
# logger Mytest
# ls -als /var/log/messages
12 -rw----r--. 1 root root 10289 Nov 25 12:27 /var/log/messages
# lsof /var/log/messages
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 71139 root 16w REG 253,4 715686 259 /var/log/messages
I am trying to figure out whether there is any other trick that IT Operations could use to detect a hung or unresponsive rsyslog.
Regards,
Dusan Baljevic (amateur radio VK2COT)