SUMMARY and ENQUIRY: Best and simplest way to detect hung rsyslog

Hi,

Recently, while running some special tests, I found that rsyslog was hung. It appeared to be fully operational, but it was not:

# systemctl status rsyslog -l
   rsyslog.service - System Logging Service
   Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2018-11-09 07:43:54 AEDT; 10 months 26 days ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
Main PID: 104448 (rsyslogd)
   CGroup: /system.slice/rsyslog.service
           104448 /usr/sbin/rsyslogd -n

No logs were being updated, and the syslog files were empty:

# cat /var/log/messages

# ls -als /var/log/messages
0 -rw----r--. 1 root root 0 Nov 25  2018 /var/log/messages

The log file was not held open by rsyslog (lsof returned nothing):

# lsof /var/log/messages

An attempt to generate a test syslog message failed; the file stayed empty:

# logger Mytest

# ls -als /var/log/messages
0 -rw----r--. 1 root root 0 Nov 25  2018 /var/log/messages

I also ran tcpdump on the network interface to watch for any syslog (UDP/514) traffic:

# tcpdump -i eth0 -A udp and port 514

The resolution was to restart the rsyslog daemon:

# systemctl restart rsyslog

… and test again:

# logger Mytest

# ls -als /var/log/messages
12 -rw----r--. 1 root root 10289 Nov 25 12:27 /var/log/messages

# lsof /var/log/messages
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 71139 root   16w   REG  253,4   715686  259 /var/log/messages
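
The manual logger-then-check test above could be scripted as an end-to-end probe. What follows is only a minimal sketch, not a tested monitoring solution. The function names (check_marker, probe_rsyslog), the LOGGER_CMD override and the 5-second default wait are all hypothetical, and it assumes that messages sent by logger are routed to the file being checked, as /var/log/messages is on this host.

```shell
#!/bin/sh
# Sketch of an end-to-end rsyslog probe. All names here are
# illustrative assumptions, not part of rsyslog or systemd.

# check_marker FILE MARKER
# Succeeds if FILE is readable and contains MARKER as a fixed string.
check_marker() {
    [ -r "$1" ] && grep -qF -- "$2" "$1"
}

# probe_rsyslog FILE [WAIT_SECONDS]
# Sends a unique marker through logger, then polls FILE for it.
# Returns 0 if the marker arrived, 1 if it did not arrive in time,
# 2 if the logger command itself failed.
probe_rsyslog() {
    logfile=$1
    wait_secs=${2:-5}
    marker="rsyslog-probe-$(date +%s)-$$"

    # LOGGER_CMD is a hypothetical override, useful for dry runs.
    ${LOGGER_CMD:-logger} "$marker" || return 2

    i=0
    while [ "$i" -lt "$wait_secs" ]; do
        check_marker "$logfile" "$marker" && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}
```

It could be run from cron or a monitoring agent, for example: probe_rsyslog /var/log/messages 5 || echo "rsyslog appears hung". Unlike systemctl status, this checks that a message actually travels from logger into the file, which is the step that failed here while the service still showed as active (running).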

I am trying to find out whether there is any other trick that IT Operations could use to detect a hung or unresponsive rsyslog.

Regards,

Dusan Baljevic (amateur radio VK2COT)

Responses