How to distinguish between a crash and a graceful reboot in RHEL 7


How can you distinguish between a system crash and a graceful reboot or shutdown in RHEL 7? This article outlines 4 approaches:

  1. Inspect wtmp with last -x
  2. Inspect auditd logs with ausearch
  3. Create a custom service unit
  4. Inspect previous boots with journalctl

(1) Inspect wtmp with last -x

Running last -n2 -x shutdown reboot prints the two most recent shutdown or reboot records from the system wtmp file. A reboot record marks the system booting up, whereas a shutdown record marks the system going down. Because last lists the newest record first, a graceful shutdown shows up as a reboot line with a shutdown line directly below it, as in the following example:

~]# last -n2 -x shutdown reboot
reboot   system boot  3.10.0-327.el7.x Tue Sep 20 01:22 - 01:22  (00:00)    
shutdown system down  3.10.0-327.el7.x Tue Sep 20 01:21 - 01:21  (00:00)    

In contrast, an ungraceful shutdown can be inferred from the absence of a shutdown record; instead there are 2 reboot lines in a row, as in this example:

~]# last -n2 -x shutdown reboot
reboot   system boot  3.10.0-327.el7.x Tue Sep 20 01:11 - 01:20  (00:08)    
reboot   system boot  3.10.0-327.el7.x Tue Sep 20 01:10 - 01:20  (00:09)    
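
The comparison above can be scripted. The following is a minimal sketch rather than part of the original procedure: classify_last_shutdown is a hypothetical helper name, and it assumes that last prints records newest first, as in the examples above.

```shell
# Hypothetical helper: reads `last -x shutdown reboot` output on stdin
# (newest record first) and prints graceful, ungraceful, or unknown.
classify_last_shutdown() {
    awk '
        # Keep only reboot/shutdown records; skip blank and "wtmp begins" lines.
        $1 == "reboot" || $1 == "shutdown" { rec[++n] = $1 }
        END {
            if (n >= 2 && rec[1] == "reboot" && rec[2] == "shutdown")
                print "graceful"
            else if (n >= 2 && rec[1] == "reboot" && rec[2] == "reboot")
                print "ungraceful"
            else
                print "unknown"
        }'
}
```

It could be used as: last -n2 -x shutdown reboot | classify_last_shutdown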

(2) Inspect auditd logs with ausearch

auditd logs many kinds of events; the full list of event types can be seen by checking ausearch -m. Relevant to the problem at hand, it logs system shutdown and system boot events, much like wtmp above. The command ausearch -i -m system_boot,system_shutdown | tail -4 reports the 2 most recent shutdowns or boots (each event takes two lines: a ---- separator and the record itself). If this reports a SYSTEM_SHUTDOWN followed by a SYSTEM_BOOT, the last shutdown was graceful; however, if it reports 2 SYSTEM_BOOT lines in a row, then the system did not shut down gracefully, as in the following example:

~]# ausearch -i -m system_boot,system_shutdown | tail -4
----
type=SYSTEM_BOOT msg=audit(09/20/2016 01:10:32.392:7) : pid=657 uid=root auid=unset ses=unset subj=system_u:system_r:init_t:s0 msg=' comm=systemd-update-utmp exe=/usr/lib/systemd/systemd-update-utmp hostname=? addr=? terminal=? res=success' 
----
type=SYSTEM_BOOT msg=audit(09/20/2016 01:11:41.134:7) : pid=656 uid=root auid=unset ses=unset subj=system_u:system_r:init_t:s0 msg=' comm=systemd-update-utmp exe=/usr/lib/systemd/systemd-update-utmp hostname=? addr=? terminal=? res=success' 
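
The same check can be sketched as a small filter. This is an illustration only: classify_audit_shutdown is a hypothetical helper name, and it assumes ausearch prints events oldest first with each record starting with type=, as in the example above.

```shell
# Hypothetical helper: reads `ausearch -i -m system_boot,system_shutdown`
# output on stdin (oldest event first) and compares the last two events.
classify_audit_shutdown() {
    awk '
        /^type=SYSTEM_BOOT /     { prev = cur; cur = "boot" }
        /^type=SYSTEM_SHUTDOWN / { prev = cur; cur = "shutdown" }
        END {
            if (cur == "boot" && prev == "shutdown")  print "graceful"
            else if (cur == "boot" && prev == "boot") print "ungraceful"
            else                                      print "unknown"
        }'
}
```

It could be used as: ausearch -i -m system_boot,system_shutdown | classify_audit_shutdown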

(3) Create a custom service unit

This approach allows for complete control over how a graceful shutdown is recorded. Here is an example of how to do it.

  1. Create a service that runs only at shutdown
    (Optionally customize the service name and the graceful_shutdown file)

    ~]# cat /etc/systemd/system/set_gracefulshutdown.service
    [Unit]
    Description=Set flag for graceful shutdown
    DefaultDependencies=no
    RefuseManualStart=true
    Before=shutdown.target
    
    [Service]
    Type=oneshot
    ExecStart=/bin/touch /root/graceful_shutdown
    
    [Install]
    WantedBy=shutdown.target
    
    ~]# systemctl daemon-reload
    ~]# systemctl enable set_gracefulshutdown
    
  2. Create a service that runs only at startup and only if the graceful_shutdown file created by the above service exists
    (Optionally customize the service name and ensure the graceful_shutdown file matches the above service)

    ~]# cat /etc/systemd/system/check_graceful.service
    [Unit]
    Description=Check if system booted after a graceful shutdown
    ConditionPathExists=/root/graceful_shutdown
    RefuseManualStart=true
    RefuseManualStop=true
    
    [Service]
    Type=oneshot
    RemainAfterExit=true
    ExecStart=/bin/rm /root/graceful_shutdown
    
    [Install]
    WantedBy=multi-user.target
    
    ~]# systemctl daemon-reload
    ~]# systemctl enable check_graceful
    
  3. Any time after a graceful reboot, systemctl is-active check_graceful confirms that the previous shutdown was graceful.
    Example output:

    ~]# systemctl is-active check_graceful && echo GOOD || echo BAD
    active
    GOOD
    
    ~]# systemctl status check_graceful
    ● check_graceful.service - Check if system booted after a graceful shutdown
       Loaded: loaded (/etc/systemd/system/check_graceful.service; enabled; vendor preset: disabled)
       Active: active (exited) since Tue 2016-09-20 01:10:32 EDT; 20s ago
      Process: 669 ExecStart=/bin/rm /root/graceful_shutdown (code=exited, status=0/SUCCESS)
     Main PID: 669 (code=exited, status=0/SUCCESS)
       CGroup: /system.slice/check_graceful.service
    
    Sep 20 01:10:32 a72.example.com systemd[1]: Starting Check if system booted after a graceful shutdown...
    Sep 20 01:10:32 a72.example.com systemd[1]: Started Check if system booted after a graceful shutdown.
    
  4. After a crash or otherwise ungraceful shutdown, the following would be seen:

    ~]# systemctl is-active check_graceful && echo GOOD || echo BAD
    inactive
    BAD
    
    ~]# systemctl status check_graceful
    ● check_graceful.service - Check if system booted after a graceful shutdown
       Loaded: loaded (/etc/systemd/system/check_graceful.service; enabled; vendor preset: disabled)
       Active: inactive (dead)
    Condition: start condition failed at Tue 2016-09-20 01:11:41 EDT; 16s ago
               ConditionPathExists=/root/graceful_shutdown was not met
    
    Sep 20 01:11:41 a72.example.com systemd[1]: Started Check if system booted after a graceful shutdown.
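
The flag-file logic behind the two units can be tried out without rebooting. The sketch below is only an illustration: mark_graceful_shutdown and check_previous_shutdown are hypothetical function names, and the flag path is passed in as an argument instead of the /root/graceful_shutdown path used by the units.

```shell
# Hypothetical stand-in for set_gracefulshutdown.service:
# leave a flag file behind during an orderly shutdown.
mark_graceful_shutdown() {
    touch "$1"
}

# Hypothetical stand-in for check_graceful.service:
# at boot, consume the flag if it exists and report the result.
check_previous_shutdown() {
    if [ -e "$1" ]; then
        rm -f "$1"          # remove the flag so the next boot starts clean
        echo graceful
    else
        echo ungraceful     # no flag: the shutdown-time step never ran
    fi
}
```

Calling check_previous_shutdown twice in a row on the same path mimics a crash: the second call finds no flag, just as check_graceful.service finds none when the shutdown-time unit never ran.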
    

(4) Inspect previous boots with journalctl

  1. Configure systemd-journald to keep a persistent journal on disk

    ~]# mkdir /var/log/journal
    ~]# systemctl -s SIGUSR1 kill systemd-journald
    ~]# reboot
    
  2. Use journalctl -b -1 -n to look at the last few (10 by default) lines of the previous boot
    (Note that -b -2 is the boot before that, etc.)
    The following example output shows that the previous system reboot was graceful

    ~]# journalctl -b -1 -n
    -- Logs begin at Tue 2016-09-20 01:01:15 EDT, end at Tue 2016-09-20 01:21:33 EDT. --
    Sep 20 01:21:19 a72.example.com systemd[1]: Stopped Create Static Device Nodes in /dev.
    Sep 20 01:21:19 a72.example.com systemd[1]: Stopping Create Static Device Nodes in /dev...
    Sep 20 01:21:19 a72.example.com systemd[1]: Reached target Shutdown.
    Sep 20 01:21:19 a72.example.com systemd[1]: Starting Shutdown.
    Sep 20 01:21:19 a72.example.com systemd[1]: Reached target Final Step.
    Sep 20 01:21:19 a72.example.com systemd[1]: Starting Final Step.
    Sep 20 01:21:19 a72.example.com systemd[1]: Starting Reboot...
    Sep 20 01:21:19 a72.example.com systemd[1]: Shutting down.
    Sep 20 01:21:19 a72.example.com systemd-shutdown[1]: Sending SIGTERM to remaining processes...
    Sep 20 01:21:19 a72.example.com systemd-journal[483]: Journal stopped
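
Checking the tail of the previous boot can also be sketched as a filter. This is a rough heuristic, not an official check: journal_tail_is_graceful is a hypothetical helper name, and it assumes that the systemd[1] messages "Reached target Shutdown." or "Shutting down." (both seen in the example above) appear only during an orderly shutdown.

```shell
# Hypothetical helper: reads the last lines of the previous boot's journal
# on stdin and looks for the shutdown messages shown in the example above.
journal_tail_is_graceful() {
    if grep -Eq 'systemd\[1\]: (Reached target Shutdown|Shutting down)\.'; then
        echo graceful
    else
        # No shutdown messages in the tail: treat the previous boot as suspect.
        echo suspect
    fi
}
```

It could be used as: journalctl -b -1 -n | journal_tail_is_graceful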
    

Note from the author: In my experience, this approach is not perfect. When something goes badly wrong, I have seen journald's boot indexing break to the point where the journalctl -b -1 command only returns an error.
