systemd services are killed by watchdog when resuming from suspend/s2idle
Environment
- Red Hat Enterprise Linux 8
- Systems that use Suspend to Idle (s2idle, S0ix, "Modern Standby")
Issue
- systemd services are killed by watchdog when resuming from suspend/s2idle
- Logs show systemd services failing with the following message:
Failed with result 'watchdog'.
Resolution
Until this issue is resolved, Red Hat recommends switching the system's default sleep mode from the newer "Suspend to Idle" (s2idle) method to the older "S3" (deep) method by passing the following parameter on the kernel command line: mem_sleep_default=deep
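To confirm which sleep mode is active before and after the change, the kernel lists the available modes in /sys/power/mem_sleep and marks the active one with square brackets (for example, "s2idle [deep]"). The following is a minimal Python sketch for reading that file, not a Red Hat tool; the function name active_sleep_mode is hypothetical.

```python
from pathlib import Path
from typing import Optional

MEM_SLEEP_PATH = Path("/sys/power/mem_sleep")


def active_sleep_mode(text: str) -> Optional[str]:
    """Return the bracketed (active) mode from mem_sleep contents.

    Example: "s2idle [deep]" -> "deep". Returns None if nothing is bracketed.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    if MEM_SLEEP_PATH.exists():
        print("active sleep mode:", active_sleep_mode(MEM_SLEEP_PATH.read_text()))
    else:
        print(f"{MEM_SLEEP_PATH} not found on this system")
```

If "deep" does not appear in the file at all, the platform most likely does not offer S3, and the parameter will have no effect. On Red Hat Enterprise Linux 8, one common way to add the parameter persistently is grubby --update-kernel=ALL --args="mem_sleep_default=deep", followed by a reboot.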
Root Cause
With the newer Suspend to Idle sleep method, many interrupts can wake the kernel without requiring the system to fully wake from suspend. When this happens, the kernel's MONOTONIC clock (CLOCK_MONOTONIC) starts running again. Upon wake, if systemd sees that the clock has advanced beyond the 3-minute watchdog timeout, it kills and restarts services that have the watchdog enabled.
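The timing check described above can be illustrated with a simplified model. This is a sketch only, not systemd's actual implementation: the function name watchdog_expired and the sample timestamps are hypothetical, and the 180-second limit is taken from the 3-minute timeout mentioned in this article.

```python
WATCHDOG_LIMIT_SEC = 180.0  # the 3-minute watchdog timeout from this article


def watchdog_expired(last_ping: float, now: float,
                     limit: float = WATCHDOG_LIMIT_SEC) -> bool:
    """True if more monotonic time than `limit` has passed since the last keep-alive."""
    return (now - last_ping) > limit


# Hypothetical timeline, in seconds on the monotonic clock.
last_ping = 1000.0      # service sent its last keep-alive just before suspend
clock_advance = 600.0   # assumed: clock kept advancing while the system looked suspended
now_on_wake = last_ping + clock_advance

print(watchdog_expired(last_ping, last_ping + 30.0))  # normal operation: False
print(watchdog_expired(last_ping, now_on_wake))       # after resume: True
```

The service had no chance to send a keep-alive during the interval, so from the watchdog's point of view it looks hung even though it was simply not running.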
Diagnostic Steps
The symptoms can surface in a couple of different ways.
Sometimes a service is killed by the watchdog; other times there is a coredump for a systemd service. For example:
kernel: PM: suspend entry (s2idle)
...
systemd[1]: systemd-journald.service: Main process exited, code=killed, status=6/ABRT
systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
systemd[1]: systemd-journald.service: Killing process 840 (systemd-journal) with signal SIGABRT.
In most instances, an earlier error in the log shows some other device failing to suspend or enter a low-power mode correctly. That is a broad topic with too many possibilities to document here.
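When reviewing a large log, it can help to list every unit that reported the watchdog failure. The following Python sketch scans log lines for the "Failed with result 'watchdog'." message shown in this article. The function name units_failed_by_watchdog is hypothetical, and the pattern assumes the same message layout as the excerpt above.

```python
import re
from typing import Iterable, List

# Matches e.g. "systemd[1]: systemd-journald.service: Failed with result 'watchdog'."
WATCHDOG_FAILURE = re.compile(
    r"systemd\[1\]: (?P<unit>\S+): Failed with result 'watchdog'\."
)


def units_failed_by_watchdog(lines: Iterable[str]) -> List[str]:
    """Return unit names that failed with result 'watchdog', in first-seen order."""
    seen: List[str] = []
    for line in lines:
        match = WATCHDOG_FAILURE.search(line)
        if match and match.group("unit") not in seen:
            seen.append(match.group("unit"))
    return seen


# Sample lines copied from the log excerpt in this article.
sample = [
    "kernel: PM: suspend entry (s2idle)",
    "systemd[1]: systemd-journald.service: Main process exited, code=killed, status=6/ABRT",
    "systemd[1]: systemd-journald.service: Failed with result 'watchdog'.",
]
print(units_failed_by_watchdog(sample))  # ['systemd-journald.service']
```

The same function can be run over a saved journal, for example the output of journalctl -b written to a file and read line by line.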
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.