System falls into emergency mode because initrd-switch-root.service entered a failed state.

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux (RHEL) 7.8 (after upgrade from a previous version)
  • systemd

Issue

  • The following service fails at boot time, causing the system to fall into emergency mode.

    # systemctl status initrd-switch-root.service
    ● initrd-switch-root.service - Switch Root
       Loaded: loaded (/usr/lib/systemd/system/initrd-switch-root.service; static; vendor preset: disabled)
       Active: failed (Result: signal) since Fri 2020-04-17 14:36:17 CEST; 5min ago
      Process: 502 ExecStart=/usr/bin/systemctl --no-block --force switch-root /sysroot (code=killed, signal=TERM)
     Main PID: 502 (code=killed, signal=TERM)
    
  • After an upgrade from RHEL 7.x to 7.8, the machine halts at the emergency prompt.
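
The failure signature shown above can be checked mechanically. The following is a minimal sketch, not an official tool: `matches_switch_root_failure` is a hypothetical helper name, and it only looks for the two strings visible in the sample output (the failed state with `Result: signal`, and a main process killed by `SIGTERM`).

```shell
# Hypothetical helper (not part of systemd): reads `systemctl status` output
# on stdin and succeeds only if it shows the failure signature from this article.
matches_switch_root_failure() {
    status_text=$(cat)
    # The unit must be in the failed state with Result: signal ...
    printf '%s\n' "$status_text" | grep -q 'Active: failed (Result: signal)' || return 1
    # ... and its main process must have been killed by SIGTERM.
    printf '%s\n' "$status_text" | grep -q 'Main PID: .*(code=killed, signal=TERM)' || return 1
    return 0
}
```

At the emergency prompt it could be used as `systemctl status initrd-switch-root.service | matches_switch_root_failure && echo "signature matches"`. A match alone does not confirm the root cause; the Diagnostic Steps section is still needed.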

Resolution

Follow the procedure in the Diagnostic Steps section. If the results match, proceed with the steps below.

Rebuild all the initramfs images. This can be done either on the live system at the emergency prompt, or from the Rescue DVD inside the chroot.

Case where the system is booted using the DVD in Troubleshooting mode

  1. Enter the chroot

    # chroot /mnt/sysimage
    
  2. Back up all the initramfs images to persistent storage, e.g. in /root

    # cp /boot/initramfs-*.x86_64.img /root
    
  3. Recreate all the initramfs images

    # dracut --force --regenerate-all
    

    Note: rebuilding all initramfs images is recommended; otherwise you may hit the issue when booting an older kernel.

  4. Exit the chroot

    # exit
    
  5. Reboot from the disk by exiting Troubleshooting mode

    # exit
    

Case where the system is booted from the disk and the emergency prompt was entered

  1. Back up all the initramfs images to persistent storage, e.g. in /root

    # cp /boot/initramfs-*.x86_64.img /root
    
  2. Recreate the initramfs images using the following command

    # dracut --force --regenerate-all
    

    Note: rebuilding all initramfs images is recommended; otherwise you may hit the issue when booting an older kernel.

  3. Continue booting

    # exit
    

To avoid hitting the issue, perform the following steps right after updating to RHEL 7.8 (before any subsequent reboot):

  1. Back up all the initramfs images to persistent storage, e.g. in /root

    # cp /boot/initramfs-*.x86_64.img /root
    
  2. Recreate the initramfs images using the following command

    # dracut --force --regenerate-all
    

    Note: rebuilding all initramfs images is recommended; otherwise you may hit the issue when booting an older kernel.
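
Because the backup step precedes a forced regeneration, it can be useful to confirm that the copies are intact before running dracut. The following is a minimal sketch under stated assumptions: `backup_initramfs` is a hypothetical helper name, and it uses the same `initramfs-*.x86_64.img` pattern as the `cp` command above.

```shell
# Hypothetical helper: copy every initramfs image from boot_dir to backup_dir
# and verify each copy byte-for-byte before anything is regenerated.
backup_initramfs() {
    boot_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    found=0
    for img in "$boot_dir"/initramfs-*.x86_64.img; do
        [ -f "$img" ] || continue      # the glob matched nothing
        cp -p "$img" "$backup_dir"/ || return 1
        # cmp -s is silent and returns non-zero if the two files differ
        cmp -s "$img" "$backup_dir/$(basename "$img")" || return 1
        found=$((found + 1))
    done
    [ "$found" -gt 0 ]                 # fail if no image was backed up
}
```

A possible invocation is `backup_initramfs /boot /root/initramfs-backup && dracut --force --regenerate-all`, where `/root/initramfs-backup` is an illustrative path. Chaining with `&&` means dracut only runs once every image has a verified copy.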

Root Cause

The issue is caused by a timing problem when switching root while the initramfs contains an old systemd binary:

  1. The old systemd binary starts switching root by executing the initrd-switch-root.service unit, which internally runs the old systemctl program.
  2. The new systemd binary on the root file system sends SIGTERM to the initrd-switch-root.service unit while the old systemctl program is still running.
  3. Because the old systemctl program lacks the fix for BZ 1825232 (System drops into emergency mode for no obvious reason after upgrading to latest systemd), it fails.
  4. The new systemd binary sees that initrd-switch-root.service failed and triggers the OnFailure action (emergency.target).

Diagnostic Steps

  1. Check the size of the systemd binary installed on the system

    # ls -l /usr/lib/systemd/systemd
    -rwxr-xr-x. 1 root root 1628536 Mar 17 10:50 /usr/lib/systemd/systemd
    
  2. Check the size of the systemd binary embedded in the initramfs that enters emergency mode

    # lsinitrd /boot/initramfs-$(uname -r).img | grep "usr/lib/systemd/systemd$"
    lrwxrwxrwx   1 root     root           23 Apr 17 14:18 init -> usr/lib/systemd/systemd
    -rwxr-xr-x   1 root     root      1620416 Apr 17 14:18 usr/lib/systemd/systemd
    

In the example above, the sizes differ, indicating that the initramfs contains a different systemd binary than the one installed on the system. This may cause the issue described in this solution if the systemd in the initramfs is older than systemd-219-70.el7.
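
The two checks above can be combined into a single comparison. The following is a minimal sketch: `systemd_size` and `systemd_sizes_differ` are hypothetical helper names, and the parsing assumes the `ls -l` style column layout shown in the sample output, where the size is the fifth field.

```shell
# Hypothetical helper: print the size column (field 5) of the first
# `ls -l`-style line on stdin whose path ends in usr/lib/systemd/systemd.
# Symlink lines such as "init -> usr/lib/systemd/systemd" are skipped.
systemd_size() {
    awk '$1 !~ /^l/ && $NF ~ /usr\/lib\/systemd\/systemd$/ { print $5; exit }'
}

# Succeeds (exit 0) when the installed binary and the copy inside the given
# initramfs image ($1) have different sizes.
systemd_sizes_differ() {
    installed=$(ls -l /usr/lib/systemd/systemd | systemd_size)
    embedded=$(lsinitrd "$1" | systemd_size)
    [ -n "$installed" ] && [ -n "$embedded" ] && [ "$installed" != "$embedded" ]
}
```

For example, `systemd_sizes_differ "/boot/initramfs-$(uname -r).img" && echo "initramfs carries a different systemd"`. Size is only the heuristic used in this article; equal sizes do not prove that the two binaries are the same build.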

Alternative method using a sosreport:

  1. Check if the installed systemd package is systemd-219-73.el7 or later.

    $ cat sosreport/installed-rpms | grep systemd-219
    systemd-219-73.el7_8.5.x86_64                               Tue Apr 14 19:36:55 2020    1586860615  Red Hat, Inc.
    
  2. Check whether the creation time of the /boot/initramfs-$(uname -r).img file is older than the installation time of the systemd package shown above.

    $ cat sosreport/uname | awk '{print $3}'
    3.10.0-123.el7.x86_64
    $ cat sosreport/sos_commands/boot/ls_-lanR_.boot | grep initramfs-3.10.0-123.el7.x86_64.img
    -rw-------.  1 0 0 16944810 Sep 27  2017 initramfs-3.10.0-123.el7.x86_64.img
    
  3. Check if the last erased systemd package is older than systemd-219-73.el7.

    $ cat sosreport/var/log/yum.log | grep systemd-2 | tail -n1
    Apr 15 16:42:12 Erased: systemd-208-11.el7.x86_64
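
The timestamp comparison in step 2 can be sketched as follows. This is an illustration under stated assumptions, not a supported tool: `systemd_install_epoch` and `initramfs_predates_systemd` are hypothetical helper names, and the parsing assumes the installed-rpms layout shown above, where the install time in epoch seconds is the seventh whitespace-separated field.

```shell
# Hypothetical helper: print the install time (epoch seconds) of the first
# systemd-219 package listed in the installed-rpms file given as $1.
systemd_install_epoch() {
    awk '$1 ~ /^systemd-219/ { print $7; exit }' "$1"
}

# Succeeds when the initramfs timestamp ($1, epoch seconds) is older than the
# systemd install time ($2, epoch seconds), i.e. the image was not rebuilt
# after the systemd update.
initramfs_predates_systemd() {
    [ "$1" -lt "$2" ]
}
```

The initramfs date from the sosreport listing has to be converted to epoch seconds first, for instance with GNU date (`date -d 'Sep 27 2017' +%s`). On a live system, `stat -c %Y /boot/initramfs-$(uname -r).img` (GNU stat) prints the modification time directly.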
    

Comments

"The defect was present for a long time, but was never triggered before updating to systemd-219-71.el7 or later." --> The previous systemd version does not seem to have this problem. It may have been introduced by the following commit: https://github.com/systemd/systemd/commit/1f0958f640b87175cd547c1e69084cfe54a22e9d

Yes, it's very likely this commit that introduced the issue. It was introduced in the following systemd release:

  • systemd-219-69

Regenerating the initramfs may introduce unknown risks. The following patch may fix the issue without regenerating the initramfs: https://github.com/systemd-rhel/rhel-7/pull/117