[semi-resolved] Red Hat 7.x ssh fails with error "System is booting up. See pam_nologin(8) \ Authentication Failed." (but system is booted)

This is not an official Red Hat solution; this is the Red Hat discussion area. I'm documenting this here because I've seen at least three systems with this issue, and I want a record for others who run into it (or for myself or team members when we search for it).

Environment

  • Red Hat Enterprise Linux 7.x Server

Note: Please comment in this discussion if you encounter this on a system that is not RHEL 7.


Issue

  • ssh to a RHEL 7 server fails with the following misleading error:
# your ip address or hostname will obviously differ
[you@yoursystem] # ssh 192.168.100.100
System is booting up. See pam_nologin(8)
Authentication failed.


  • The system you are attempting to ssh to is actually not in the boot process.

  • The user you are sshing to is not a nologin account


Resolution

  • IMPORTANT: First make sure the system really is not in the process of booting. If it is not, log in to the system manually, become root, and remove the /run/nologin file.

  • Log into the system using the console of the computer (physical or virtual, or web interface for a console). Then remove the /run/nologin file.

  • For a virtual system, use the hypervisor's console. For example, on a VMware system, use the VMware interface to log in and remove the file.

  • For a physical server, visit the system and log in at the console.

  • If you have a remote management tool (Dell, for example, has "iDRAC"), use it to gain access to your system.

  • Some systems might have Red Hat Cockpit installed and running; use it if possible to get terminal access.

  • Again, once you are logged in, switch to the root account and remove the /run/nologin file.

  • Additionally check for the following files and remove if they exist:

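# Do not chain with &&: ls fails if any one path is missing, and then rm never runs. rm -f ignores missing files.
ls -l /{var/run,etc,run}/nologin 2>/dev/null; rm -f /{var/run,etc,run}/nologin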
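
The removal step above can be sketched as a small shell function. This is a minimal sketch, not an official Red Hat procedure: the function name remove_nologin and its optional prefix argument are hypothetical, added only so the logic can be tried against a scratch directory before running it on a real host. It reports each file it removes, and it does not fail when a path is missing.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): remove stale nologin files.
# $1 is an optional root prefix for dry runs against a scratch tree;
# with no argument it acts on the real /etc, /run and /var/run.
remove_nologin() {
    prefix="${1:-}"
    found=0
    # On RHEL 7, /var/run is normally a symlink to /run, so the third path
    # is usually already gone once /run/nologin has been removed.
    for f in "$prefix/etc/nologin" "$prefix/run/nologin" "$prefix/var/run/nologin"; do
        if [ -e "$f" ]; then
            found=1
            rm -f -- "$f" && echo "removed $f"
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no nologin files found"
    fi
    return 0
}
```

On the affected host, run it as root from the console with no argument (remove_nologin), and only after confirming the system has finished booting.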


Root Cause


I'd welcome anything relevant on this topic from Red Hatters or anyone else (especially a source solution or article from Red Hat).

Regards

RJ

Responses

Hello! I have the same issue. Do you have any result examining that problem?

What does 'systemctl status' report? Is the system in a 'degraded' state, with one or more services "failed"? ('systemctl list-units | grep failed' might be helpful here.)

I do not know exactly which of the many systemd units is responsible for removing the 'nologin' file (either /etc/nologin or /run/nologin), but hunting down that service and its dependencies would be my approach to solving the original problem.
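
For anyone following up on the two suggestions above, here is a minimal diagnostic sketch. The function name check_nologin_state and its prefix argument are hypothetical. It prints which nologin files exist along with their contents (per pam_nologin(8), the file's contents are what the user is shown), then queries systemd. According to its man page, systemd-user-sessions.service removes /run/nologin once basic initialization is complete, so it seems a reasonable first unit to inspect. Whether it is the culprit in this case has not been established in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): summarize the state that decides
# whether pam_nologin blocks non-root logins.
# $1 is an optional root prefix for trying the file checks on a scratch tree;
# the live systemd queries only run when no prefix is given.
check_nologin_state() {
    prefix="${1:-}"
    for f in "$prefix/etc/nologin" "$prefix/run/nologin"; do
        if [ -e "$f" ]; then
            echo "present: $f: $(cat -- "$f")"
        else
            echo "absent: $f"
        fi
    done
    if [ -z "$prefix" ] && command -v systemctl >/dev/null 2>&1; then
        # Overall state: "running", "degraded", "starting", and so on.
        systemctl is-system-running
        # Units that failed, if any.
        systemctl --failed --no-pager
        # The unit documented as removing /run/nologin after early boot.
        systemctl status systemd-user-sessions.service --no-pager
    fi
    return 0
}
```

Run it as root from the console with no argument. If is-system-running reports "starting" long after boot, a job may still be stuck in the boot transaction, which would fit a /run/nologin file that was never cleaned up.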

Andrew, I wish I had more info as to why. Other priority production things have gotten in the way of me digging further into it.

James, the systemctl status command is useful; nice tip above, thanks. I can't say there is a corresponding failed service for the issue I originally posted.

Regards

RJ