Errors in the sshd service of master nodes in an ARO cluster

Solution Verified

Environment

  • Azure Red Hat OpenShift [ARO]
    • 4.x

Issue

  • Error messages appear every 5 seconds in the journal logs of the sshd service on master nodes.
kex_exchange_identification: read: Connection reset by peer
Sep xx xx:xx:xx xx-aro-azure-xxxxx--xxx-master-0 sshd[xxxx]: error: kex_exchange_identification: read: Connection reset by peer
Sep xx xx:xx:xx xx-aro-azure-xxxxx--xxx-master-0 sshd[xxxx]: error: kex_exchange_identification: read: Connection reset by peer
Sep xx xx:xx:xx xx-aro-azure-xxxxx--xxx-master-0 sshd[xxxx]: error:

Resolution

  • These error messages can be safely ignored. They are caused by health probes on the internal load balancer.

Root Cause

  • An ARO cluster creates an internal load balancer in the cluster resource group with 3 load balancing rules, one per master node, for SSH access to the master nodes. In particular, this is what allows ARO SREs to access master nodes in an emergency.

  • Health probes on this internal load balancer open a connection to each master node on port 22 every 5 seconds. The probes present no key and never establish an SSH session; they interrupt the connection right after opening it. From the master node's perspective this is an interrupted connection, which sshd records in its journal logs.
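
The probe behaviour described above can be approximated from any host that can reach a master node on port 22. The following is a minimal sketch, not the load balancer's actual probe implementation: probe_tcp is a hypothetical helper name, it relies on bash's /dev/tcp feature, and it opens a TCP connection and drops it without sending any SSH data. How exactly the Azure probe tears down its connection is not described here, so sshd may log a manual test with slightly different wording than the probe itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ARO or OpenSSH): mimic a TCP health probe
# by opening a connection and closing it without speaking SSH.
# Requires bash, because it uses the /dev/tcp pseudo-device.
probe_tcp() {
    local host="${1:-}" port="${2:-22}"
    if [ -z "$host" ]; then
        echo "usage: probe_tcp HOST [PORT]" >&2
        return 2
    fi
    # Open the socket inside a subshell. Leaving the subshell closes the
    # socket again, so no SSH identification string is ever sent.
    ( exec 3<>"/dev/tcp/${host}/${port}" ) 2>/dev/null
}
```

For example, running probe_tcp MASTER_IP 22 (MASTER_IP being a placeholder for a master node address) while following the sshd journal on that node should add one more interrupted-connection entry per invocation.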

Diagnostic Steps

  • Check the journal logs for the sshd service.
# journalctl -u sshd.service -f
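
To confirm that the entries arrive at a steady rate rather than in bursts, the matching lines can be counted over a fixed time window. This is a minimal sketch: count_probe_resets is a hypothetical helper name, and the match string is taken from the log excerpt in the Issue section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count sshd journal lines that carry the probe signature.
# Reads journal text on stdin and prints the number of matching lines.
count_probe_resets() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c 'kex_exchange_identification: read: Connection reset by peer' || true
}

# Example usage on a master node (assumes journalctl is available):
#   journalctl -u sshd.service --since "1min ago" --no-pager | count_probe_resets
```

A count that stays roughly constant from one minute to the next is consistent with periodic health probes; the exact number depends on how many probe connections reach the node in that window.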

