OpenShift AWS private cluster installation fails with 'connect: connection refused'

Solution Verified - Updated -

Issue

  • When the openshift-install tool is run with debug logging enabled, the installation fails with the following error:
# openshift-install create cluster --log-level=debug
...
DEBUG Built from commit 310205b3cee9e166544b882e8ea8b321af198b6f 
INFO Waiting up to 20m0s for the Kubernetes API at https://api.ocp4.example.com:6443... 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.1.12:6443: i/o timeout 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.1.12:6443: connect: connection refused 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.1.12:6443: connect: connection refused 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.2.4:6443: connect: connection refused 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.1.12:6443: connect: connection refused 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.2.4:6443: connect: connection refused 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.1.12:6443: connect: connection refused
INFO Pulling debug logs from the bootstrap machine 
DEBUG Added /tmp/bootstrap-ssh976485374 to installer's internal agent 
DEBUG Added /home/user/.ssh/id_rsa to installer's internal agent 
ERROR Attempted to gather debug logs after installation failure: failed to create SSH client: dial tcp 192.168.1.4:22: connect: connection refused 
FATAL Bootstrap failed to complete: failed waiting for Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.1.12:6443: i/o timeout 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.2.4:6443: i/o timeout 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.1.11:6443: connect: connection refused 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.2.4:6443: connect: connection refused 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.example.com:6443/version?timeout=32s: dial tcp 192.168.1.11:6443: connect: connection refused 

  • The boot logs of the bootstrap machine show repeated attempts to download the bootstrap.ign file from an AWS S3 resource, and the system enters an infinite reboot loop. To view the instance logs, select AWS Console > EC2 > EC2 Instance > Instance Settings > Get System Log:
[OK] Reached target Remote File Systems (Pre).
[OK] Reached target Remote File Systems.
[10.820527] systemd[1]: Starting dracut pre-mount hook...
[10.824952] systemd[1]: Reached target Remote File Systems (Pre).
[OK] Started dracut pre-mount hook.
[10.839883] systemd[1]: Reached target Remote File Systems.
[10.845620] systemd[1]: Started dracut pre-mount hook.
[10.974167] ignition[820]: GET http://169.254.169.254/2009-04-04/user-data: attempt #5
[10.982529] ignition[820]: GET result: OK
[*] A start job is running for Ignition (fetch) (11s / no limit)
[**] A start job is running for Ignition (fetch) (11s / no limit)
[**] A start job is running for Ignition (fetch) (12s / no limit)
[**] A start job is running for Ignition (fetch) (13s / no limit)
[**] A start job is running for Ignition (fetch) (13s / no limit)
...
[**  ] A start job is running fo
Use the ^ and v keys to change the selection.
Press 'e' to edit the selected item, or 'c' for a command prompt.
Red Hat Enterprise Linux CoreOS 45.82.202008010929-0 (Ootpa) (ostree:0)
The selected entry will be started automatically in 1s.
The selected entry will be started automatically in 0s.
[ 0.000000] Linux version 4.18.0-193.14.3.el8_2.x86_64 (mockbuild@x86-vm-07.build.eng.bos.redhat.com) (gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)) #1 SMP Mon Jul 20 15:02:29 UTC 2020
[ 0.000000] Command line: BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-cfdd324ec49fa0c33c9f0391807b6a35cdac3221f5a2f95c99598b9f3974b9c4/vmlinuz-4.18.0-193.14.3.el8_2.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ignition.firstboot rd.neednet=1 ip=dhcp,dhcp6 ostree=/ostree/boot.1/rhcos/cfdd324ec49fa0c33c9f0391807b6a35cdac3221f5a2f95c99598b9f3974b9c4/0 ignition.platform.id=aws
...

Rebooting.
[  432.383227] systemd[1]: Shutting down.
[  432.394910] printk: systemd-shutdow: 28 output lines suppressed due to ratelimiting
[  432.455090] systemd-shutdown[1]: Syncing filesystems and block devices.
[  432.459690] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
[  432.465990] systemd-journald[924]: Received SIGTERM from PID 1 (systemd-shutdow).
[  432.475134] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[  432.481650] systemd-shutdown[1]: Unmounting file systems.
[  432.486458] [943]: Remounting '/' read-only in with options 'size=4039748k,nr_inodes=1009937'.
[  432.493551] systemd-shutdown[1]: All filesystems unmounted.
[  432.497606] systemd-shutdown[1]: Deactivating swaps.
[  432.501373] systemd-shutdown[1]: All swaps deactivated.
[  432.505116] systemd-shutdown[1]: Detaching loop devices.
[  432.509011] systemd-shutdown[1]: All loop devices detached.
[  432.512954] systemd-shutdown[1]: Detaching DM devices.
[  432.516688] systemd-shutdown[1]: All DM devices detached.
[  432.520546] systemd-shutdown[1]: All filesystems, swaps, loop devices and DM devices detached.
[  432.527191] systemd-shutdown[1]: Syncing filesystems and block devices.
[  432.531799] systemd-shutdown[1]: Rebooting.
[  432.535922] xenbus: xenbus_dev_shutdown: device/pci/0: Initialising != Connected, skipping
[  432.578103] reboot: Restarting system
[  432.581430] reboot: machine restart

  • Connecting to the bootstrap machine over SSH fails with the following error:
# ssh ip-xxx-xxx-xx-xx.example.internal (Private IPv4 DNS)
ssh: connect to host ip-xxx-xxx-xx-xx.example.internal port 22: Connection refused
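
The symptoms above can be summarized from saved copies of the logs. The following is a minimal sketch, not part of the original solution: the function names count_api_errors and count_restarts are hypothetical helpers, and the two file paths are whatever names the logs were saved under. It tallies the API dial errors on port 6443 in the installer debug log and counts the restarts recorded in the saved EC2 system log.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and file paths are illustrative assumptions.

# Tally the Kubernetes API dial errors (port 6443 only) in an installer log.
# Restricting the match to ":6443:" keeps the SSH failure on port 22 out of the count.
count_api_errors() {
  local log_file=$1
  local refused timeouts
  refused=$(grep -c ':6443: connect: connection refused' "$log_file")
  timeouts=$(grep -c ':6443: i/o timeout' "$log_file")
  echo "refused=${refused} timeout=${timeouts}"
}

# Count the restarts recorded in a saved EC2 system log.
# Any restart during first boot is consistent with the reboot loop described above.
count_restarts() {
  local console_log=$1
  grep -c 'reboot: Restarting system' "$console_log"
}

# Run only when both log paths are supplied: <installer log> <console log>
if [ "$#" -eq 2 ]; then
  count_api_errors "$1"
  echo "restarts=$(count_restarts "$2")"
fi
```

The installer writes its debug output to .openshift_install.log in the installation directory, so that file can serve as the first argument. Instead of the AWS Console, the system log can likely be saved with the AWS CLI, for example `aws ec2 get-console-output --instance-id <instance-id> --query Output --output text > console.log`, where <instance-id> is a placeholder for the bootstrap instance ID.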

Environment

  • Red Hat OpenShift Container Platform (OCP) 4.5
