ERROR: Install OpenShift on AWS with installer-provisioned infrastructure

I tried to follow the instructions at https://cloud.redhat.com/openshift/install/aws/installer-provisioned to install OpenShift on AWS.
Everything seemed to be going well, but after an hour or so the installer failed with the errors below.

Any help in getting this working is greatly appreciated.
Is there a way to clean up the components on AWS and start fresh?

----------------------------------------------------install debug log----------------------------------------------------------------------------------
DEBUG Still waiting for the cluster to initialize: Working towards 4.5.11: 87% complete, waiting on authentication
DEBUG Still waiting for the cluster to initialize: Cluster operator authentication is still updating
DEBUG Still waiting for the cluster to initialize: Multiple errors are preventing progress:
* Cluster operator machine-config is reporting a failure: Failed to resync 4.5.11 because: timed out waiting for the condition during waitForDaemonsetRollout: Daemonset machine-config-daemon is not ready. status: (desired: 6, updated: 6, ready: 5, unavailable: 1)
* Cluster operator monitoring is reporting a failure: Failed to rollout the stack. Error: running task Updating node-exporter failed: reconciling node-exporter DaemonSet failed: updating DaemonSet object failed: waiting for DaemonSetRollout of node-exporter: daemonset node-exporter is not ready. status: (desired: 6, updated: 6, ready: 5, unavailable: 1)
INFO Cluster operator authentication Progressing is True with _WellKnownNotReady: Progressing: got '404 Not Found' status while trying to GET the OAuth well-known https://10.0.136.145:6443/.well-known/oauth-authorization-server endpoint data
INFO Cluster operator authentication Available is False with :
INFO Cluster operator dns Progressing is True with Reconciling: At least 1 DNS DaemonSet is progressing.
ERROR Cluster operator etcd Degraded is True with NodeController_MasterNodesReady: NodeControllerDegraded: The master nodes not ready: node "ip-10-0-136-145.us-west-2.compute.internal" not ready since 2020-09-22 09:15:52 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
INFO Cluster operator insights Disabled is False with :
ERROR Cluster operator kube-apiserver Degraded is True with NodeController_MasterNodesReady: NodeControllerDegraded: The master nodes not ready: node "ip-10-0-136-145.us-west-2.compute.internal" not ready since 2020-09-22 09:15:52 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
INFO Cluster operator kube-apiserver Progressing is True with NodeInstaller: NodeInstallerProgressing: 1 nodes are at revision 5; 2 nodes are at revision 6
ERROR Cluster operator kube-controller-manager Degraded is True with NodeController_MasterNodesReady: NodeControllerDegraded: The master nodes not ready: node "ip-10-0-136-145.us-west-2.compute.internal" not ready since 2020-09-22 09:15:52 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
ERROR Cluster operator kube-scheduler Degraded is True with NodeController_MasterNodesReady: NodeControllerDegraded: The master nodes not ready: node "ip-10-0-136-145.us-west-2.compute.internal" not ready since 2020-09-22 09:15:52 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
ERROR Cluster operator machine-config Degraded is True with MachineConfigDaemonFailed: Failed to resync 4.5.11 because: timed out waiting for the condition during waitForDaemonsetRollout: Daemonset machine-config-daemon is not ready. status: (desired: 6, updated: 6, ready: 5, unavailable: 1)
INFO Cluster operator machine-config Available is False with : Cluster not available for 4.5.11
INFO Cluster operator monitoring Available is False with :
INFO Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack.
ERROR Cluster operator monitoring Degraded is True with UpdatingnodeExporterFailed: Failed to rollout the stack. Error: running task Updating node-exporter failed: reconciling node-exporter DaemonSet failed: updating DaemonSet object failed: waiting for DaemonSetRollout of node-exporter: daemonset node-exporter is not ready. status: (desired: 6, updated: 6, ready: 5, unavailable: 1)
ERROR Cluster operator network Degraded is True with RolloutHung: DaemonSet "openshift-multus/multus" rollout is not making progress - last change 2020-09-22T09:15:52Z
DaemonSet "openshift-sdn/ovs" rollout is not making progress - last change 2020-09-22T09:15:53Z
DaemonSet "openshift-sdn/sdn" rollout is not making progress - last change 2020-09-22T09:15:52Z
INFO Cluster operator network Progressing is True with Deploying: DaemonSet "openshift-multus/multus" is not available (awaiting 1 nodes)
DaemonSet "openshift-sdn/ovs" is not available (awaiting 1 nodes)
DaemonSet "openshift-sdn/sdn" is not available (awaiting 1 nodes)
ERROR Cluster operator openshift-apiserver Degraded is True with APIServerDeployment_UnavailablePod: APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver
FATAL failed to initialize the cluster: Multiple errors are preventing progress:
* Cluster operator machine-config is reporting a failure: Failed to resync 4.5.11 because: timed out waiting for the condition during waitForDaemonsetRollout: Daemonset machine-config-daemon is not ready. status: (desired: 6, updated: 6, ready: 5, unavailable: 1)
* Cluster operator monitoring is reporting a failure: Failed to rollout the stack. Error: running task Updating node-exporter failed: reconciling node-exporter DaemonSet failed: updating DaemonSet object failed: waiting for DaemonSetRollout of node-exporter: daemonset node-exporter is not ready. status: (desired: 6, updated: 6, ready: 5, unavailable: 1)
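
Most of the ERROR lines in the output above repeat the same underlying condition. As a rough way to see that at a glance, here is a minimal Python sketch that groups the degraded operators by reason and counts which node is named as not ready. It assumes the console output format pasted above (lines starting with `ERROR Cluster operator ...`); the function name `summarize` and the regular expressions are illustrative, not part of the installer.

```python
import re
from collections import Counter

# Matches console lines such as:
#   ERROR Cluster operator etcd Degraded is True with NodeController_MasterNodesReady: ...
OPERATOR_RE = re.compile(
    r"^ERROR Cluster operator (?P<operator>\S+) Degraded is True with (?P<reason>[^:]+):"
)
# Matches the node name quoted in 'node "..." not ready' messages.
NODE_RE = re.compile(r'node "(?P<node>[^"]+)" not ready')


def summarize(log_text):
    """Return (degraded operator -> reason, not-ready node -> mention count)."""
    degraded = {}
    nodes = Counter()
    for line in log_text.splitlines():
        match = OPERATOR_RE.match(line)
        if match:
            degraded[match["operator"]] = match["reason"]
        for node in NODE_RE.findall(line):
            nodes[node] += 1
    return degraded, nodes


# Three lines copied from the output above, used as sample input.
sample = (
    'ERROR Cluster operator etcd Degraded is True with NodeController_MasterNodesReady: '
    'NodeControllerDegraded: The master nodes not ready: node '
    '"ip-10-0-136-145.us-west-2.compute.internal" not ready since 2020-09-22 09:15:52 '
    '+0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)\n'
    'ERROR Cluster operator network Degraded is True with RolloutHung: DaemonSet '
    '"openshift-multus/multus" rollout is not making progress - last change 2020-09-22T09:15:52Z\n'
    'INFO Cluster operator dns Progressing is True with Reconciling: '
    'At least 1 DNS DaemonSet is progressing.\n'
)

degraded, nodes = summarize(sample)
print(degraded)
print(nodes.most_common(1))
```

Run against the full output, this should show several operators degraded for the same reason and a single node name behind all of them, which narrows the problem to one machine.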

Responses

Looks like it's failing to bring up the 6th node in your cluster.

Turning on CloudTrail would probably be a good idea, as its logs will give you insight into what's happening at the AWS API level.

If it's a new AWS account, you will likely need to request an EC2 resource limit increase; you may need to request other resource limit increases as well.
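
Elastic IP addresses are one limit that is cheap to check before retrying. The following is a minimal sketch, assuming the `boto3` library and credentials for the target account; `eip_headroom` is a hypothetical helper name. It compares the number of allocated addresses with the `vpc-max-elastic-ips` account attribute. The Service Quotas console remains the authoritative place to view and raise limits, so treat this only as a quick sanity check.

```python
def eip_headroom(ec2):
    """Return (allocated, limit) Elastic IP counts for the client's region.

    `ec2` is expected to behave like boto3.client("ec2", region_name=...).
    """
    # Every Elastic IP currently allocated in this region.
    allocated = len(ec2.describe_addresses()["Addresses"])
    # Account attribute holding the per-region Elastic IP limit for VPC use.
    attrs = ec2.describe_account_attributes(AttributeNames=["vpc-max-elastic-ips"])
    limit = int(attrs["AccountAttributes"][0]["AttributeValues"][0]["AttributeValue"])
    return allocated, limit


# Example usage (needs boto3 and AWS credentials; the region is taken from the log above):
#   import boto3
#   allocated, limit = eip_headroom(boto3.client("ec2", region_name="us-west-2"))
#   print(f"{allocated} of {limit} Elastic IPs in use")
```

If the number in use is already close to the limit, request the increase before running the installer again.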

Thank you, Stephen. That was it. I had to request an increase in EIPs (Elastic IP addresses), after which I was able to run the installer and create the cluster successfully.
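
On the cleanup question from the original post, which the replies did not address directly: the installer has a `destroy cluster` subcommand that removes the AWS resources it created, using the metadata it left in the installation directory. Below is a minimal sketch that wraps it in Python, assuming `openshift-install` is on the PATH; the directory name `./my-cluster` and the helper `destroy_cluster` are hypothetical, and the helper defaults to a dry run so nothing is deleted by accident.

```python
import subprocess


def destroy_cluster(install_dir, dry_run=True):
    """Build the installer's destroy command; run it only when dry_run is False."""
    cmd = [
        "openshift-install", "destroy", "cluster",
        "--dir", install_dir,   # same directory that was passed to "create cluster"
        "--log-level", "info",
    ]
    if not dry_run:
        # Raises CalledProcessError if the installer exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: only print the command that would be executed.
print(" ".join(destroy_cluster("./my-cluster")))
```

Pass the same directory that was used for the failed install. Once the destroy completes, start the next attempt from a fresh, empty directory.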