Trouble installing OpenShift v3 on AWS and OpenStack
Environment
- Red Hat OpenShift Enterprise 3.0.x
Issue
- The Ansible install fails when installing on AWS or OpenStack
TASK: [openshift_manage_node | Wait for Node Registration] ********************
2015-09-18 18:58:29,493 p=2233 u=root | failed: [ose-master.example.com] => (item=ose-node2.example.com) => {"attempts": 10, "changed": true, "cmd": ["oc", "get", "node", "ose-node2.example.com"], "delta": "0:00:00.361069", "end": "2015-09-18 18:58:29.481295", "failed": true, "item": "ose-node2.example.com", "rc": 1, "start": "2015-09-18 18:58:29.120226", "warnings": []}
- After the Ansible install, the openshift-node.service logs errors such as:
Sep 20 16:07:53 ose-node2 openshift-node: I0920 16:07:53.671657 5339 kubelet.go:793] Unable to register ose-node2.example.com with the apiserver: Post https://ose-master.example.com:8443/api/v1/nodes: dial tcp 20.0.0.27:8443: i/o timeout
Sep 20 16:07:54 ose-node2 openshift-node: E0920 16:07:54.683793 5339 kubelet.go:1641] error getting node: node ose-node2.example.com not found
Sep 20 16:07:58 ose-node2 openshift-node: E0920 16:07:58.969617 5339 common.go:243] Could not find an allocated subnet for minion ose-node2.example.com: Get https://ose-master.example.com:8443/oapi/v1/hostsubnets/ose-node2.example.com: dial tcp 20.0.0.27:8443: i/o timeout. Waiting...
Sep 20 16:08:00 ose-node2 openshift-node: I0920 16:08:00.672089 5339 kubelet.go:1942] Recording NodeReady event message for node ose-node2.example.com
Sep 20 16:08:00 ose-node2 openshift-node: I0920 16:08:00.672145 5339 kubelet.go:790] Attempting to register node ose-node2.example.com
Sep 20 16:08:02 ose-node2 openshift-node: E0920 16:08:02.242390 5339 reflector.go:180] pkg/kubelet/kubelet.go:177: Failed to list *api.Service: Get https://ose-master.example.com:8443/api/v1/services: dial tcp 20.0.0.27:8443: i/o timeout
Sep 20 16:08:02 ose-node2 openshift-node: E0920 16:08:02.242397 5339 reflector.go:180] pkg/kubelet/kubelet.go:194: Failed to list *api.Node: Get https://ose-master.example.com:8443/api/v1/nodes?fieldSelector=metadata.name%3Dose-node2.example.com: dial tcp 20.0.0.27:8443: i/o timeout
Sep 20 16:08:02 ose-node2 openshift-node: E0920 16:08:02.242472 5339 reflector.go:180] pkg/kubelet/config/apiserver.go:43: Failed to list *api.Pod: Get https://ose-master.example.com:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dose-node2.example.com: dial tcp 20.0.0.27:8443: i/o timeout
Sep 20 16:08:04 ose-node2 openshift-node: E0920 16:08:04.685062 5339 kubelet.go:1641] error getting node: node ose-node2.example.com not found
Sep 20 16:08:14 ose-node2 openshift-node: E0920 16:08:14.686336 5339 kubelet.go:1641] error getting node: node ose-node2.example.com not found
Sep 20 16:08:22 ose-node2 openshift-node: E0920 16:08:22.049729 5339 event.go:194] Unable to write event: 'Post https://ose-master.example.com:8443/api/v1/namespaces/default/events: dial tcp 20.0.0.27:8443: i/o timeout' (may retry after sleeping)
Sep 20 16:08:24 ose-node2 openshift-node: E0920 16:08:24.687616 5339 kubelet.go:1641] error getting node: node ose-node2.example.com not found
- During the Ansible install, the hostname changes to an unwanted value
- The Ansible install uses the wrong hostname and IP addresses
Resolution
- When installing OpenShift v3 on AWS or OpenStack, make sure security groups are configured to allow access from the outside to the following ports:
- TCP/22 - ssh
- TCP/80 - Web Apps
- TCP/443 - Web Apps (https)
- UDP/4789 - SDN / VXLAN
- TCP/8443 - OpenShift Console
- TCP/10250 - kubelet
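On AWS, the rules above can be added to a security group with the standard AWS CLI call `aws ec2 authorize-security-group-ingress`. The loop below is a sketch that only prints the commands so they can be reviewed before running; the group ID is a placeholder, and 0.0.0.0/0 should be tightened to your own source ranges:

```shell
# Placeholder security group ID - substitute your own.
SG_ID=sg-0123456789abcdef0
# Protocol:port pairs from the list above.
for rule in tcp:22 tcp:80 tcp:443 udp:4789 tcp:8443 tcp:10250; do
  proto=${rule%%:*}
  port=${rule##*:}
  # Print (rather than execute) each command so it can be reviewed first.
  echo aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol "$proto" --port "$port" --cidr 0.0.0.0/0
done
```

Remove the `echo` once the generated commands look correct for your environment.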
- Hostnames and IP addresses should be defined for the master and nodes in the /etc/ansible/hosts file, using the following variables:
- openshift_ip
- openshift_public_ip
- openshift_hostname
- openshift_public_hostname
Example /etc/ansible/hosts file:
# Ansible 'hosts' file for basic 1 master 2 node install
[OSEv3:children]
masters
nodes
[OSEv3:vars]
ansible_ssh_user=root
#ansible_become=true
deployment_type=enterprise
product_type=openshift
[masters]
ose-master.example.com openshift_ip=16.0.0.240 openshift_public_ip=54.175.210.46 openshift_hostname=ose-master.example.com openshift_public_hostname=ose-master.example.com
[nodes]
ose-node2.example.com openshift_ip=16.0.0.120 openshift_public_ip=54.210.35.7 openshift_hostname=ose-node2.example.com openshift_public_hostname=ose-node2.example.com
ose-node1.example.com openshift_ip=16.0.0.179 openshift_public_ip=54.173.246.71 openshift_hostname=ose-node1.example.com openshift_public_hostname=ose-node1.example.com
ose-master.example.com openshift_ip=16.0.0.240 openshift_public_ip=54.175.210.46 openshift_hostname=ose-master.example.com openshift_public_hostname=ose-master.example.com openshift_schedulable=False
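Since the errors in the Issue section are `dial tcp ... i/o timeout` failures against the master API, a quick sanity check after writing the inventory is to confirm each node can open a TCP connection to the master on 8443. A bash-only sketch, using the example hostname and port from this article (no extra tools needed, because bash's /dev/tcp pseudo-device opens the connection):

```shell
# Master API endpoint from the example inventory - replace with your own.
MASTER=ose-master.example.com
PORT=8443
# Try to open a TCP connection; give up after 3 seconds.
if timeout 3 bash -c "exec 3<>/dev/tcp/$MASTER/$PORT" 2>/dev/null; then
  status="reachable"
else
  status="unreachable - check security groups and DNS for $MASTER:$PORT"
fi
echo "$status"
```

Run this from each node; an "unreachable" result before the install explains the registration timeouts seen in the logs.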
- On each host, disable cloud-init (optional):
# systemctl stop cloud-init-local.service cloud-init.service cloud-final.service cloud-config.service
# systemctl disable cloud-init-local.service cloud-init.service cloud-final.service cloud-config.service
# systemctl mask cloud-init-local.service cloud-init.service cloud-final.service cloud-config.service
- If failures still occur, log a case with Red Hat Support and provide debug logs from your Ansible playbook:
# ansible-playbook ~/openshift-ansible/playbooks/byo/config.yml -vvv | tee ansible.logs
Note
Starting in 3.10, openshift_ip and openshift_hostname are deprecated. It is important to label the node in the hosts file with the same name used in the cloud provider's portal, and as shown by the hostname command run on the host. See this related solution for more.
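One way to check the advice in the Note is to compare the kernel hostname against the name used for the host in the inventory. A minimal sketch; ose-node2.example.com is the example node name from this article, so substitute the entry for the host you are checking:

```shell
# Name used for this host in /etc/ansible/hosts - replace with your own entry.
inventory_name=ose-node2.example.com
# Kernel hostname as the kubelet will see it.
actual=$(hostname)
if [ "$actual" = "$inventory_name" ]; then
  echo "OK: hostname matches the inventory"
else
  echo "MISMATCH: hostname is '$actual' but the inventory says '$inventory_name'"
fi
```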
Diagnostic Steps
Overriding detected ip addresses and hostnames
- Some deployments will require that the user override the detected hostnames and ip addresses for the hosts. To see what the default values will be you can run the openshift_facts playbook:
# ansible-playbook playbooks/byo/openshift_facts.yml
The output will be similar to:
ok: [10.3.9.45] => {
"result": {
"ansible_facts": {
"openshift": {
"common": {
"hostname": "example-osev3-ansible-005dcfa6-27c6-463d-9b95-ef059579befd.os1.phx2.redhat.com",
"ip": "172.16.4.79",
"public_hostname": "example-osev3-ansible-005dcfa6-27c6-463d-9b95-ef059579befd.os1.phx2.redhat.com",
"public_ip": "10.3.9.45",
"use_openshift_sdn": true
},
"provider": {
... <snip> ...
}
}
},
"changed": false,
"invocation": {
"module_args": "",
"module_name": "openshift_facts"
}
}
}
ok: [10.3.9.42] => {
"result": {
"ansible_facts": {
"openshift": {
"common": {
"hostname": "example-osev3-ansible-c6ae8cdc-ba0b-4a81-bb37-14549893f9d3.os1.phx2.redhat.com",
"ip": "172.16.4.75",
"public_hostname": "example-osev3-ansible-c6ae8cdc-ba0b-4a81-bb37-14549893f9d3.os1.phx2.redhat.com",
"public_ip": "10.3.9.42",
"use_openshift_sdn": true
},
"provider": {
...<snip>...
}
}
},
"changed": false,
"invocation": {
"module_args": "",
"module_name": "openshift_facts"
}
}
}
ok: [10.3.9.36] => {
"result": {
"ansible_facts": {
"openshift": {
"common": {
"hostname": "example-osev3-ansible-bc39a3d3-cdd7-42fe-9c12-9fac9b0ec320.os1.phx2.redhat.com",
"ip": "172.16.4.73",
"public_hostname": "example-osev3-ansible-bc39a3d3-cdd7-42fe-9c12-9fac9b0ec320.os1.phx2.redhat.com",
"public_ip": "10.3.9.36",
"use_openshift_sdn": true
},
"provider": {
...<snip>...
}
}
},
"changed": false,
"invocation": {
"module_args": "",
"module_name": "openshift_facts"
}
}
}
- Now, verify that the detected common settings are what you expect them to be (if not, they can be overridden).
- hostname - Should resolve to the internal ip from the instances themselves. openshift_hostname will override.
- ip - Should be the internal ip of the instance. openshift_ip will override.
- public_hostname - Should resolve to the external ip from hosts outside of the cloud provider. openshift_public_hostname will override.
- public_ip - Should be the externally accessible ip associated with the instance. openshift_public_ip will override.
- use_openshift_sdn - Should be true unless the cloud is GCE. openshift_use_openshift_sdn will override.
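The hostname checks above can be scripted with getent. The sketch below uses the example hostnames from this article's inventory; on a real cluster, substitute your own names and compare the printed addresses with the internal IPs you expect:

```shell
# Hostnames from the example inventory - substitute your own.
for host in ose-master.example.com ose-node1.example.com ose-node2.example.com; do
  if getent hosts "$host" >/dev/null; then
    # Print the first resolved address for comparison with the expected internal ip.
    echo "$host resolves to: $(getent hosts "$host" | awk '{print $1}')"
  else
    echo "$host does not resolve - openshift_hostname / openshift_ip overrides may be needed"
  fi
done
```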
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.