Trouble installing OpenShift v3 on AWS and OpenStack

Solution Verified - Updated -

Environment

  • Red Hat OpenShift Enterprise 3.0.X

Issue

  • The Ansible install fails when installing on AWS or OpenStack:
TASK: [openshift_manage_node | Wait for Node Registration] ******************** 
2015-09-18 18:58:29,493 p=2233 u=root |  failed: [ose-master.example.com] => (item=ose-node2.example.com) => {"attempts": 10, "changed": true, "cmd": ["oc", "get", "node", "ose-node2.example.com"], "delta": "0:00:00.361069", "end": "2015-09-18 18:58:29.481295", "failed": true, "item": "ose-node2.example.com", "rc": 1, "start": "2015-09-18 18:58:29.120226", "warnings": []}
  • After the Ansible install, openshift-node.service logs errors such as:
Sep 20 16:07:53 ose-node2 openshift-node: I0920 16:07:53.671657    5339 kubelet.go:793] Unable to register ose-node2.example.com with the apiserver: Post https://ose-master.example.com:8443/api/v1/nodes: dial tcp 20.0.0.27:8443: i/o timeout
Sep 20 16:07:54 ose-node2 openshift-node: E0920 16:07:54.683793    5339 kubelet.go:1641] error getting node: node ose-node2.example.com not found
Sep 20 16:07:58 ose-node2 openshift-node: E0920 16:07:58.969617    5339 common.go:243] Could not find an allocated subnet for minion ose-node2.example.com: Get https://ose-master.example.com:8443/oapi/v1/hostsubnets/ose-node2.example.com: dial tcp 20.0.0.27:8443: i/o timeout. Waiting...
Sep 20 16:08:00 ose-node2 openshift-node: I0920 16:08:00.672089    5339 kubelet.go:1942] Recording NodeReady event message for node ose-node2.example.com
Sep 20 16:08:00 ose-node2 openshift-node: I0920 16:08:00.672145    5339 kubelet.go:790] Attempting to register node ose-node2.example.com
Sep 20 16:08:02 ose-node2 openshift-node: E0920 16:08:02.242390    5339 reflector.go:180] pkg/kubelet/kubelet.go:177: Failed to list *api.Service: Get https://ose-master.example.com:8443/api/v1/services: dial tcp 20.0.0.27:8443: i/o timeout
Sep 20 16:08:02 ose-node2 openshift-node: E0920 16:08:02.242397    5339 reflector.go:180] pkg/kubelet/kubelet.go:194: Failed to list *api.Node: Get https://ose-master.example.com:8443/api/v1/nodes?fieldSelector=metadata.name%3Dose-node2.example.com: dial tcp 20.0.0.27:8443: i/o timeout
Sep 20 16:08:02 ose-node2 openshift-node: E0920 16:08:02.242472    5339 reflector.go:180] pkg/kubelet/config/apiserver.go:43: Failed to list *api.Pod: Get https://ose-master.example.com:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dose-node2.example.com: dial tcp 20.0.0.27:8443: i/o timeout
Sep 20 16:08:04 ose-node2 openshift-node: E0920 16:08:04.685062    5339 kubelet.go:1641] error getting node: node ose-node2.example.com not found
Sep 20 16:08:14 ose-node2 openshift-node: E0920 16:08:14.686336    5339 kubelet.go:1641] error getting node: node ose-node2.example.com not found
Sep 20 16:08:22 ose-node2 openshift-node: E0920 16:08:22.049729    5339 event.go:194] Unable to write event: 'Post https://ose-master.example.com:8443/api/v1/namespaces/default/events: dial tcp 20.0.0.27:8443: i/o timeout' (may retry after sleeping)
Sep 20 16:08:24 ose-node2 openshift-node: E0920 16:08:24.687616    5339 kubelet.go:1641] error getting node: node ose-node2.example.com not found
  • During the Ansible install, the hostname changes to an unwanted value

  • The Ansible install uses the wrong hostnames and IP addresses

Resolution

  • When installing OpenShift v3 on AWS or OpenStack, make sure the security groups allow access from outside to the following ports:

    • TCP/22 - ssh
    • TCP/80 - Web Apps
    • TCP/443 - Web Apps (https)
    • UDP/4789 - SDN / VXLAN
    • TCP/8443 - OpenShift Console
    • TCP/10250 - kubelet
  • Hostnames and IP addresses should be defined for the master and the nodes in the /etc/ansible/hosts file, using the following variables:

    • openshift_ip
    • openshift_public_ip
    • openshift_hostname
    • openshift_public_hostname

Example /etc/ansible/hosts file

# Ansible 'hosts' file for basic 1 master 2 node install
[OSEv3:children]
masters
nodes

[OSEv3:vars]
ansible_ssh_user=root
#ansible_become=true
deployment_type=enterprise
product_type=openshift

[masters]
ose-master.example.com  openshift_ip=16.0.0.240 openshift_public_ip=54.175.210.46 openshift_hostname=ose-master.example.com openshift_public_hostname=ose-master.example.com

[nodes]
ose-node2.example.com         openshift_ip=16.0.0.120 openshift_public_ip=54.210.35.7 openshift_hostname=ose-node2.example.com openshift_public_hostname=ose-node2.example.com
ose-node1.example.com         openshift_ip=16.0.0.179  openshift_public_ip=54.173.246.71 openshift_hostname=ose-node1.example.com openshift_public_hostname=ose-node1.example.com
ose-master.example.com        openshift_ip=16.0.0.240 openshift_public_ip=54.175.210.46 openshift_hostname=ose-master.example.com openshift_public_hostname=ose-master.example.com openshift_scheduleable=False
  • On each host, disable cloud-init (optional):
    # systemctl stop cloud-init-local.service cloud-init.service cloud-final.service cloud-config.service
    # systemctl disable cloud-init-local.service cloud-init.service cloud-final.service cloud-config.service
    # systemctl mask cloud-init-local.service cloud-init.service cloud-final.service cloud-config.service
  • If failures still occur, open a case with Red Hat Support and provide debug logs from your Ansible playbook run:
# ansible-playbook ~/openshift-ansible/playbooks/byo/config.yml -vvv | tee ansible.logs
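
The security group ports listed in this section can be spot-checked from one host toward another before rerunning the installer. The following is a minimal sketch and not part of the original solution: the function names check_tcp_port and report_ports are hypothetical, it assumes bash and the coreutils timeout command are installed, and it only covers TCP (UDP/4789 cannot be verified with a TCP connect).

```shell
#!/bin/bash
# Sketch: probe TCP ports on a remote host.
# Usage (host and ports are examples from this article):
#   ./check_ports.sh ose-master.example.com 22 8443

# check_tcp_port HOST PORT
# Returns 0 if a TCP connection can be opened within 3 seconds, non-zero otherwise.
check_tcp_port() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# report_ports HOST PORT...
# Prints one result line per port.
report_ports() {
    host="$1"
    shift
    for port in "$@"; do
        if check_tcp_port "$host" "$port"; then
            echo "$host:$port reachable"
        else
            echo "$host:$port NOT reachable"
        fi
    done
}

# Run the probe only when a host and at least one port are given.
if [ "$#" -ge 2 ]; then
    report_ports "$@"
fi
```

A port reported as NOT reachable may be blocked by a security group, by a host firewall, or simply have no service listening yet, so the result is a hint about where to look rather than a diagnosis.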

Note

Starting in 3.10, openshift_ip and openshift_hostname are deprecated. It is important to label the node in the hosts file with the same name that is used in the cloud provider's portal and that the hostname command reports when run on the host. See this related solution for more.
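
The comparison described in this note can be sketched as a small shell function. This is an illustrative helper only: the name check_node_name is hypothetical, and it covers just the hostname command side of the check (the name shown in the cloud provider's portal still has to be compared by hand).

```shell
# check_node_name INVENTORY_NAME [ACTUAL_NAME]
# Compares the name used for a node in the hosts file with the host's own name.
# ACTUAL_NAME defaults to the output of the hostname command on the current host.
check_node_name() {
    inventory_name="$1"
    actual_name="${2:-$(hostname)}"
    if [ "$inventory_name" = "$actual_name" ]; then
        echo "match: $inventory_name"
    else
        echo "mismatch: inventory has '$inventory_name', host reports '$actual_name'"
        return 1
    fi
}
```

Run on a node, for example, as `check_node_name ose-node1.example.com`, it returns non-zero when the inventory entry and the local hostname differ.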

Diagnostic Steps

Overriding detected ip addresses and hostnames

  • Some deployments require overriding the detected hostnames and IP addresses for the hosts. To see what the default values will be, run the openshift_facts playbook:
# ansible-playbook playbooks/byo/openshift_facts.yml

The output will be similar to:

ok: [10.3.9.45] => {
    "result": {
        "ansible_facts": {
            "openshift": {
                "common": {
                    "hostname": "example-osev3-ansible-005dcfa6-27c6-463d-9b95-ef059579befd.os1.phx2.redhat.com",
                    "ip": "172.16.4.79",
                    "public_hostname": "example-osev3-ansible-005dcfa6-27c6-463d-9b95-ef059579befd.os1.phx2.redhat.com",
                    "public_ip": "10.3.9.45",
                    "use_openshift_sdn": true
                },
                "provider": {
                    ...<snip>...
                }
            }
        },
        "changed": false,
        "invocation": {
            "module_args": "",
            "module_name": "openshift_facts"
        }
    }
}
ok: [10.3.9.42] => {
    "result": {
        "ansible_facts": {
            "openshift": {
                "common": {
                    "hostname": "example-osev3-ansible-c6ae8cdc-ba0b-4a81-bb37-14549893f9d3.os1.phx2.redhat.com",
                    "ip": "172.16.4.75",
                    "public_hostname": "example-osev3-ansible-c6ae8cdc-ba0b-4a81-bb37-14549893f9d3.os1.phx2.redhat.com",
                    "public_ip": "10.3.9.42",
                    "use_openshift_sdn": true
                },
                "provider": {
                    ...<snip>...
                }
            }
        },
        "changed": false,
        "invocation": {
            "module_args": "",
            "module_name": "openshift_facts"
        }
    }
}
ok: [10.3.9.36] => {
    "result": {
        "ansible_facts": {
            "openshift": {
                "common": {
                    "hostname": "example-osev3-ansible-bc39a3d3-cdd7-42fe-9c12-9fac9b0ec320.os1.phx2.redhat.com",
                    "ip": "172.16.4.73",
                    "public_hostname": "example-osev3-ansible-bc39a3d3-cdd7-42fe-9c12-9fac9b0ec320.os1.phx2.redhat.com",
                    "public_ip": "10.3.9.36",
                    "use_openshift_sdn": true
                },
                "provider": {
                    ...<snip>...
                }
            }
        },
        "changed": false,
        "invocation": {
            "module_args": "",
            "module_name": "openshift_facts"
        }
    }
}
  • Next, check that the detected common settings are what you expect. Any value that is not can be overridden with the variable named below.

    • hostname
      • Should resolve to the internal IP from the instances themselves.
        openshift_hostname will override.
    • ip
      • Should be the internal IP of the instance.
        openshift_ip will override.
    • public_hostname
      • Should resolve to the external IP from hosts outside of the cloud provider.
        openshift_public_hostname will override.
    • public_ip
      • Should be the externally accessible IP associated with the instance.
        openshift_public_ip will override.
    • use_openshift_sdn
      • Should be true unless the cloud is GCE.
        openshift_use_openshift_sdn will override.
