Chapter 9. Installation Failure

If an OpenShift Container Platform installation fails, use the following sections to diagnose the failure and find its source. Note that the resource group can be deleted and the installation repeated by running the script again.

9.1. Diagnostics and Control of OpenShift Container Platform on Microsoft Azure

The OpenShift Container Platform installation can be controlled from the bastion host. This is a separate virtual machine that provides access to all VMs in the resource group that defines the OpenShift Container Platform installation on Microsoft Azure.

As an example, assuming the resource group was named ocpxenon1000 at creation, with a username of ocpadmin in the westus region:

$ ssh ocpadmin@ocpxenon1000b.westus.cloudapp.azure.com
Last login: Sat Jan 21 04:32:47 2017 from 103.252.201.32
[ocpadmin@bastion ~]$

9.2. Logging of Installation

The automation collects logs for the various stages of the installation. All the logs are stored on the bastion host. Assuming ocpadmin was chosen as the admin user when creating the installation, the following logs are available in that user's home directory (/home/ocpadmin):

Table 9.1. Installation logs

File name                      Content
ansible-preinstall-ping.out    Checks connectivity of all hosts
openshift-install.out          Main OpenShift Container Platform installation
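
A quick way to triage these logs is to search them for failure markers. The following is a minimal sketch, not part of the reference automation: the function name scan_log is hypothetical, and the search patterns assume the logs contain standard Ansible output (lines starting with "fatal:", the word "FAILED", or a play recap with non-zero "unreachable=" or "failed=" counters).

```shell
# Sketch: print failure lines (with line numbers) from an Ansible-style log.
# The patterns below are assumptions about the log format, not taken from
# the installation scripts themselves.
scan_log() {
    log_file="$1"
    if [ ! -f "$log_file" ]; then
        echo "missing: $log_file"
        return 1
    fi
    # grep exits non-zero when nothing matches; report that explicitly.
    grep -n -E 'fatal:|FAILED|unreachable=[1-9]|failed=[1-9]' "$log_file" \
        || echo "no failures found in $log_file"
}
```

On the bastion host this could be called as scan_log /home/ocpadmin/ansible-preinstall-ping.out first, since connectivity problems tend to explain later failures, and then as scan_log /home/ocpadmin/openshift-install.out.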

9.3. Inventory

The Ansible inventory is generated automatically by the bastion.sh script at the first boot of the bastion host. It is stored on the bastion host at the default location (/etc/ansible/hosts) and can be used to run update scripts. To run updates or to diagnose failures, first ssh to the bastion host on Microsoft Azure.

$ sudo cat /etc/ansible/hosts
[OSEv3:children]
masters
etcd
nodes
misc

[OSEv3:vars]
azure_resource_group=ocpxenon1000
rhn_pool_id=8a85f98156724eaa0156728452003452
openshift_install_examples=true
deployment_type=openshift-enterprise
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]

# default selectors for router and registry services
openshift_router_selector='region=infra'
openshift_registry_selector='region=infra'

ansible_become=yes
ansible_ssh_user=ocpadmin
remote_user=ocpadmin

openshift_master_default_subdomain=52.163.224.147.xip.io
openshift_use_dnsmasq=False
openshift_public_hostname=ocpxenon1000.westus.cloudapp.azure.com

openshift_master_cluster_method=native
openshift_master_cluster_hostname=ocpxenon1000.westus.cloudapp.azure.com
openshift_master_cluster_public_hostname=ocpxenon1000.westus.cloudapp.azure.com

# Enable cockpit
osm_use_cockpit=true

# Set cockpit plugins
osm_cockpit_plugins=['cockpit-kubernetes']

# default storage plugin dependencies to install, by default the ceph and
# glusterfs plugin dependencies will be installed, if available.
osn_storage_plugin_deps=['Azure VHD']

[masters]
master1 openshift_hostname=master1 openshift_node_labels="{'role': 'master'}"
master2 openshift_hostname=master2 openshift_node_labels="{'role': 'master'}"
master3 openshift_hostname=master3 openshift_node_labels="{'role': 'master'}"

[etcd]
master1
master2
master3

[nodes]
master1 openshift_node_labels="{'region':'master','zone':'default'}" openshift_schedulable=false
master2 openshift_node_labels="{'region':'master','zone':'default'}" openshift_schedulable=false
master3 openshift_node_labels="{'region':'master','zone':'default'}" openshift_schedulable=false
node[01:03] openshift_node_labels="{'role': 'app', 'zone': 'default'}"
infranode1 openshift_hostname=infranode1 openshift_node_labels="{'role': 'infra', 'zone': 'default'}"
infranode2 openshift_hostname=infranode2 openshift_node_labels="{'role': 'infra', 'zone': 'default'}"
infranode3 openshift_hostname=infranode3 openshift_node_labels="{'role': 'infra', 'zone': 'default'}"
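
When diagnosing a failure it can help to confirm which hosts a group actually contains. The sketch below is an illustration only: inventory_group is a hypothetical helper that reads an INI-style inventory and prints the first field of each entry under the named group. It does not expand host ranges such as node[01:03] and assumes group headers have no trailing whitespace.

```shell
# Sketch: list the entries under [group] in an INI-style Ansible inventory.
# Host ranges (for example node[01:03]) are printed unexpanded.
inventory_group() {
    group="$1"
    inventory="${2:-/etc/ansible/hosts}"   # default inventory location
    awk -v section="[$group]" '
        # A line starting with "[" opens a new section.
        /^\[/ { in_section = ($0 == section); next }
        # Inside the wanted section, skip blank lines and comments.
        in_section && NF && $1 !~ /^#/ { print $1 }
    ' "$inventory"
}
```

For example, inventory_group masters would print master1, master2, and master3 for the inventory shown above (reading /etc/ansible/hosts may require sudo). Ansible itself can expand ranges with ansible nodes --list-hosts, and ansible all -m ping checks that every host in the inventory is reachable.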

9.4. Uninstalling and Deleting

The uninstall playbook removes OpenShift Container Platform related packages, etcd, and any certificates that were created during the failed install. To uninstall, run the following from the bastion host:

$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml

After the playbook completes, the administrator should unsubscribe each host to return its subscription to the available pool, and then delete the resource group in the Microsoft Azure portal, which deletes all of its resources.
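
The order of these steps matters, because the hosts must still exist when they are unsubscribed. The following sketch prints the sequence as a dry run. Several parts are assumptions rather than part of the reference automation: the helper names run and teardown are hypothetical, subscription-manager unregister is assumed to be an acceptable way to release each host's subscription, and az group delete (Azure CLI) is shown as an alternative to deleting the resource group in the portal.

```shell
# Sketch: teardown sequence. DRY_RUN defaults to 1, so commands are only
# printed; set DRY_RUN=0 to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

teardown() {
    resource_group="$1"
    # 1. Remove OpenShift packages, etcd, and certificates.
    run ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml
    # 2. Release subscriptions on every inventory host (assumption: hosts were
    #    registered with subscription-manager; the inventory sets ansible_become).
    run ansible all -m command -a "subscription-manager unregister"
    # 3. Delete the resource group (assumption: Azure CLI is installed and
    #    logged in; the portal achieves the same result).
    run az group delete --name "$resource_group"
}
```

Calling teardown ocpxenon1000 with the default settings prints the three commands without executing them, which allows the sequence to be reviewed before anything is removed.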

9.5. Manually Launching the Installation of OpenShift

The openshift-install.sh script, located on the bastion host at /home/user/openshift-install.sh, installs OpenShift Container Platform automatically. The script can be re-run to diagnose problems.

$ ./openshift-install.sh
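
When re-running the script for diagnosis, it can be useful to keep the output of each attempt. The following is a minimal sketch under stated assumptions: run_logged is a hypothetical wrapper, the rerun-<timestamp>.out file name is invented for illustration, and LOG_DIR is an optional variable that defaults to the home directory.

```shell
# Sketch: run a command, save its combined stdout/stderr to a timestamped
# file, and return the command's own exit status.
run_logged() {
    log_dir="${LOG_DIR:-$HOME}"
    log_file="$log_dir/rerun-$(date +%Y%m%d-%H%M%S).out"   # hypothetical name
    status=0
    "$@" > "$log_file" 2>&1 || status=$?
    echo "exit status $status, output saved to $log_file"
    return "$status"
}
```

From the directory that contains the script, run_logged ./openshift-install.sh would then leave one log file per attempt, so that consecutive runs can be compared.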

9.6. Gmail Notification

The bastion.sh script can optionally notify the user by email about the steps that have been completed during the installation. It creates a /root/setup_ssmtp.sh script, using the username and password provided in the ARM template, that configures an ssmtp MTA service. If the Gmail account exists, the user is notified periodically as steps finish.