Chapter 4. Operational Management
With the successful deployment of OpenShift, the following section demonstrates how to confirm proper functionality of the Red Hat OpenShift Container Platform.
4.1. Validate the Deployment
With the successful deployment of OpenShift, the following section demonstrates how to confirm proper functionality of the OpenShift environment. An Ansible playbook in the git repository deploys an application that tests the functionality of the masters, nodes, registry, and router. The playbook tests the deployment and cleans up any projects and pods created during the validation run.
The playbook will perform the following steps:
Environment Validation
- Validate the public OpenShift ELB address from the installation system
- Validate the public OpenShift ELB address from the master nodes
- Validate the internal OpenShift ELB address from the master nodes
- Validate the master local master address
- Validate the health of the ETCD cluster to ensure all ETCD nodes are healthy
- Create a project in OpenShift called validate
- Create an OpenShift Application
- Add a route for the Application
- Validate the URL returns a status code of 200 or healthy
- Delete the validation project
Ensure the URLs below and the tag variables match the variables used during deployment.
$ cd /home/<user>/git/openshift-ansible-contrib/reference-architecture/aws-ansible
$ ansible-playbook -i inventory/aws/hosts/ -e 'public_hosted_zone=sysdeseng.com wildcard_zone=apps.sysdeseng.com console_port=443 stack_name=dev' playbooks/validation.yaml
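The individual checks can also be performed manually. The commands below are a minimal sketch, assuming the public hosted zone and console port shown above; adjust the hostname and port to match the values used during deployment. A healthy environment resolves the ELB address and returns a 200 status code from the master health endpoint.
$ dig +short openshift-master.sysdeseng.com
$ curl -k -s -o /dev/null -w '%{http_code}\n' https://openshift-master.sysdeseng.com/healthz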
4.2. Gathering hostnames
With all of the steps that occur during the installation of OpenShift it is possible to lose track of the names of the instances in the recently deployed environment. One option to get these hostnames is to browse to the AWS EC2 dashboard and select Running Instances under Resources. Selecting Running Instances shows all instances currently running within EC2. To view only instances specific to the reference architecture deployment, filters can be used. Under Instances → Instances within EC2, click in the search field beside the magnifying glass. Select a Tag Key such as openshift-role and click All values. The filter shows all instances relating to the reference architecture deployment.
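The same information can be gathered from the command line. The example below is a sketch, assuming the AWS CLI is installed and configured with credentials for the account and that the openshift-role tag key mentioned above was applied during provisioning; adjust the region to match the deployment.
$ aws ec2 describe-instances --region us-east-1 --filters "Name=tag-key,Values=openshift-role" "Name=instance-state-name,Values=running" --query "Reservations[].Instances[].[PrivateDnsName,PublicDnsName]" --output table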
To facilitate the Operational Management Chapter, the following hostnames will be used.
- ose-master01.sysdeseng.com
- ose-master02.sysdeseng.com
- ose-master03.sysdeseng.com
- ose-infra-node01.sysdeseng.com
- ose-infra-node02.sysdeseng.com
- ose-infra-node03.sysdeseng.com
- ose-app-node01.sysdeseng.com
- ose-app-node02.sysdeseng.com
4.3. Running Diagnostics
Perform the following steps from the first master node.
To run diagnostics, SSH into the first master node (ose-master01.sysdeseng.com). Direct access is provided to the first master node because of the configuration of the local ~/.ssh/config file.
$ ssh ec2-user@ose-master01.sysdeseng.com
$ sudo -i
Connectivity to the first master node (ose-master01.sysdeseng.com) as the root user should have been established. Run the diagnostics that are included as part of the install.
# oadm diagnostics
... omitted ...
[Note] Summary of diagnostics execution (version v3.5.5.5):
[Note] Warnings seen: 8
The warnings will not cause issues in the environment
Based on the results of the diagnostics, actions can be taken to alleviate any issues.
4.4. Checking the Health of ETCD
This section focuses on the ETCD cluster. It describes the different commands to ensure the cluster is healthy. The internal DNS names of the nodes running ETCD must be used.
SSH into the first master node (ose-master01.sysdeseng.com). Using the output of the hostname command, issue the etcdctl command to confirm that the cluster is healthy.
$ ssh ec2-user@ose-master01.sysdeseng.com
$ sudo -i
# hostname
ip-10-20-1-106.ec2.internal
# etcdctl -C https://ip-10-20-1-106.ec2.internal:2379 --ca-file /etc/etcd/ca.crt --cert-file=/etc/origin/master/master.etcd-client.crt --key-file=/etc/origin/master/master.etcd-client.key cluster-health
member 82c895b7b0de4330 is healthy: got healthy result from https://10.20.1.106:2379
member c8e7ac98bb93fe8c is healthy: got healthy result from https://10.20.3.74:2379
member f7bbfc4285f239ba is healthy: got healthy result from https://10.20.2.157:2379
In this configuration the ETCD services are distributed among the OpenShift master nodes.
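In addition to cluster-health, the member list subcommand shows which nodes are registered in the cluster and which member is currently the leader. The command below is a sketch that reuses the endpoint and certificate paths from the example above.
# etcdctl -C https://ip-10-20-1-106.ec2.internal:2379 --ca-file /etc/etcd/ca.crt --cert-file=/etc/origin/master/master.etcd-client.crt --key-file=/etc/origin/master/master.etcd-client.key member list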
4.5. Default Node Selector
As explained in section 2.12.4, node labels are an important part of the OpenShift environment. By default in the reference architecture installation, the default node selector is set to "role=app" in /etc/origin/master/master-config.yaml on all of the master nodes. This configuration parameter is set by the OpenShift installation playbooks on all masters, and the master API service is restarted, which is required when making any changes to the master configuration.
SSH into the first master node (ose-master01.sysdeseng.com) to verify the defaultNodeSelector is defined.
# vi /etc/origin/master/master-config.yaml
...omitted...
projectConfig:
defaultNodeSelector: "role=app"
projectRequestMessage: ""
projectRequestTemplate: ""
...omitted...
If making any changes to the master configuration then the master API service must be restarted or the configuration change will not take place. Any changes and the subsequent restart must be done on all masters.
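For reference, restarting the master API and controllers services on a native HA deployment such as this one typically looks like the following. This is a sketch; the service names assume the atomic-openshift packages used by OpenShift Container Platform 3.5 multi-master installations, and the restart must be performed on every master.
# systemctl restart atomic-openshift-master-api
# systemctl restart atomic-openshift-master-controllers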
4.6. Management of Maximum Pod Size
Quotas are set on ephemeral volumes within pods to prohibit a pod from becoming too large and impacting the node. There are three places where sizing restrictions should be set. When persistent volume claims are not set, a pod can grow as large as the underlying filesystem allows. The required modifications are set using a combination of user-data and Ansible.
OpenShift Volume Quota
At launch time, user-data creates an XFS partition on the /dev/xvdc block device, adds an entry to fstab, and mounts the volume with the gquota option. If gquota is not set, the OpenShift node will not be able to start with the perFSGroup parameter defined below. This disk and configuration is done on the infrastructure and application nodes.
SSH into the first infrastructure node (ose-infra-node01.sysdeseng.com) to verify the entry exists within fstab.
# vi /etc/fstab
/dev/xvdc /var/lib/origin/openshift.local.volumes xfs gquota 0 0
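To confirm that quotas are actually enforced on the mounted volume, check the active mount options and the XFS quota state. The commands below are a minimal sketch using the mount point from the fstab entry above.
# mount | grep openshift.local.volumes
# xfs_quota -x -c 'state -g' /var/lib/origin/openshift.local.volumes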
Docker Storage Setup
The docker-storage-setup file is created at launch time by user-data. This file tells the Docker service to use /dev/xvdb and create the volume group of docker-vol. The extra Docker storage options ensure that a container can grow no larger than 3G. Docker storage setup is performed on all master, infrastructure, and application nodes.
SSH into the first infrastructure node (ose-infra-node01.sysdeseng.com) to verify /etc/sysconfig/docker-storage-setup matches the information below.
# vi /etc/sysconfig/docker-storage-setup
DEVS=/dev/xvdb
VG=docker-vol
DATA_SIZE=95%VG
EXTRA_DOCKER_STORAGE_OPTIONS="--storage-opt dm.basesize=3G"
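The resulting LVM thin pool can be confirmed on the node. The commands below are a sketch, assuming the docker-vol volume group name defined in the file above.
# vgs docker-vol
# lvs docker-vol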
OpenShift Emptydir Quota
The parameter openshift_node_local_quota_per_fsgroup in the file playbooks/openshift-setup.yaml configures perFSGroup on all nodes. The perFSGroup setting restricts the ephemeral emptyDir volume from growing larger than 512Mi. This emptyDir quota is applied to the master, infrastructure, and application nodes.
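The setting can be confirmed in the playbook as a quick sanity check. The command below is a sketch, assuming the repository was cloned to the path used earlier in this reference architecture.
$ grep openshift_node_local_quota_per_fsgroup ~/git/openshift-ansible-contrib/reference-architecture/aws-ansible/playbooks/openshift-setup.yaml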
SSH into the first infrastructure node (ose-infra-node01.sysdeseng.com) to verify /etc/origin/node/node-config.yml matches the information below.
# vi /etc/origin/node/node-config.yml
...omitted...
volumeConfig:
localQuota:
perFSGroup: 512Mi
4.7. Yum Repositories
In section 2.3 Required Channels the specific repositories for a successful OpenShift installation were defined. All systems except for the bastion host should have the same subscriptions. To verify that subscriptions match those defined in Required Channels, perform the following. The repositories below are enabled by the rhsm-repos playbook during the installation. The installation will be unsuccessful if the repositories are missing from the system.
# yum repolist
Loaded plugins: amazon-id, rhui-lb, search-disabled-repos, subscription-manager
repo id repo name status
rhel-7-server-extras-rpms/x86_64 Red Hat Enterprise Linux 7 Server - Extras (RPMs) 249
rhel-7-fast-datapath-rpms/7Server/x86_64 Red Hat Enterprise Linux Fast Datapath (RHEL 7 Server) (RPMs) 27
rhel-7-server-ose-3.5-rpms/x86_64 Red Hat OpenShift Container Platform 3.5 (RPMs) 404+10
rhel-7-server-rpms/7Server/x86_64 Red Hat Enterprise Linux 7 Server (RPMs) 11,088
!rhui-REGION-client-config-server-7/x86_64 Red Hat Update Infrastructure 2.0 Client Configuration Server 7 6
!rhui-REGION-rhel-server-releases/7Server/x86_6 Red Hat Enterprise Linux Server 7 (RPMs) 11,088
!rhui-REGION-rhel-server-rh-common/7Server/x86_ Red Hat Enterprise Linux Server 7 RH Common (RPMs) 196
repolist: 23,196
All rhui repositories are disabled and only those repositories defined in the Ansible role rhsm-repos are enabled.
4.8. Console Access
This section will cover logging into the OpenShift Container Platform management console via the GUI and the CLI. After logging in via one of these methods applications can then be deployed and managed.
4.8.1. Log into GUI console and deploy an application
Perform the following steps from the local workstation.
Open a browser and access https://openshift-master.sysdeseng.com/console. When logging into the OpenShift web interface the first time, the page will redirect and prompt for GitHub credentials. Log into GitHub using an account that is a member of the Organization specified during the install. Next, GitHub will prompt to grant access to authorize the login. If GitHub access is not granted, the account will not be able to log in to the OpenShift web console.
To deploy an application, click on the New Project button. Provide a Name and click Create. Next, deploy the jenkins-ephemeral instant app by clicking the corresponding box. Accept the defaults and click Create. Instructions along with a URL will be provided for how to access the application on the next screen. Click Continue to Overview and bring up the management page for the application. Click on the link provided and access the application to confirm functionality.
4.8.2. Log into CLI and Deploy an Application
Perform the following steps from your local workstation.
Install the oc client by visiting the public URL of the OpenShift deployment. For example, https://openshift-master.sysdeseng.com/console/command-line and click latest release. When directed to https://access.redhat.com, log in with valid Red Hat customer credentials and download the client relevant to the current workstation. Follow the instructions located on the production documentation site for getting started with the CLI.
A token is required to log in using GitHub OAuth and OpenShift. The token is presented on the https://openshift-master.sysdeseng.com/console/command-line page. Click the click to show token hyperlink and perform the following on the workstation on which the oc client was installed.
$ oc login https://openshift-master.sysdeseng.com --token=fEAjn7LnZE6v5SOocCSRVmUWGBNIIEKbjD9h-Fv7p09
After the oc client is configured, create a new project and deploy an application.
$ oc new-project test-app
$ oc new-app https://github.com/openshift/cakephp-ex.git --name=php
--> Found image 2997627 (7 days old) in image stream "php" in project "openshift" under tag "5.6" for "php"
    Apache 2.4 with PHP 5.6
    -----------------------
    Platform for building and running PHP 5.6 applications
    Tags: builder, php, php56, rh-php56
    * The source repository appears to match: php
    * A source build using source code from https://github.com/openshift/cakephp-ex.git will be created
      * The resulting image will be pushed to image stream "php:latest"
    * This image will be deployed in deployment config "php"
    * Port 8080/tcp will be load balanced by service "php"
      * Other containers can access this service through the hostname "php"
--> Creating resources with label app=php ...
    imagestream "php" created
    buildconfig "php" created
    deploymentconfig "php" created
    service "php" created
--> Success
    Build scheduled, use 'oc logs -f bc/php' to track its progress.
    Run 'oc status' to view your app.
$ oc expose service php
route "php" exposed
Display the status of the application.
$ oc status
In project test-app on server https://openshift-master.sysdeseng.com
http://test-app.apps.sysdeseng.com to pod port 8080-tcp (svc/php)
dc/php deploys istag/php:latest <- bc/php builds https://github.com/openshift/cakephp-ex.git with openshift/php:5.6
deployment #1 deployed about a minute ago - 1 pod
Access the application by browsing to the URL provided by oc status. The CakePHP application should now be visible.
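Connectivity can also be verified from the command line. The command below is a sketch using the route shown in the oc status output above; a working application returns a 200 status code.
$ curl -s -o /dev/null -w '%{http_code}\n' http://test-app.apps.sysdeseng.com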
4.9. Explore the Environment
4.9.1. List Nodes and Set Permissions
If you try to run the following command, it should fail.
# oc get nodes --show-labels
Error from server: User "sysdes-admin" cannot list all nodes in the cluster
The command fails because the permissions for that user are incorrect. Get the username and configure the permissions.
$ oc whoami
Once the username has been established, log back into a master node and enable the appropriate permissions for your user. Perform the following step from the first master (ose-master01.sysdeseng.com).
# oadm policy add-cluster-role-to-user cluster-admin sysdesadmin
Attempt to list the nodes again and show the labels.
# oc get nodes --show-labels
NAME STATUS AGE
ip-10-30-1-164.ec2.internal Ready 1d
ip-10-30-1-231.ec2.internal Ready 1d
ip-10-30-1-251.ec2.internal Ready,SchedulingDisabled 1d
ip-10-30-2-142.ec2.internal Ready 1d
ip-10-30-2-157.ec2.internal Ready,SchedulingDisabled 1d
ip-10-30-2-97.ec2.internal Ready 1d
ip-10-30-3-74.ec2.internal Ready,SchedulingDisabled 1d
4.9.2. List Router and Registry
List the router and registry by changing to the default project.
If the OpenShift account configured on the workstation has cluster-admin privileges perform the following. If the account does not have this privilege ssh to one of the OpenShift masters and perform the steps.
# oc project default
# oc get all
NAME                  REVISION   DESIRED   CURRENT   TRIGGERED BY
dc/docker-registry    1          3         3         config
dc/router             1          3         3         config
NAME                   DESIRED   CURRENT   AGE
rc/docker-registry-1   3         3         10m
rc/router-1            3         3         10m
NAME                  CLUSTER-IP      EXTERNAL-IP   PORT(S)                   AGE
svc/docker-registry   172.30.243.63   <none>        5000/TCP                  10m
svc/kubernetes        172.30.0.1      <none>        443/TCP,53/UDP,53/TCP     20m
svc/router            172.30.224.41   <none>        80/TCP,443/TCP,1936/TCP   10m
NAME                         READY     STATUS    RESTARTS   AGE
po/docker-registry-1-2a1ho   1/1       Running   0          8m
po/docker-registry-1-krpix   1/1       Running   0          8m
po/router-1-1g84e            1/1       Running   0          8m
po/router-1-t84cy            1/1       Running   0          8m
Observe the output of oc get all
4.9.3. Explore the Registry
The OpenShift Ansible playbooks configure three infrastructure nodes that have three registries running. In order to understand the configuration and mapping process of the registry pods, the command 'oc describe' is used. oc describe details how registries are configured and mapped to the Amazon S3 buckets for storage. Using oc describe should help explain how HA works in this environment.
If the OpenShift account configured on the workstation has cluster-admin privileges perform the following. If the account does not have this privilege ssh to one of the OpenShift masters and perform the steps.
$ oc describe svc/docker-registry
Name: docker-registry
Namespace: default
Labels: docker-registry=default
Selector: docker-registry=default
Type: ClusterIP
IP: 172.30.110.31
Port: 5000-tcp 5000/TCP
Endpoints: 172.16.4.2:5000,172.16.4.3:5000
Session Affinity: ClientIP
No events.
Notice that the registry has two endpoints listed. Each of those endpoints represents a container. The ClusterIP listed is the actual ingress point for the registries.
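The endpoints behind the service can also be listed directly, which is a quick way to confirm how many registry pods are currently serving traffic. The command below is a sketch and assumes the default project used above.
$ oc get endpoints docker-registry -n default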
The oc client provides functionality similar to the docker command. To find out more information about the registry storage, perform the following.
# oc get pods
NAME READY STATUS RESTARTS AGE
docker-registry-2-8b7c6 1/1 Running 0 2h
docker-registry-2-drhgz 1/1 Running 0 2h
docker-registry-2-2s2ca 1/1 Running 0 2h
# oc exec docker-registry-2-8b7c6 cat /etc/registry/config.yml
version: 0.1
log:
level: debug
http:
addr: :5000
storage:
cache:
layerinfo: inmemory
s3:
accesskey: "AKIAJZO3LDPPKZFORUQQ"
secretkey: "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"
region: us-east-1
bucket: "1476274760-openshift-docker-registry"
encrypt: true
secure: true
v4auth: true
rootdirectory: /registry
auth:
openshift:
realm: openshift
middleware:
repository:
- name: openshift
Observe the S3 stanza. Confirm the bucket name is listed, and access the AWS console. Click on the S3 service and locate the bucket. The bucket should contain content. Confirm that the same bucket is mounted to the other registry via the same steps.
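If the AWS CLI is configured with credentials that can read the bucket, the registry contents can also be confirmed from the command line. The command below is a sketch using the bucket name from the configuration above.
$ aws s3 ls s3://1476274760-openshift-docker-registry --recursive --summarize | tail -2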
4.9.4. Explore Docker Storage
This section will explore the Docker storage on an infrastructure node.
The example below can be performed on any node, but for this example the infrastructure node (ose-infra-node01.sysdeseng.com) is used.
The output below, showing the devicemapper storage driver with the pool name docker--vol-docker--pool, confirms that Docker storage is not using a loopback device.
$ docker info
Containers: 2
Running: 2
Paused: 0
Stopped: 0
Images: 4
Server Version: 1.10.3
Storage Driver: devicemapper
Pool Name: docker--vol-docker--pool
Pool Blocksize: 524.3 kB
Base Device Size: 3.221 GB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 1.221 GB
Data Space Total: 25.5 GB
Data Space Available: 24.28 GB
Metadata Space Used: 307.2 kB
Metadata Space Total: 29.36 MB
Metadata Space Available: 29.05 MB
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.107-RHEL7 (2016-06-09)
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
Volume: local
Network: bridge null host
Authorization: rhel-push-plugin
Kernel Version: 3.10.0-327.10.1.el7.x86_64
Operating System: Employee SKU
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 2
Total Memory: 7.389 GiB
Name: ip-10-20-3-46.ec2.internal
ID: XDCD:7NAA:N2S5:AMYW:EF33:P2WM:NF5M:XOLN:JHAD:SIHC:IZXP:MOT3
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Registries: registry.access.redhat.com (secure), docker.io (secure)
Verify 3 disks are attached to the instance. The disk /dev/xvda is used for the OS, /dev/xvdb is used for docker storage, and /dev/xvdc is used for emptyDir storage for containers that do not use a persistent volume.
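As a quick check before running fdisk, lsblk summarizes the attached block devices and their mount points. This is a minimal sketch; the device names follow the layout described above. The full partition and device-mapper layout is then shown by fdisk below.
$ lsblk -o NAME,SIZE,TYPE,MOUNTPOINT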
$ fdisk -l
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.
Disk /dev/xvda: 26.8 GB, 26843545600 bytes, 52428800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt
# Start End Size Type Name
1 2048 4095 1M BIOS boot parti
2 4096 52428766 25G Microsoft basic
Disk /dev/xvdc: 53.7 GB, 53687091200 bytes, 104857600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/xvdb: 26.8 GB, 26843545600 bytes, 52428800 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/xvdb1 2048 52428799 26213376 8e Linux LVM
Disk /dev/mapper/docker--vol-docker--pool_tmeta: 29 MB, 29360128 bytes, 57344 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/docker--vol-docker--pool_tdata: 25.5 GB, 25497174016 bytes, 49799168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/docker--vol-docker--pool: 25.5 GB, 25497174016 bytes, 49799168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 131072 bytes / 524288 bytes
Disk /dev/mapper/docker-202:2-75507787-4a813770697f04b1a4e8f5cdaf29ff52073ea66b72a2fbe2546c469b479da9b5: 3221 MB, 3221225472 bytes, 6291456 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 131072 bytes / 524288 bytes
Disk /dev/mapper/docker-202:2-75507787-260bda602f4e740451c428af19bfec870a47270f446ddf7cb427eee52caafdf6: 3221 MB, 3221225472 bytes, 6291456 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 131072 bytes / 524288 bytes
4.9.5. Explore Security Groups
As mentioned earlier in the document several security groups have been created. The purpose of this section is to encourage exploration of the security groups that were created.
Perform the following steps from the AWS web console.
On the main AWS console, click on EC2. Next, on the left-hand navigation panel, select Security Groups. Click through each group and review both the Inbound and Outbound rules that were created as part of the infrastructure provisioning. For example, notice how the Bastion security group only allows SSH traffic inbound. That can be further restricted to a specific network or host if required. Next, take a look at the Master security group and explore all the Inbound and Outbound TCP and UDP rules and the networks from which traffic is allowed.
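The same rules can be reviewed from the command line. The example below is a sketch, assuming the AWS CLI is configured and that the bastion security group name contains the word bastion; adjust the filter to match the names created by the provisioning playbooks.
$ aws ec2 describe-security-groups --filters "Name=group-name,Values=*bastion*" --query "SecurityGroups[].{Name:GroupName,Inbound:IpPermissions}" --output json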
4.9.6. Explore the AWS Elastic Load Balancers
As mentioned earlier in the document several ELBs have been created. The purpose of this section is to encourage exploration of the ELBs that were created.
Perform the following steps from the AWS web console.
On the main AWS console, click on EC2. Next, on the left-hand navigation panel, select Load Balancers. Select the ose-master load balancer and on the Description page note the Port Configuration and how it is configured for port 443. That is for the OpenShift web console traffic. On the same tab, check the Availability Zones and note how those are Public subnets. Move to the Instances tab. There should be three master instances running with a Status of InService. Next, check the Health Check tab and the options that were configured. Further details of the configuration can be viewed by exploring the Ansible playbooks to see exactly what was configured. Finally, change to the ose-internal-master load balancer and compare the subnets. The subnets for the ose-internal-master are all private. They are private because that ELB is reserved for traffic coming from the OpenShift infrastructure to the master servers. This results in reduced charges from Amazon because the packets do not have to be processed by the public facing ELB.
4.9.7. Explore the AWS VPC
As mentioned earlier in the document a Virtual Private Cloud was created. The purpose of this section is to encourage exploration of the VPC that was created.
Perform the following steps from the AWS web console.
On the main Amazon Web Services console, click on VPC. Next, on the left-hand navigation panel, select Your VPCs. Select the VPC recently created and explore the Summary and Tags tabs. Next, on the left-hand navigation panel, explore the Subnets, Route Tables, Internet Gateways, DHCP Options Sets, NAT Gateways, Security Groups and Network ACLs. More detail about the configuration can be reviewed by exploring the Ansible playbooks to see exactly what was configured.
4.10. Testing Failure
In this section, reactions to failure are explored. After a successful install and some of the smoke tests noted above have been completed, failure testing is executed.
4.10.1. Generate a Master Outage
Perform the following steps from the AWS web console and the OpenShift public URL.
Log into the AWS console. On the dashboard, click on the EC2 web service and then click Instances. Locate your running ose-master02.sysdeseng.com instance, select it, right click and change the state to stopped.
Ensure the console can still be accessed by opening a browser and accessing openshift-master.sysdeseng.com. At this point, the cluster is in a degraded state because only 2/3 master nodes are running, but complete functionality remains.
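The degraded but functional state can also be confirmed from the command line by repeatedly requesting the console through the master ELB. The loop below is a sketch, assuming the public console hostname used earlier; each request should still return a 200 status code.
$ for i in 1 2 3; do curl -k -s -o /dev/null -w '%{http_code}\n' https://openshift-master.sysdeseng.com/console/; done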
4.10.2. Observe the Behavior of ETCD with a Failed Master Node
SSH into the first master node (ose-master01.sysdeseng.com). Using the output of the hostname command, issue the etcdctl command to check the status of the cluster.
$ ssh ec2-user@ose-master01.sysdeseng.com
$ sudo -i
# hostname
ip-10-20-1-106.ec2.internal
# etcdctl -C https://ip-10-20-1-106.ec2.internal:2379 --ca-file /etc/etcd/ca.crt --cert-file=/etc/origin/master/master.etcd-client.crt --key-file=/etc/origin/master/master.etcd-client.key cluster-health
failed to check the health of member 82c895b7b0de4330 on https://10.20.2.251:2379: Get https://10.20.1.251:2379/health: dial tcp 10.20.1.251:2379: i/o timeout
member 82c895b7b0de4330 is unreachable: [https://10.20.1.251:2379] are all unreachable
member c8e7ac98bb93fe8c is healthy: got healthy result from https://10.20.3.74:2379
member f7bbfc4285f239ba is healthy: got healthy result from https://10.20.1.106:2379
cluster is healthy
Notice how one member of the ETCD cluster is now unreachable. Restart ose-master02.sysdeseng.com by following the same steps in the AWS web console as noted above.
4.10.3. Generate an Infrastructure Node outage
This section shows what to expect when an infrastructure node fails or is brought down intentionally.
4.10.3.1. Confirm Application Accessibility
Perform the following steps from the browser on a local workstation.
Before bringing down an infrastructure node, check behavior and ensure things are working as expected. The goal of testing an infrastructure node outage is to see how the OpenShift routers and registries behave. Confirm the simple application deployed from before is still functional. If it is not, deploy a new version. Access the application to confirm connectivity. As a reminder, to find the required information and ensure the application is still running: list the projects, change to the project in which the application is deployed, get the status of the application (which includes the URL), and access the application via that URL.
$ oc get projects
NAME               DISPLAY NAME   STATUS
openshift                         Active
openshift-infra                   Active
ttester                           Active
test-app1                         Active
default                           Active
management-infra                  Active
$ oc project test-app1
Now using project "test-app1" on server "https://openshift-master.sysdeseng.com".
$ oc status
In project test-app1 on server https://openshift-master.sysdeseng.com
http://php-test-app1.apps.sysdeseng.com to pod port 8080-tcp (svc/php-prod)
  dc/php-prod deploys istag/php-prod:latest <-
    bc/php-prod builds https://github.com/openshift/cakephp-ex.git with openshift/php:5.6
    deployment #1 deployed 27 minutes ago - 1 pod
Open a browser and ensure the application is still accessible.
4.10.3.2. Confirm Registry Functionality
This section is another step to take before initiating the outage of the infrastructure node to ensure that the registry is functioning properly. The goal is to push to the OpenShift registry.
Perform the following steps from CLI on a local workstation and ensure that the oc client has been configured.
A token is needed to log in to the registry.
# oc whoami -t
feAeAgL139uFFF_72bcJlboTv7gi_bo373kf1byaAT8
Pull a new docker image for the purposes of test pushing.
# docker pull fedora/apache
# docker images
Capture the registry endpoint. The svc/docker-registry shows the endpoint.
# oc status
In project default on server https://internal-openshift-master.sysdeseng.com:443
https://docker-registry-default.apps.sysdeseng.com (passthrough) (svc/docker-registry)
dc/docker-registry deploys docker.io/openshift3/ose-docker-registry:v3.5.5.5
deployment #1 deployed 44 minutes ago - 3 pods
svc/kubernetes - 172.30.0.1 ports 443, 53->8053, 53->8053
https://registry-console-default.apps.sysdeseng.com (passthrough) (svc/registry-console)
dc/registry-console deploys registry.access.redhat.com/openshift3/registry-console:3.5
deployment #1 deployed 43 minutes ago - 1 pod
svc/router - 172.30.41.42 ports 80, 443, 1936
dc/router deploys docker.io/openshift3/ose-haproxy-router:v3.5.5.5
deployment #1 deployed 45 minutes ago - 3 pods
View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'.
Tag the docker image with the endpoint from the previous step.
# docker tag docker.io/fedora/apache 172.30.110.31:5000/openshift/prodapache
Check the images and ensure the newly tagged image is available.
# docker images
Issue a Docker login.
# docker login -u sysdesadmin -e sysdesadmin -p $(oc whoami -t) 172.30.110.31:5000
# oadm policy add-role-to-user admin sysdesadmin -n openshift
# oadm policy add-role-to-user system:registry sysdesadmin
# oadm policy add-role-to-user system:image-builder sysdesadmin
Push the image to the OpenShift registry now.
# docker push 172.30.110.222:5000/openshift/prodapache
The push refers to a repository [172.30.110.222:5000/openshift/prodapache]
389eb3601e55: Layer already exists
c56d9d429ea9: Layer already exists
2a6c028a91ff: Layer already exists
11284f349477: Layer already exists
6c992a0e818a: Layer already exists
latest: digest: sha256:ca66f8321243cce9c5dbab48dc79b7c31cf0e1d7e94984de61d37dfdac4e381f size: 6186
4.10.3.3. Get Location of Router and Registry
Perform the following steps from the CLI of a local workstation.
Change to the default OpenShift project and check the router and registry pod locations.
$ oc project default
Now using project "default" on server "https://openshift-master.sysdeseng.com".
$ oc get pods -o wide
NAME                      READY     STATUS    RESTARTS   AGE   IP           NODE
docker-registry-2-gmvdr   1/1       Running   1          21h   172.16.4.2   ip-10-30-1-17.ec2.internal
docker-registry-2-jueep   1/1       Running   0          7h    172.16.3.3   ip-10-30-2-208.ec2.internal
router-1-6y5td            1/1       Running   1          21h   172.16.4.4   ip-10-30-1-17.ec2.internal
router-1-rlcwj            1/1       Running   1          21h   172.16.3.5   ip-10-30-2-208.ec2.internal
4.10.3.4. Initiate the Failure and Confirm Functionality
Perform the following steps from the AWS web console and a browser.
Log into the AWS console. On the dashboard, click on the EC2 web service. Locate your running infra01 instance, select it, right click and change the state to stopped. Wait a minute or two for the registry and router pods to migrate over to the remaining infrastructure node. Check the registry locations and confirm that they are on the same node.
NAME                      READY     STATUS    RESTARTS   AGE   IP           NODE
docker-registry-2-gmvdr   1/1       Running   1          21h   172.16.3.6   ip-10-30-2-208.ec2.internal
docker-registry-2-jueep   1/1       Running   0          7h    172.16.3.3   ip-10-30-2-208.ec2.internal
router-1-6y5td            1/1       Running   1          21h   172.16.3.7   ip-10-30-2-208.ec2.internal
router-1-rlcwj            1/1       Running   1          21h   172.16.3.5   ip-10-30-2-208.ec2.internal
Follow the procedures above to ensure an image can still be pushed to the registry now that infra01 is down.
4.11. Updating the OpenShift Deployment
Playbooks are provided to upgrade the OpenShift deployment when minor releases occur.
4.11.1. Performing the Upgrade
From the workstation that was used to perform the installation of OpenShift on AWS, run the following to ensure that the newest openshift-ansible playbooks and roles are available and to perform the minor upgrade against the deployed environment.
Ensure the variables below are relevant to the deployed OpenShift environment. The variables that should be customized for the deployed OpenShift environment are stack_name, public_hosted_zone, console_port, region, and containerized.
4.11.1.1. Non-Containerized Upgrade
Use the following commands to perform the upgrade in a non-containerized environment.
$ yum update atomic-openshift-utils ansible
$ cd ~/git/openshift-ansible-contrib/reference-architecture/aws-ansible
$ ansible-playbook -i inventory/aws/hosts -e 'stack_name=openshift-infra public_hosted_zone=sysdeseng.com console_port=443 region=us-east-1' playbooks/openshift-minor-upgrade.yaml
4.11.1.2. Containerized Upgrade
Use the following commands to perform the upgrade in a containerized environment.
$ yum update atomic-openshift-utils ansible
$ cd ~/git/openshift-ansible-contrib/reference-architecture/aws-ansible
$ ansible-playbook -i inventory/aws/hosts -e 'stack_name=openshift-infra public_hosted_zone=sysdeseng.com console_port=443 region=us-east-1 containerized=true' playbooks/openshift-minor-upgrade.yaml
4.11.2. Upgrading and Restarting the OpenShift Environment (Optional)
The openshift-minor-upgrade.yaml playbook will not restart the instances after the update occurs. Restarting the nodes, including the masters, can be completed by adding the following line to the minor-update.yaml playbook.
$ cd ~/git/openshift-ansible-contrib/playbooks
$ vi minor-update.yaml
openshift_rolling_restart_mode: system
4.11.3. Specifying the OpenShift Version when Upgrading
The deployed OpenShift environment may not be the latest major version of OpenShift. The minor-update.yaml allows for a variable to be passed to perform an upgrade on previous versions. Below is an example of performing the upgrade on a 3.3 non-containerized environment.
$ yum update atomic-openshift-utils ansible
$ cd ~/git/openshift-ansible-contrib/reference-architecture/aws-ansible
$ ansible-playbook -i inventory/aws/hosts -e 'stack_name=openshift-infra public_hosted_zone=sysdeseng.com console_port=443 region=us-east-1 openshift_vers=v3_4' playbooks/openshift-minor-upgrade.yaml
