Chapter 2. Components and Configuration

This chapter describes the reference architecture environment deployed to provide a highly available OpenShift Container Platform environment on Microsoft Azure.

The image below provides a high-level representation of the components within this reference architecture. By using Microsoft Azure, resources are made highly available through a combination of VM placement in Azure Availability Sets, the Azure Load Balancer (ALB), and Azure VHD persistent volumes. Deployed instances are given specific roles to support OpenShift Container Platform:

  • The bastion host limits the external access to internal servers by ensuring that all SSH traffic passes through the bastion host.
  • The master instances host the OpenShift Container Platform master components such as etcd and the OpenShift Container Platform API.
  • The application node instances are for users to deploy their containers.
  • Infrastructure node instances are used for the OpenShift Container Platform infrastructure elements like the OpenShift Container Platform router and OpenShift Container Platform integrated registry.

Authentication is managed by the htpasswd identity provider, but OpenShift Container Platform can be configured to use any of the supported identity providers (including GitHub, Google, or LDAP). OpenShift Container Platform on Microsoft Azure uses a combination of premium and standard storage, which is used for the filesystem of the instances and for persistent storage in containers.

The network is configured to leverage two Azure Load Balancers:

  • The External load balancer gives access to the OpenShift Container Platform web console and API from outside the cluster
  • The Router load balancer provides application access from outside the cluster

The OpenShift Container Platform web console and API can be accessed directly via the automatically created DNS entry, while applications are accessed using the nip.io service, which provides a wildcard DNS A record that forwards traffic to the Router load balancer.
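To make the mechanism concrete, nip.io embeds the target IP address directly in the hostname, so any application route under it resolves to the Router load balancer without additional DNS records. A minimal sketch, using hypothetical values for the load balancer IP and route name:

```shell
# Hypothetical values: Router load balancer public IP and an application route name
ROUTER_IP=40.112.10.20
APP=myapp

# nip.io answers any query for *.<IP>.nip.io with the embedded IP,
# so this hostname already resolves to the Router load balancer
APP_FQDN="${APP}.${ROUTER_IP}.nip.io"
echo "${APP_FQDN}"
```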

Note

See the Microsoft Azure DNS section for more information about the DNS configuration

Figure: OSE on Azure

This reference architecture breaks down the deployment into three separate phases.

  • Phase 1: Provision the Virtual Machines on Microsoft Azure
  • Phase 2: Install OpenShift Container Platform on Microsoft Azure
  • Phase 3: Post deployment activities

For Phase 1, the provisioning of the environment is done using a series of Azure Resource Manager (ARM) templates provided in the openshift-ansible-contrib git repository. Once the infrastructure is deployed, as their last action the ARM templates run a bash script that starts Phase 2.
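As an illustration of how Phase 1 could be launched, the sketch below builds the Azure CLI command for deploying the ARM template; the resource group name is hypothetical, and the command is printed rather than executed since a live subscription is required:

```shell
# Hypothetical resource group name; the template URI points at the repository's ARM template
RESOURCE_GROUP=ocp-refarch
TEMPLATE_URI=https://raw.githubusercontent.com/openshift/openshift-ansible-contrib/master/reference-architecture/azure-ansible/azuredeploy.json

# Build and print the deployment command instead of running it
DEPLOY_CMD="az group deployment create --resource-group ${RESOURCE_GROUP} --template-uri ${TEMPLATE_URI} --parameters @azuredeploy.parameters.json"
echo "${DEPLOY_CMD}"
```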

Phase 2 is the installation of OpenShift Container Platform using the Ansible playbooks installed by the openshift-ansible-playbooks RPM package. It is driven by a set of bash scripts that set up the inventory, set parameters, and ensure all the needed playbooks are coordinated. As the last part of Phase 2, the router and registry are deployed.

The last phase, Phase 3, concludes the deployment and is performed manually. It consists of optionally configuring a custom DNS entry to point to the application load balancer (to avoid the default nip.io domain) and manually verifying the configuration. Verification is done by running tools such as oadm diagnostics and the systems engineering team's validation Ansible playbook.

Note

The scripts provided in the GitHub repository are not supported by Red Hat. They merely provide a mechanism that can be used to build out an OpenShift Container Platform environment.

2.1. Microsoft Azure Cloud Instance Details

Within this reference environment, the instances are deployed in a single Azure Region which can be selected when running the ARM template. Although the default region can be changed, the reference architecture deployment should only be used in regions with premium storage for performance reasons.

All VMs are created using the On-Demand Red Hat Enterprise Linux (RHEL) image, and the default size is Standard_DS4_v2 for masters and nodes and Standard_DS1_v2 for the bastion host. Instance sizes can be changed when the ARM template is run, which is covered in later chapters.

Note

For higher availability, multiple clusters should be created, and federation should be used. This architecture is emerging and will be described in a future reference architecture.

2.1.1. Microsoft Azure Cloud Instance Storage Details

Linux VMs in Microsoft Azure are created with two virtual disks attached by default: the first is the operating system disk, and the second is a temporary disk whose data persistence is not guaranteed and which is used by default to store a swap file created by the Azure Linux Agent.

As a best practice, instances deployed to run containers in OpenShift Container Platform include a dedicated disk (datadisk) configured to store the container images, as well as a dedicated disk configured to store the emptyDir volumes. The disk setup is provided in the ARM template of each virtual machine type in the git repository, such as master.json for OpenShift Container Platform master instances.

Data disks can be created with sizes up to 1023 GB, while the operating system disk and temporary disk sizes depend on the size of the virtual machine. For the default Standard_DS4_v2 size used in this reference architecture for masters and nodes, they are:

Table 2.1. Instance Storage details for masters and nodes by default

Type                  | Name | Mountpoint                              | Size   | Purpose
operating system disk | sda  | /boot & /                               | 32 GB  | Root filesystem
temporary disk        | sdb  | /mnt/resource                           | 128 GB | Temporary storage
data disk             | sdc  | /var/lib/origin/openshift.local.volumes | 128 GB | OpenShift Container Platform emptyDir volumes
data disk             | sdd  | none                                    | 128 GB | Docker images storage

The following is sample output from an OpenShift Container Platform master virtual machine deployed using this reference architecture, where the mountpoints as well as the disks can be seen as described:

$ lsblk
NAME                              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
fd0                                 2:0    1    4K  0 disk
sda                                 8:0    0   32G  0 disk
├─sda1                              8:1    0  500M  0 part /boot
└─sda2                              8:2    0 31,5G  0 part /
sdb                                 8:16   0   28G  0 disk
└─sdb1                              8:17   0   28G  0 part /mnt/resource
sdc                                 8:32   0  128G  0 disk
└─sdc1                              8:33   0  128G  0 part /var/lib/origin/openshift.local.volumes
sdd                                 8:48   0  128G  0 disk
└─sdd1                              8:49   0  128G  0 part
  ├─docker--vg-docker--pool_tmeta 253:0    0  132M  0 lvm
  │ └─docker--vg-docker--pool     253:2    0   51G  0 lvm
  └─docker--vg-docker--pool_tdata 253:1    0   51G  0 lvm
    └─docker--vg-docker--pool     253:2    0   51G  0 lvm
sr0                                11:0    1  1,1M  0 rom
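The docker--vg thin pool on sdd shown above is created by docker-storage-setup. A hypothetical /etc/sysconfig/docker-storage-setup matching this layout would contain (assumed contents, shown for illustration):

```shell
# Hypothetical /etc/sysconfig/docker-storage-setup matching the lsblk output above:
# dedicate /dev/sdd to a volume group named docker-vg for the thin pool
DEVS=/dev/sdd
VG=docker-vg
```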
Tip

As a best practice, swap is disabled automatically by the git repository installation scripts on nodes where pods will run

Note

For more detail about the emptyDir and container image storage, see the Management of Maximum Pod Size section

The bastion host only has the default operating system disk and temporary disk, which for the Standard_DS1_v2 virtual machine size are:

Table 2.2. Instance Storage details for bastion by default

Type                  | Name | Mountpoint    | Size   | Purpose
operating system disk | sda  | /boot & /     | 32 GB  | Root filesystem
temporary disk        | sdb  | /mnt/resource | 128 GB | Temporary storage

All the disks created by this reference architecture for the virtual machines use Azure Premium Storage for performance reasons (high throughput and IOPS).

Note

For more information, see About disks and VHDs for Azure Linux VMs

2.2. Microsoft Azure Load Balancer Details

Two Azure Load Balancers (ALB) are used in this reference environment. The table below describes each ALB, its DNS name, the instances to which it is attached, and the port monitored by the load balancer to determine whether an instance is in or out of service.

Table 2.3. Microsoft Azure Load Balancer

ALB                    | DNS name                                        | Assigned Instances | Port
External load balancer | <resourcegroupname>.<region>.cloudapp.azure.com | master1-3          | 8443
Router load balancer   | <wildcardzone>.<region>.cloudapp.azure.com      | infra-nodes1-3     | 80 and 443

The External load balancer utilizes the OpenShift Container Platform master API port for communication internally and externally. The Router load balancer uses the public subnets and maps to infrastructure nodes. The infrastructure nodes run the router pod which then directs traffic directly from the outside world into pods when external routes are defined.

To avoid reconfiguring DNS every time a new route is created, an external wildcard DNS A record must be configured to point to the Router load balancer IP.

For example, create a wildcard DNS entry for cloudapps.example.com that has a low time-to-live value (TTL) and points to the public IP address of the Router load balancer:

*.cloudapps.example.com. 300 IN A 192.168.133.2

2.3. Software Version Details

The following tables provide the installed software versions for the different servers that make up the Red Hat OpenShift Container Platform highly available reference environment.

Table 2.4. RHEL OSEv3 Details

Software                                            | Version
Red Hat Enterprise Linux 7.3 x86_64                 | kernel-3.10.0-327
Atomic-OpenShift{master/clients/node/sdn-ovs/utils} | 3.5
Docker                                              | 1.12.x
Ansible                                             | 2.2.1

2.4. Required Channels

A subscription to the following channels is required in order to deploy this reference environment’s configuration.

Table 2.5. Required Channels - OSEv3 Master and Node Instances

Channel                                                  | Repository Name
Red Hat Enterprise Linux 7 Server (RPMs)                 | rhel-7-server-rpms
Red Hat OpenShift Enterprise 3.5 (RPMs)                  | rhel-7-server-ose-3.5-rpms
Red Hat Enterprise Linux 7 Server - Extras (RPMs)        | rhel-7-server-extras-rpms
Red Hat Enterprise Linux 7 Server - Fast Datapath (RPMs) | rhel-7-fast-datapath-rpms

The subscriptions are accessed via a pool ID, which is a required parameter of the ARM template that deploys the VMs in the Microsoft Azure environment; it is located in the reference-architecture/azure-ansible/azuredeploy.parameters.json file in the openshift-ansible-contrib repository

Note

The pool ID can be obtained in the Subscriptions section of the Red Hat Customer Portal by selecting the appropriate subscription, which opens a detailed view of the subscription, including the Pool ID
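After a subscription is attached using the pool ID, the four repositories from Table 2.5 must be enabled on each instance. The sketch below prints the subscription-manager commands rather than running them, since they require a registered RHEL host:

```shell
# Repository IDs from Table 2.5; print the enable command for each
REPOS="rhel-7-server-rpms rhel-7-server-ose-3.5-rpms rhel-7-server-extras-rpms rhel-7-fast-datapath-rpms"
for repo in ${REPOS}; do
  echo "subscription-manager repos --enable=${repo}"
done
```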

2.5. Prerequisites

This section describes the environment and setup needed to execute the ARM template and perform post installation tasks.

2.5.1. GitHub Repositories

The code in the openshift-ansible-contrib repository referenced below handles the installation of OpenShift Container Platform and the accompanying infrastructure. The openshift-ansible-contrib repository is not explicitly supported by Red Hat but the Reference Architecture team performs testing to ensure the code operates as defined and is secure.

https://github.com/openshift/openshift-ansible-contrib/tree/master/reference-architecture/azure-ansible

For this reference architecture, the scripts are accessed and used directly from GitHub. There is no requirement to download the code, as it’s done automatically once the script is started.

2.6. Microsoft Azure Subscription

In order to deploy the environment from the template, a Microsoft Azure subscription is required. A trial subscription is not recommended, as the reference architecture uses significant resources, and the typical trial subscription does not provide adequate resources.

The deployment of OpenShift Container Platform requires a user that has been granted the proper permissions by the Microsoft Azure administrator. The user must be able to create accounts, storage accounts, roles, policies, and load balancers, and to deploy virtual machine instances. It is helpful to have delete permissions in order to be able to redeploy the environment while testing.

2.7. Microsoft Azure Region Selection

An OpenShift Container Platform cluster is deployed within one Azure Region. In order to achieve the best possible availability in Microsoft Azure, availability sets are implemented.

In Microsoft Azure, virtual machines (VMs) can be placed into a logical grouping called an availability set. When creating VMs within an availability set, the Microsoft Azure platform distributes the placement of those VMs across the underlying infrastructure. Should there be a planned maintenance event on the Microsoft Azure platform or an underlying hardware/infrastructure fault, the use of availability sets ensures that at least one VM remains running. The Microsoft Azure SLA requires two or more VMs within an availability set to allow the distribution of VMs across the underlying infrastructure.

2.8. SSH Public and Private Key

SSH keys are used instead of passwords in the OpenShift Container Platform installation process. These keys are generated on the system that will be used to log in to and manage the systems. In addition, they are automatically distributed by the ARM template to all virtual machines that are created.

In order to use the template, SSH public and private keys are needed. To avoid being prompted for a passphrase, do not apply a passphrase to the key.

The public key is injected into the ~/.ssh/authorized_keys file on all the hosts, and the private key is copied to the ~/.ssh/id_rsa file on all the hosts to allow SSH communication within the environment (for example, from the bastion to master1 without passwords).

2.8.1. SSH Key Generation

If SSH keys do not already exist, they must be created. Generate an RSA key pair by typing the following at a shell prompt:

$ ssh-keygen -t rsa -N '' -f /home/USER/.ssh/id_rsa

A message similar to the following will be presented, indicating the key has been successfully created:

Your identification has been saved in /home/USER/.ssh/id_rsa.
Your public key has been saved in /home/USER/.ssh/id_rsa.pub.
The key fingerprint is:
e7:97:c7:e2:0e:f9:0e:fc:c4:d7:cb:e5:31:11:92:14 USER@sysdeseng.rdu.redhat.com
The key's randomart image is:
+--[ RSA 2048]----+
|             E.  |
|            . .  |
|             o . |
|              . .|
|        S .    . |
|         + o o ..|
|          * * +oo|
|           O +..=|
|           o*  o.|
+-----------------+

2.9. Resource Groups and Resource Group Name

In the Microsoft Azure environment, resources such as storage accounts, virtual networks, and virtual machines (VMs) are grouped together in resource groups as a single entity, and their names must be unique within a Microsoft Azure subscription. Multiple resource groups are supported in a region, and a resource group with the same name can exist in multiple regions, but a resource group may not span resources across multiple regions.

Note

For more information about Microsoft Azure Resource Groups, check the Azure Resource Manager overview documentation

2.10. Microsoft Azure Virtual Network (VNet)

An Azure VNet provides the ability to set up custom virtual networking which includes subnets, and IP address ranges. In this reference implementation guide, a dedicated VNet is created with all its accompanying services to provide a stable network for the OpenShift Container Platform deployment.

A VNet is created as a logical representation of a networking environment in the Microsoft Azure cloud. The subnets and CIDRs listed below are used.

Important

Substitute the values if needed to ensure no conflict with an existing CIDR or subnet in the environment. The values are defined in the template https://github.com/openshift/openshift-ansible-contrib/tree/master/reference-architecture/azure-ansible/azuredeploy.json

Table 2.6. VNet Networking

CIDR/Subnet   | Values
CIDR          | 10.0.0.0/16
Master Subnet | 10.0.0.0/24
Node Subnet   | 10.0.1.0/24
Infra Subnet  | 10.0.2.0/24

The VNet is created and a human readable tag is assigned. Three subnets are created in the VNet. The bastion instance is on the Master Subnet. The two load balancers allow access to the OpenShift Container Platform API and console and the routing of application traffic. All the VMs are able to communicate with the internet for packages, container images, and external git repositories.

Note

For more information see Azure Virtual Networks documentation

2.11. OpenShift SDN

OpenShift Container Platform uses a software-defined networking (SDN) approach to provide a unified cluster network that enables communication between pods across the OpenShift Container Platform cluster. This pod network is established and maintained by the OpenShift SDN, which configures an overlay network using Open vSwitch (OVS).

There are three different plug-ins available in OpenShift Container Platform 3.5 for configuring the pod network:

  • The redhat/ovs-subnet plug-in which provides a "flat" pod network where every pod can communicate with every other pod and service.
  • The redhat/ovs-multitenant plug-in which provides OpenShift Container Platform project level isolation for pods and services. Each project receives a unique Virtual Network ID (VNID) that identifies traffic from pods assigned to the project. Pods from different projects cannot send packets to or receive packets from pods and services of a different project.
  • The redhat/ovs-networkpolicy plug-in (currently in Tech Preview) allows project administrators to configure their own isolation policies using NetworkPolicy objects.

The plug-in used in this reference architecture can be selected from the supported ones at deployment time using the ARM template. The default value is redhat/ovs-multitenant, which provides multitenant isolation for pods per project.

For more information about OpenShift Container Platform networking, see OpenShift SDN documentation.

2.12. Microsoft Azure Network security groups

The purpose of Microsoft Azure Network security groups (NSG) is to restrict traffic from outside of the VNet to servers inside of the VNet. Network security groups are also used to restrict server to server communications inside the VNet. Network security groups provide an extra layer of security similar to a firewall: in the event a port is opened on an instance, the security group will not allow the communication to the port unless explicitly stated in a Network security group.

NSGs are grouped depending on the traffic flow (inbound or outbound), and every NSG contains rules, where every rule specifies:

  • priority
  • source
  • destination
  • service (network port and network protocol)
  • action on the traffic (allow or deny)

NSG rules are processed by priority, meaning the first rule matching the traffic is applied.

All the security groups contain default rules to block connectivity coming from outside the VNet. The default rules allow and disallow traffic as follows:

  • Virtual network: Traffic originating and ending in a virtual network is allowed both in inbound and outbound directions.
  • Internet: Outbound traffic is allowed, but inbound traffic is blocked.
  • Load balancer: Allow Microsoft Azure load balancer to probe the health of the VMs.

Once a Network security group is created, it should be associated with an infrastructure component; when using the resource manager mechanism to deploy infrastructure in Microsoft Azure, it can be associated with a NIC or a subnet.

Note

In this reference architecture, every VM is associated with a single NIC, and the Network security groups are associated with NICs, so there is a 1:1:1 relationship between VM, NIC, and Network security group. For more information about Microsoft Azure Network security groups, see Filter network traffic with Network security groups

The Network security groups are specified in each node type's json file, located in https://github.com/openshift/openshift-ansible-contrib/tree/master/reference-architecture/azure-ansible/ (such as master.json for master instances)

2.12.1. Bastion Security Group

The bastion Network security group allows SSH connectivity from the outside to the bastion host. Any connectivity via SSH to the master, application or infrastructure nodes must go through the bastion host.

The Network security group applied to the bastion host NIC is called bastionnsg and contains the following rules:

NSG rule name     | Type    | Source | Destination | Service      | Action
default-allow-ssh | Inbound | Any    | Any         | SSH (TCP/22) | Allow

2.12.2. Master Nodes Security Group

The master nodes Network security group allows inbound access on port 8443 from the internet to the virtual network. The traffic is then allowed to be forwarded to the master nodes.

The Network security groups applied to the master node instances' NICs are called master1nsg, master2nsg, and master3nsg and contain the following rules:

NSG rule name                  | Type    | Source | Destination | Service           | Action
default-allow-openshift-master | Inbound | Any    | Any         | Custom (TCP/8443) | Allow

2.12.3. Infrastructure nodes Security Group

The infrastructure nodes Network security group allows inbound access on ports 80 and 443. If the applications running on the OpenShift Container Platform cluster use different ports, this can be adjusted as needed.

The Network security groups applied to the infrastructure node instances' NICs are called infranode1nsg, infranode2nsg, and infranode3nsg and contain the following rules:

NSG rule name                        | Type    | Source | Destination | Service         | Action
default-allow-openshift-router-http  | Inbound | Any    | Any         | HTTP (TCP/80)   | Allow
default-allow-openshift-router-https | Inbound | Any    | Any         | HTTPS (TCP/443) | Allow

2.13. Microsoft Azure DNS

DNS is an integral part of a successful OpenShift Container Platform environment. Microsoft Azure provides a DNS-as-a-Service called Azure DNS. Per Microsoft: "The Microsoft global network of name servers has the scale and redundancy to ensure ultra-high availability for your domains. With Microsoft Azure DNS, you can be sure that your DNS will always be available."

Microsoft Azure provides the DNS for the public zone, as well as internal host resolution. These are configured automatically during the execution of the reference architecture scripts.

Note

For more information see Azure DNS documentation

2.13.1. Public Zone

When the reference architecture is deployed, Microsoft Azure supplied domains are used. The domains consist of: <hostname>.<region>.cloudapp.azure.com.

For each OpenShift Container Platform deployment on Microsoft Azure, three domain names are created:

  • <resourcegroup>.<region>.cloudapp.azure.com - The API and OpenShift Container Platform web console load balancer
  • <resourcegroup>b.<region>.cloudapp.azure.com - The bastion host for ssh access
  • <wildcardzone>.<region>.cloudapp.azure.com - The DNS of the applications load balancer
Important

Due to current Microsoft Azure limitations on creating subdomains and wildcards, the nip.io service is used for the application load balancer. For more information about current Microsoft Azure limitations with subdomains, see the discussion subdomain cloudapp.net rather than having a global namespace

Note

In order to have a proper wildcard DNS entry with a proper subdomain like *.apps.mycompany.com, it is recommended to create a wildcard A record externally with your DNS domain provider and point it to the applications load balancer IP, for example: *.apps.mycompany.com. 300 IN A 192.168.133.2. To reflect those modifications in OpenShift Container Platform, modify the routingConfig.subdomain parameter in the /etc/origin/master/master-config.yaml file on all the masters and restart the atomic-openshift-master service, or modify the Ansible hosts file and rerun the installation.
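For illustration, the relevant excerpt of /etc/origin/master/master-config.yaml after such a change would look like the following (surrounding keys omitted; apps.mycompany.com is the example domain from the note above):

```yaml
# Excerpt only: set the wildcard subdomain used for application routes
routingConfig:
  subdomain: apps.mycompany.com
```

After editing, restart the atomic-openshift-master service on each master for the change to take effect.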

2.13.2. Internal resolution

This reference architecture uses the Azure-provided name resolution mechanism, which:

  • creates an internal subdomain per resource group like fesj5eh111uernc5jfpnxi33kh.dx.internal.cloudapp.net
  • creates an A DNS record on that internal subdomain of every instance deployed in that resource group
  • configures the proper resolution in the /etc/resolv.conf file on every VM

Using this, instances can be reached using just the VM shortname:

$ cat /etc/resolv.conf
# Generated by NetworkManager
search fesj5eh114uebnc5jfpnxi33kh.dx.internal.cloudapp.net
nameserver 168.63.129.16

$ nslookup master1
Server:		168.63.129.16
Address:	168.63.129.16#53

Name:	master1.fesj5eh114uebnc5jfpnxi33kh.dx.internal.cloudapp.net
Address: 10.0.0.5

2.13.3. Microsoft Azure VM Images

Azure Virtual Machine Images provide different virtual machine images from which to launch instances. In this reference architecture, the On-Demand Red Hat Enterprise Linux (RHEL) image in the Azure Marketplace is used.

Important

The Red Hat Enterprise Linux image carries an additional charge on top of the base Linux VM price. For more information on Microsoft Azure pricing for Red Hat images, see the Azure Documentation.

2.13.4. Microsoft Azure VM Sizes

Microsoft Azure offers different VM sizes that can be used to deploy the OpenShift Container Platform environment. Furthermore, all the node sizes have been selected with premium storage to allow the best performance.

Note

The sizes provided in this reference architecture are a guide; it is advised to review the OpenShift Container Platform 3 Sizing Considerations for more information

The VM sizes are specified as parameters in the template file reference-architecture/azure-ansible/azuredeploy.json and the following table shows the specific parameter of each VM type and its default value:

Table 2.7. Default VM sizes

Type                 | Parameter       | Default size
Bastion              | bastionVMSize   | Standard_DS1_v2
Masters              | masterVMSize    | Standard_DS4_v2
Infrastructure nodes | infranodeVMSize | Standard_DS4_v2
Application nodes    | nodeVMSize      | Standard_DS4_v2

The application node VM size is an important parameter for determining how many containers can run and how large they can be. The current default value for the application node size allocates 8 CPU cores and 28 gigabytes of memory for each VM. If the containers are memory intensive, it is advised to either increase the node count or increase the node memory size. For such applications, choosing the Standard_D14_v2 size gives 112 gigabytes of memory, or another VM size with more memory can be selected if needed.

2.13.5. Identity and Access Management

For this reference architecture, a Microsoft Azure account is required. Ideally this is either a pay-as-you-go account or a Microsoft Enterprise Agreement.

You must have enough resources to deploy the reference architecture, otherwise the installation will fail.

During the installation of OpenShift Container Platform using the reference architecture scripts and playbooks, six storage accounts are created automatically per cluster. The following table shows the name of every storage account and its purpose:

Table 2.8. Storage accounts

Name                  | Purpose
samas<resourcegroup>  | Masters storage
sanod<resourcegroup>  | Application nodes storage
sainf<resourcegroup>  | Infrastructure nodes storage
sareg<resourcegroup>  | Registry persistent volume storage
sapv<resourcegroup>   | The generic storage class
sapvlm<resourcegroup> | Storage account where the metrics and logging volumes will be stored
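The naming scheme in Table 2.8 can be sketched as follows; the resource group name is hypothetical:

```shell
# Hypothetical resource group name; prefixes are those listed in Table 2.8
RESOURCEGROUP=ocpazure
for prefix in samas sanod sainf sareg sapv sapvlm; do
  echo "${prefix}${RESOURCEGROUP}"
done
```

Note that Azure storage account names must be globally unique and consist of 3 to 24 lowercase letters and numbers, which constrains the resource group names that can be used with this scheme.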

Note

For more information about the Microsoft Azure identity management and storage accounts, see The fundamentals of Azure identity management and About Azure storage accounts

2.14. Bastion

The bastion server fulfills two main functions. One is that of a secure way to connect to all the nodes, and the second is that of the "installer" of the system. The information provided to the ARM template is passed to the bastion host, and from there playbooks and scripts are automatically generated and executed, resulting in OpenShift Container Platform being installed.

As shown in the Figure 2.1, “bastion diagram” the bastion server in this reference architecture provides a secure way to limit SSH access to the Microsoft Azure environment. The master and node security groups only allow for SSH connectivity between nodes inside of the Security Group while the bastion allows SSH access from everywhere. The bastion host is the only ingress point for SSH in the cluster from external entities. When connecting to the OpenShift Container Platform infrastructure, the bastion forwards the request to the appropriate server.

Note

Connecting to other VMs through the bastion server requires specific SSH configuration which is outlined in the deployment section of the reference architecture guide.
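A minimal ~/.ssh/config sketch for reaching the internal VMs through the bastion might look like the following; the host patterns match the VM short names used in this reference architecture, while the placeholders in angle brackets must be replaced with the actual values:

```
Host bastion
    HostName <resourcegroup>b.<region>.cloudapp.azure.com
    User <adminuser>
    IdentityFile ~/.ssh/id_rsa

# Internal hosts are reached by tunnelling SSH through the bastion
Host master* infranode* node*
    User <adminuser>
    ProxyCommand ssh -W %h:%p bastion
```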

Figure 2.1. bastion diagram


2.15. Generated Inventory

Ansible relies on inventory files and variables to perform playbook runs. As part of the Ansible playbooks provided with this reference architecture, the inventory is created during the boot of the bastion host. The Azure Resource Manager (ARM) templates pass parameters via a script extension to RHEL on the bastion. On the bastion host, a bastion.sh script generates the inventory file in /etc/ansible/hosts.

Dynamic Inventory Script within bastion.sh

[OSEv3:children]
masters
nodes
etcd
new_nodes
new_masters

[OSEv3:vars]
osm_controller_args={'cloud-provider': ['azure'], 'cloud-config': ['/etc/azure/azure.conf']}
osm_api_server_args={'cloud-provider': ['azure'], 'cloud-config': ['/etc/azure/azure.conf']}
openshift_node_kubelet_args={'cloud-provider': ['azure'], 'cloud-config': ['/etc/azure/azure.conf'], 'enable-controller-attach-detach': ['true']}
debug_level=2
console_port=8443
docker_udev_workaround=True
openshift_node_debug_level="{{ node_debug_level | default(debug_level, true) }}"
openshift_master_debug_level="{{ master_debug_level | default(debug_level, true) }}"
openshift_master_access_token_max_seconds=2419200
openshift_hosted_router_replicas=3
openshift_hosted_registry_replicas=3
openshift_master_api_port="{{ console_port }}"
openshift_master_console_port="{{ console_port }}"
openshift_override_hostname_check=true
osm_use_cockpit=false
openshift_release=v3.5
openshift_cloudprovider_kind=azure
openshift_node_local_quota_per_fsgroup=512Mi
azure_resource_group=${RESOURCEGROUP}
rhn_pool_id=${RHNPOOLID}
openshift_install_examples=true
deployment_type=openshift-enterprise
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
openshift_master_manage_htpasswd=false

os_sdn_network_plugin_name=${OPENSHIFTSDN}

# default selectors for router and registry services
openshift_router_selector='role=infra'
openshift_registry_selector='role=infra'

# Select default nodes for projects
osm_default_node_selector="role=app"
ansible_become=yes
ansible_ssh_user=${AUSERNAME}
remote_user=${AUSERNAME}

openshift_master_default_subdomain=${WILDCARDNIP}
osm_default_subdomain=${WILDCARDNIP}
openshift_use_dnsmasq=true
openshift_public_hostname=${RESOURCEGROUP}.${FULLDOMAIN}

openshift_master_cluster_method=native
openshift_master_cluster_hostname=${RESOURCEGROUP}.${FULLDOMAIN}
openshift_master_cluster_public_hostname=${RESOURCEGROUP}.${FULLDOMAIN}

openshift_metrics_install_metrics=false
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_cassandra_pvc_size="${METRICS_CASSANDRASIZE}G"
openshift_metrics_cassandra_replicas="${METRICS_INSTANCES}"
openshift_metrics_hawkular_nodeselector={"role":"infra"}
openshift_metrics_cassandra_nodeselector={"role":"infra"}
openshift_metrics_heapster_nodeselector={"role":"infra"}

openshift_logging_install_logging=false
openshift_logging_es_pv_selector={"usage":"elasticsearch"}
openshift_logging_es_pvc_dynamic="false"
openshift_logging_es_pvc_size="${LOGGING_ES_SIZE}G"
openshift_logging_es_cluster_size=${LOGGING_ES_INSTANCES}
openshift_logging_fluentd_nodeselector={"logging":"true"}
openshift_logging_es_nodeselector={"role":"infra"}
openshift_logging_kibana_nodeselector={"role":"infra"}
openshift_logging_curator_nodeselector={"role":"infra"}

openshift_logging_use_ops=false
openshift_logging_es_ops_pv_selector={"usage":"opselasticsearch"}
openshift_logging_es_ops_pvc_dynamic="false"
openshift_logging_es_ops_pvc_size="${OPSLOGGING_ES_SIZE}G"
openshift_logging_es_ops_cluster_size=${OPSLOGGING_ES_INSTANCES}
openshift_logging_es_ops_nodeselector={"role":"infra"}
openshift_logging_kibana_ops_nodeselector={"role":"infra"}
openshift_logging_curator_ops_nodeselector={"role":"infra"}

[masters]
master1 openshift_hostname=master1 openshift_node_labels="{'role': 'master'}"
master2 openshift_hostname=master2 openshift_node_labels="{'role': 'master'}"
master3 openshift_hostname=master3 openshift_node_labels="{'role': 'master'}"

[etcd]
master1
master2
master3

[new_nodes]
[new_masters]

[nodes]
master1 openshift_hostname=master1 openshift_node_labels="{'role':'master','zone':'default','logging':'true'}" openshift_schedulable=false
master2 openshift_hostname=master2 openshift_node_labels="{'role':'master','zone':'default','logging':'true'}" openshift_schedulable=false
master3 openshift_hostname=master3 openshift_node_labels="{'role':'master','zone':'default','logging':'true'}" openshift_schedulable=false
infranode1 openshift_hostname=infranode1 openshift_node_labels="{'role': 'infra', 'zone': 'default','logging':'true'}"
infranode2 openshift_hostname=infranode2 openshift_node_labels="{'role': 'infra', 'zone': 'default','logging':'true'}"
infranode3 openshift_hostname=infranode3 openshift_node_labels="{'role': 'infra', 'zone': 'default','logging':'true'}"
node[01:${NODECOUNT}] openshift_hostname=node[01:${NODECOUNT}] openshift_node_labels="{'role':'app','zone':'default','logging':'true'}"

Note

Those are the values chosen for the OpenShift Container Platform installation in this reference architecture. For more information about those parameters and their values, see the OpenShift documentation

For the OpenShift Container Platform installation, the ARM template collects the needed parameters, creates the virtual machines, and passes the parameters to them, where a node-type-specific bash script takes the parameters and generates the needed playbooks and automation. During this process each VM is assigned an Ansible tag that allows the playbooks to address the different node types.

Note

For more information about the automation procedures on Microsoft Azure, see the Azure Linux Automation blog post, and for more information about the Ansible inventory, see Ansible Host Inventory

2.16. Nodes

Nodes are Microsoft Azure instances that serve a specific purpose for OpenShift Container Platform. OpenShift Container Platform masters are also configured as nodes as they are part of the SDN. Nodes deployed on Microsoft Azure can be vertically scaled before or after the OpenShift Container Platform installation using the Microsoft Azure dashboard. There are three types of nodes as described below.

2.16.1. Master nodes

The master nodes contain the master components, including the API server, web console, controller manager server, and etcd. The master maintains the cluster configuration within etcd, manages nodes in its OpenShift Container Platform cluster, assigns pods to nodes, and synchronizes pod information with service configuration. The master is used to define routes, services, and volume claims for pods deployed within the OpenShift Container Platform environment. Users interact with the OpenShift Container Platform environment via the masters, using the API, the web console, or the oc command-line interface.

Note

Although master nodes are able to run pods, they are configured as unschedulable to ensure that the masters are not burdened with running pods
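The schedulable flag is set during installation by the openshift_schedulable=false entries in the inventory above; as a sketch, the same flag can be toggled after installation with the oadm client:

```shell
# Mark a master unschedulable after installation (the inventory's
# openshift_schedulable=false entries already do this during install):
oadm manage-node master1 --schedulable=false

# Unschedulable nodes are reported with the SchedulingDisabled status:
oc get nodes --selector role=master
```

These commands require a configured cluster and cluster-admin credentials.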

2.16.2. Application nodes

The application nodes are the instances where containers that are not part of the OpenShift Container Platform infrastructure run. Depending on the application, Microsoft Azure specific storage can be applied, such as an Azure VHD, which can be assigned using a Persistent Volume Claim for application data that needs to persist between container restarts. A configuration parameter is set on the masters which ensures that OpenShift Container Platform user containers are placed on the application nodes by default.
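For example, application data that must survive container restarts can be requested through a persistent volume claim like the following sketch (the claim name and size are illustrative):

```shell
# Illustrative persistent volume claim; the cloud provider plugin
# satisfies it with a dynamically provisioned Azure VHD:
oc create -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF
```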

2.16.3. Infrastructure nodes

The infrastructure nodes are regular nodes with different labels, so they are used only to host the optional infrastructure components for OpenShift Container Platform, such as the routers, registries, metrics, or the aggregated logging, isolating those components from the regular applications. The storage for the registry deployed on the infrastructure nodes uses Azure Blob Storage, which allows multiple pods to use the same storage at the same time (ReadWriteMany or RWX). Since the registry performs only the metadata lookup and then hands the download off to Azure, this scales better than other options.

Note

This reference architecture is emerging and components like the aggregated logging or metrics will be described in future revisions.

2.16.4. Node labels

All OpenShift Container Platform nodes are assigned a role label. This allows the scheduler to place certain pods on specific nodes.

Label        Nodes                   Pods

role=master  Master nodes            None

role=app     Application nodes       Application pods

role=infra   Infrastructure nodes    Infrastructure pods
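The labels are assigned at install time through the openshift_node_labels inventory entries shown earlier; as a sketch, they can also be inspected or changed afterwards:

```shell
# Re-label a node after installation (node01 is one of the application
# nodes from the inventory):
oc label node node01 role=app --overwrite

# Show all nodes with their labels:
oc get nodes --show-labels
```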

Note

The configuration parameter defaultNodeSelector: "role=app" in the /etc/origin/master/master-config.yaml file ensures all projects are automatically deployed on application nodes.
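The relevant fragment of /etc/origin/master/master-config.yaml looks like this (excerpt only; the rest of the file is omitted):

```yaml
# Excerpt from /etc/origin/master/master-config.yaml
projectConfig:
  defaultNodeSelector: "role=app"
```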

2.17. OpenShift Pods

OpenShift Container Platform leverages the Kubernetes concept of a pod, which is one or more containers deployed together on one host, and the smallest compute unit that can be defined, deployed, and managed.

A pod could be just a single container that runs a php application connecting to a database outside of the OpenShift Container Platform environment, or a pod could be two containers, one of them runs a php application and the other one runs an ephemeral database. Pods have the ability to be scaled at runtime or at the time of launch using the OpenShift Container Platform web console, the OpenShift Container Platform API or the oc CLI tool.
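As a brief sketch, scaling a pod's replica count at runtime is a single command ("frontend" is a hypothetical deployment configuration name):

```shell
# Scale the hypothetical "frontend" deploymentconfig to three pods:
oc scale dc/frontend --replicas=3

# Review the result:
oc get pods -l deploymentconfig=frontend
```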

Note

The OpenShift Container Platform infrastructure components like the router and registry are deployed as pods in the OpenShift Container Platform environment, either during the installation procedure or after it. Even though they are not required for the OpenShift Container Platform environment to run, they provide very useful features, so this reference architecture assumes they are deployed.

Pods

For more information about the OpenShift Container Platform architecture and components, see the OpenShift Architecture

2.18. OpenShift Router

Pods inside of an OpenShift Container Platform cluster are only reachable via their IP addresses on the cluster network. To be able to access pods from outside the OpenShift Container Platform cluster, OpenShift Container Platform provides a few options:

  • Router
  • Load Balancer Service
  • Service ExternalIP
  • Service NodePort
  • Virtual IPs
  • Non-Cloud Edge Router Load Balancer
  • Edge Load Balancer

OpenShift Container Platform routers provide external hostname mapping and load balancing to services exposed by users over protocols that pass distinguishing information directly to the router; the hostname must be present in the protocol in order for the router to determine where to send it. Routers currently support the following protocols:

  • HTTP
  • HTTPS (with SNI)
  • WebSockets
  • TLS with SNI

There are two different router plug-ins that can be deployed in OpenShift Container Platform:

  • HAProxy template router
  • F5 router

Note

This reference architecture uses HAProxy template routers as the main mechanism to access the pods from outside the OpenShift Container Platform cluster. For more information on the different options, see Getting Traffic into the Cluster documentation and Router Overview

The HAProxy template router enables routes created by developers to be used by external clients. To avoid reconfiguring the DNS servers every time a route is created, the suggested method is to define a wildcard DNS entry that redirects every hostname to the router.
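In practice, exposing a service creates a route whose hostname falls under the wildcard subdomain, so no DNS change is needed (the service name "frontend" is hypothetical):

```shell
# Create a route for the hypothetical "frontend" service; the generated
# hostname lands under the wildcard subdomain handled by the router:
oc expose service frontend
oc get route frontend
```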

Note

For high availability purposes, this reference architecture deploys two router pods and creates an Azure Load Balancer which performs a health check and forwards traffic to router pods on port 80 and 443.

Important

Due to the current Microsoft Azure limitations on subdomains, the default wildcard entry uses the nip.io service. This can be modified after the installation, as explained in the Microsoft Azure DNS section

2.19. Registry

OpenShift Container Platform can build container images from source code, deploy them, and manage their lifecycle. To enable this, it is required to deploy the Integrated OpenShift Container Platform Registry.

The registry stores container images and metadata. For production environments, persistent storage is recommended for the registry; otherwise any images built or pushed into the registry would disappear if the pod were to restart.

The registry is scaled to 3 pods/instances to allow for HA and load balancing. The default load balancing settings use the source IP address to enforce session stickiness. Failure of a pod may cause a pull or push operation to fail, but the operation can be restarted, and the failed registry pod is restarted automatically.
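The replica count comes from the openshift_hosted_registry_replicas=3 inventory entry shown above; the equivalent manual operation would be:

```shell
# Scale the integrated registry to three replicas in the default project:
oc scale dc/docker-registry --replicas=3 -n default

# Verify that three registry pods are running:
oc get pods -n default -l deploymentconfig=docker-registry
```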

Using the installation methods described in this document, the registry is deployed using Azure Blob Storage, a Microsoft Azure service that provides object storage. In order to use Azure Blob Storage the registry configuration has been extended. The procedure, detailed in the OpenShift and Docker documentation, is to modify the registry deploymentconfig to add the Azure Blob Storage service details as:

$ oc env dc docker-registry \
    -e REGISTRY_STORAGE=azure \
    -e REGISTRY_STORAGE_AZURE_ACCOUNTNAME=<azure_storage_account_name> \
    -e REGISTRY_STORAGE_AZURE_ACCOUNTKEY=<azure_storage_account_key> \
    -e REGISTRY_STORAGE_AZURE_CONTAINER=registry

This will be done automatically as part of the installation by the scripts provided in the git repository.

Note

For more information about the Microsoft Azure Blob Storage, see: https://azure.microsoft.com/en-us/services/storage/blobs/

2.20. Authentication

There are several options when it comes to authentication of users in OpenShift Container Platform. OpenShift Container Platform can leverage an existing identity provider within an organization, such as LDAP, or can use external identity providers like GitHub, Google, and GitLab. The configuration of identity providers occurs on the OpenShift Container Platform master instances, and multiple identity providers can be specified. This reference architecture uses htpasswd as the authentication provider, but any of the other mechanisms would be an acceptable choice. Roles can be customized and added to user accounts or groups to allow for extra privileges, such as the ability to list nodes or assign persistent storage volumes to a project.
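Since openshift_master_manage_htpasswd=false is set in the inventory, user entries are managed manually; a sketch of adding a user and granting an extra role (the user name and role are chosen for illustration):

```shell
# On each master, add the user "developer" to the htpasswd file
# referenced by the identity provider configuration:
htpasswd -b /etc/origin/master/htpasswd developer <password>

# Grant an extra privilege, e.g. permission to list nodes cluster-wide:
oadm policy add-cluster-role-to-user cluster-reader developer
```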

Note

For more information on htpasswd and other authentication methods see the Configuring authentication documentation.

Note

For best practices on authentication, consult the Red Hat Single Sign-On (SSO) documentation. Red Hat Single Sign-On (SSO) provides a fully federated central authentication service that can be used by both developers and end-users across multiple identity providers, using a simple user interface.

2.21. Microsoft Azure Storage

For the use cases considered in this reference architecture, including OpenShift Container Platform applications that connect to containerized databases or need some basic persistent storage, we need to consider multiple storage solutions in the cloud and different architectural approaches. Microsoft Azure offers a number of storage choices that offer high durability with three simultaneous replicas, including Standard and Premium storage.

Furthermore, a use case requirement is to implement "shared storage", where the volume should allow simultaneous read and write operations. Upon reviewing multiple options supported by OpenShift Container Platform and the underlying Red Hat Enterprise Linux infrastructure, a choice was made to use Azure VHD based storage to give the Microsoft Azure OpenShift Container Platform nodes the best match of performance and flexibility, and to use Azure Blob Storage for the registry storage.

Note

The reference architecture will be updated with further storage choices (such as Managed Disks) as they are evaluated.

2.21.1. Microsoft Azure VHD

The Microsoft Azure cloud provider plugin for OpenShift Container Platform dynamically allocates storage in the pre-created storage accounts based on requests for persistent volumes. In order for this to work, the OpenShift Container Platform environment must be configured properly: creating an /etc/azure/azure.conf file with the Microsoft Azure account data, modifying the /etc/origin/master/master-config.yaml and /etc/origin/node/node-config.yaml files, and restarting the atomic-openshift-master and atomic-openshift-node services. The scripts provided with this reference architecture perform this procedure automatically.
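For orientation, an /etc/azure/azure.conf file follows this general shape; the values are placeholders and the exact set of keys should be taken from the Configuring for Azure documentation:

```yaml
# Illustrative /etc/azure/azure.conf with placeholder values:
tenantId: <azure_ad_tenant_id>
subscriptionId: <azure_subscription_id>
aadClientId: <service_principal_client_id>
aadClientSecret: <service_principal_secret>
resourceGroup: <resource_group_name>
```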

Note

For more information about the Microsoft Azure configuration for OpenShift Container Platform storage, see Configuring for Azure documentation

This reference architecture creates and installs a default storage class, so any persistent volume claim results in a new VHD being created in the selected storage account. The VHD is created, mounted, formatted, and allocated to the container making the request.
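A sketch of such a storage class definition, assuming the kubernetes.io/azure-disk provisioner and a pre-created storage account (the class name and account name are illustrative; the installation scripts create the actual default class automatically):

```shell
# Illustrative storage class backed by Azure VHDs:
oc create -f - <<EOF
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: generic
provisioner: kubernetes.io/azure-disk
parameters:
  storageAccount: <storage_account_name>
EOF
```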

Note

For more information about Azure VHD see About disks and VHDs for Azure Linux VMs documentation and see Dynamic Provisioning and Creating Storage Classes for more information about storage classes

2.22. Red Hat OpenShift Container Platform Metrics

Red Hat OpenShift Container Platform environments can be enriched by deploying an optional component named Red Hat OpenShift Container Platform metrics, which collects the metrics exposed by the kubelet from pods running in the environment and provides the ability to view CPU, memory, and network-based metrics in the user interface.

Note

Red Hat OpenShift Container Platform metrics is a required component for the horizontal pod autoscaling feature, which allows the user to configure automatic pod scaling based on capacity thresholds. For more information about pod autoscaling, see Pod Autoscaling.

Red Hat OpenShift Container Platform metrics is composed of a few pods running on the Red Hat OpenShift Container Platform environment:

  • Heapster: Heapster scrapes the metrics for CPU, memory and network usage on every pod, then exports them into Hawkular Metrics.
  • Hawkular Metrics: A metrics engine which stores the data persistently in a Cassandra database.
  • Cassandra: Database where the metrics data is stored.

It is important to understand capacity planning when deploying metrics into an OpenShift environment: one set of metrics pods (Cassandra/Hawkular/Heapster) is able to monitor at least 25,000 pods.

Red Hat OpenShift Container Platform metrics components can be customized for longer data persistence, pods limits, replicas of individual components, custom certificates, etc.

Note

For more information about different customization parameters, see Enabling Cluster Metrics documentation.

Within this reference environment, metrics are deployed optionally on the infrastructure nodes depending on the "metrics" parameter of the ARM template. When "true" is selected, it deploys one set of metric pods (Cassandra/Hawkular/Heapster) on the infrastructure nodes (to avoid using resources on the application nodes) and uses persistent storage to allow for metrics data to be preserved for 7 days.

2.22.1. Horizontal pod Autoscaler

If Red Hat OpenShift Container Platform metrics has been deployed, the horizontal pod autoscaler feature can be used. A horizontal pod autoscaler, defined by a HorizontalPodAutoscaler object, specifies how the system should automatically increase or decrease the scale of a replication controller or deployment configuration, based on metrics collected from the pods that belong to that replication controller or deployment configuration.
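As a sketch, a horizontal pod autoscaler can be created directly from the CLI ("frontend" is a hypothetical deployment configuration; the thresholds are illustrative):

```shell
# Autoscale the hypothetical "frontend" deploymentconfig between 1 and
# 10 replicas, targeting 80% of the requested CPU:
oc autoscale dc/frontend --min 1 --max 10 --cpu-percent=80
```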

Note

For more information about the pod autoscaling feature, see the official documentation

2.23. Red Hat OpenShift Container Platform Aggregated Logging

One of the optional Red Hat OpenShift Container Platform components, Red Hat OpenShift Container Platform aggregated logging, collects and aggregates logs for a range of Red Hat OpenShift Container Platform services, enabling Red Hat OpenShift Container Platform users to view the logs of projects to which they have view access using a web interface.

Red Hat OpenShift Container Platform aggregated logging is a modified version of the ELK stack, composed of a few pods running on the Red Hat OpenShift Container Platform environment:

  • Elasticsearch: An object store where all logs are stored.
  • Fluentd: Gathers logs from nodes and feeds them to Elasticsearch.
  • Kibana: A web UI for Elasticsearch.
  • Curator: Elasticsearch maintenance operations performed automatically on a per-project basis.

Once deployed in a cluster, the stack aggregates logs from all nodes and projects into Elasticsearch, and provides a Kibana UI to view any logs. Cluster administrators can view all logs, but application developers can only view logs for projects they have permission to view. To prevent users from seeing logs from pods in other projects, the Search Guard plugin for Elasticsearch is used.

A separate Elasticsearch cluster, a separate Kibana, and a separate Curator can be deployed to form the OPS cluster, where logs for the default, openshift, and openshift-infra projects, as well as /var/log/messages on the nodes, are automatically aggregated and grouped into the .operations item in the Kibana interface.

Red Hat OpenShift Container Platform aggregated logging components can be customized for longer data persistence, pods limits, replicas of individual components, custom certificates, etc.

Note

For more information about different customization parameters, see Aggregating Container Logs documentation.

Within this reference environment, aggregated logging components are deployed optionally on the infrastructure nodes depending on the "logging" parameter of the ARM template. When "true" is selected, it deploys on the infrastructure nodes (to avoid using resources on the application nodes) the following elements:

  • 3 Elasticsearch replicas for HA, each using a dedicated persistent volume
  • Fluentd as a daemonset on all the nodes that match the "logging=true" node selector (all nodes and masters by default)
  • Kibana
  • Curator

Also, there is an "opslogging" parameter that can optionally deploy the same architecture but for operational logs:

  • 3 Elasticsearch replicas for HA, each using a dedicated persistent volume
  • Kibana
  • Curator

Note

Fluentd pods are configured automatically to split the logs for the two Elasticsearch clusters in case the ops cluster is deployed.

Table 2.9. Red Hat OpenShift Container Platform aggregated logging components

Parameter    Deploy by default    Fluentd                                Elasticsearch    Kibana    Curator

logging      true                 Daemonset ("logging=true" selector)    3 replicas       1         1

opslogging   false                Shared                                 3 replicas       1         1