Chapter 1. Components and Considerations
1.1. Google Cloud Platform components
The successful installation of a Red Hat OpenShift Container Platform environment requires the following Google Cloud Platform components or services to create a highly available and full-featured environment.
1.1.1. Project
A Google Cloud Platform project is the base-level organizing entity. A project is required to use Google Cloud Platform, and forms the basis for creating, enabling, and using all Google Cloud Platform services, managing APIs, enabling billing, adding and removing collaborators, and managing permissions.
Using the installation methods described in this document, the Red Hat OpenShift Container Platform environment is deployed in a single Google Cloud Platform project.
For more information, see Google Cloud Platform documentation on Project https://cloud.google.com/resource-manager/docs/cloud-platform-resource-hierarchy#projects
1.1.2. Cloud Identity and Access Management
Access control to the different infrastructure and service resources, as well as fine-grained roles, is available in Google Cloud Platform using the IAM service.
Service accounts with specific permissions can be created and used to deploy infrastructure components instead of regular users, and roles can be created to limit access for different users or service accounts.
Google Cloud Platform instances use service accounts to allow applications running on top of them to call Google Cloud Platform APIs. For instance, Red Hat OpenShift Container Platform nodes can call the Google Cloud Platform disk API to provide a persistent volume to an application.
The deployment of Red Hat OpenShift Container Platform requires a user with proper permissions. The user must be able to create service accounts, cloud storage, instances, images, templates, Cloud DNS entries, and deploy load balancers and health checks. It is helpful to have delete permissions in order to be able to redeploy the environment while testing.
Using the installation methods described in this document, the Red Hat OpenShift Container Platform environment is deployed using a Google Cloud Platform service account.
For more information, see Google Cloud Platform documentation on IAM https://cloud.google.com/compute/docs/access/
1.1.3. Google Cloud Platform Regions and Zones
Google Cloud Platform has a global infrastructure that covers regions and availability zones. Per Google: "Certain Compute Engine resources live in regions or zones. A region is a specific geographical location where you can run your resources. Each region has one or more zones. For example, the us-central1 region denotes a region in the Central United States that has zones us-central1-a, us-central1-b, us-central1-c and us-central1-f."
Deploying Red Hat OpenShift Container Platform in Google Cloud Platform across different zones can be beneficial to avoid single points of failure, but there are some caveats regarding storage. Google Cloud Platform disks are created within a zone; therefore, if a Red Hat OpenShift Container Platform node goes down in zone "A" and its pods should be moved to zone "B", the persistent storage cannot be attached to those pods because the disks are in a different zone. See the multiple zone limitations in the Kubernetes documentation for more information.
This reference architecture can be deployed in a single region using a single zone or multiple zones, depending on the use case. Details are provided in the implementation section of this document.
For more information, see Google Cloud Platform documentation on regions and zones https://cloud.google.com/compute/docs/zones
1.1.4. Google Cloud Platform Networking
Google Cloud Platform Networking encapsulates custom virtual networking components including subnets, IP addresses and firewalls.
1.1.4.1. VPC Networks
A VPC network connects instances and other resources in a project. Projects can have multiple VPC networks, and resources within a VPC network can communicate with other resources in the same VPC network using private IPv4 addresses (subject to firewall rules).
Using the installation methods described in this document, the Red Hat OpenShift Container Platform environment is deployed in a single VPC network.
For more information, see Google Cloud Platform documentation on VPC network https://cloud.google.com/vpc/docs/vpc
1.1.4.1.1. Subnets
Google Cloud Platform subnets are partitions of a VPC network. VPC networks are global objects, while subnets are regional.
Using the installation methods described in this document, the Red Hat OpenShift Container Platform environment is deployed in a custom subnet.
For more information, see Google Cloud Platform documentation on subnets https://cloud.google.com/vpc/docs/vpc#vpc_networks_and_subnets
1.1.4.1.2. Internal DNS
Google Cloud Platform VPC networks have an internal DNS service that automatically resolves internal hostnames.
The internal fully qualified domain name (FQDN) for an instance follows the [HOST_NAME].c.[PROJECT_ID].internal format.
For more information, see Google Cloud Platform documentation on Internal DNS https://cloud.google.com/compute/docs/internal-dns
1.1.4.1.3. Firewall Rules
Google Cloud Platform firewall rules allow or deny ingress and egress traffic and are applied at the virtual network level. This means they are not attached to individual instances; instead, filtering is based on network tags or service accounts.
Using the installation methods described in this document, different firewall rules are created to ensure only required ports are exposed to required hosts.
For more information, see Google Cloud Platform documentation on firewall rules https://cloud.google.com/vpc/docs/firewalls
1.1.4.1.4. External IP address
For Google Cloud Platform instances to communicate with the Internet, an external IP address must be attached to the instance (an alternative is to use an outbound proxy). Likewise, to communicate from outside the VPC network with instances deployed in Google Cloud Platform, an external IP address is required.
Requiring an external IP address for Internet access is a limitation of the provider. This reference architecture configures firewall rules to block incoming external traffic to instances where it is not needed.
For more information, see Google Cloud Platform documentation on external IP address https://cloud.google.com/compute/docs/ip-addresses/
1.1.4.2. Cloud DNS
Google Cloud Platform Cloud DNS is a DNS service used to publish domain names to the global DNS using Google Cloud Platform DNS servers. The public Cloud DNS zone requires a domain name, either purchased through Google’s "Domains" service or through one of many third-party providers. Once the zone is created in Cloud DNS, the name servers provided by Google need to be added to the registrar.
Following this implementation, a Cloud DNS subdomain is used to configure three public DNS names:
- Bastion
- Red Hat OpenShift Container Platform API
- Applications wildcard
For more information, see Google Cloud Platform documentation on Cloud DNS https://cloud.google.com/dns/overview
1.1.4.3. Load balancing
Google Cloud Platform Load Balancing service enables distribution of traffic across multiple instances in the Google Cloud Platform cloud.
There are five kinds of load balancing, each with pros and cons:
- Internal - The forwarding rules implementation for the internal load balancer does not fit the way the Red Hat OpenShift Container Platform master service works. The master service queries the master service IP (that is, the load balancer IP), and if the master service is down on that node, the internal load balancer does not forward the request to any other master, only back to the same instance. It is a "forwarding rule" from an external point of view but a "forward to yourself" from the backend instance’s point of view.
- Network load balancing - Relies on legacy HTTP Health checks for determining instance health.
- HTTP(S) load balancing - Requires using a custom certificate in the load balancer.
- SSL Proxy load balancing - Not recommended for HTTPS traffic. Limited port support.
- TCP Proxy load balancing - Requires 443/tcp to be used (doesn’t work with 8443)
To be able to use HTTPS health checks for the master nodes (requesting /healthz status), HTTP(S) and TCP proxy load balancing are the only options available.
As HTTP(S) load balancing requires a custom certificate, this reference architecture uses TCP proxy load balancing to simplify the process.
TCP proxy load balancing only allows port 443 for HTTPS, therefore the Red Hat OpenShift Container Platform API is exposed on port 443 instead of 8443.
Checking router health properly (requesting /healthz status) is incompatible with HTTP(S) load balancing, SSL proxy load balancing, and TCP proxy load balancing, so network load balancing is used for load balancing the routers.
For more information, see Google Cloud Platform documentation on Load Balancing https://cloud.google.com/compute/docs/load-balancing-and-autoscaling
1.1.5. Google Cloud Platform Compute Engine
Instances, images and disks are the building blocks of the cloud computing IaaS model. Google Cloud Platform encapsulates these concepts in the Google Cloud Platform Compute Engine service.
1.1.5.1. Images
Google Cloud Platform images are preconfigured operating system images used to create boot disks for instances. Google Cloud Platform provides public images maintained by Google or third-party vendors, and customized images can be built and uploaded to Google Cloud Platform.
Red Hat offers two possibilities to run Red Hat Enterprise Linux on certified cloud providers such as Google. There is an "hourly" option where the fee for the instance includes the Red Hat subscription; first- and second-level support for Red Hat Enterprise Linux is usually provided by the cloud provider. Those images are assembled by Google and need an additional agreement with Google for full technical support. Red Hat provides third-level support and the usual benefits of access to errata and documentation. Using such images gives the possibility to burst out into the cloud without the need to manage the fluctuating demand for subscriptions.
The second option is to use existing subscriptions through the Cloud Access program, paying the cloud provider only for the bare system. Specifically for Google, the image has to be assembled by the customer and uploaded to the Google cloud. Support is provided directly by Red Hat under the conditions defined by the subscription. Subscriptions not visible in the Cloud Access enrollment form are not eligible for this program.
Following this implementation, the Cloud Access program is used by creating a Red Hat Enterprise Linux image and uploading it as a Google Cloud Platform custom image.
For more information, see Google Cloud Platform documentation on Images https://cloud.google.com/compute/docs/images and Operating System Details
1.1.5.2. Metadata
Google Cloud Platform metadata consists of key:value pairs of information used by Google Cloud Platform instances.
Metadata can be assigned at both the project and instance level. Project-level metadata propagates to all virtual machine instances within the project, while instance-level metadata only affects that instance.
Google Cloud Platform metadata is also used to store the SSH keys that are injected into the instances at boot time to allow SSH access.
For more information, see Google Cloud Platform documentation on Metadata https://cloud.google.com/compute/docs/storing-retrieving-metadata
1.1.5.3. Instances
Instances are virtual machines created from Google Cloud Platform images running in Google Cloud datacenters in different regions or zones.
Following this implementation, Red Hat Enterprise Linux instances are created to host Red Hat OpenShift Container Platform components.
For more information, see Google Cloud Platform documentation on Instances https://cloud.google.com/compute/docs/instances/
1.1.5.4. Instance sizes
A successful Red Hat OpenShift Container Platform environment has minimum hardware requirements. The following table shows the default instance sizes used in this implementation:
Table 1.1. Instance sizes
| Role | Size |
|---|---|
| Bastion | |
| Master | |
| Infrastructure node | |
| Application node | |
For more information about instance sizes, see https://cloud.google.com/compute/docs/machine-types and Red Hat OpenShift Container Platform Minimum Hardware Requirements
1.1.5.5. Storage Options
By default, each instance has a small root persistent disk that contains the operating system. When applications running on the instance require more storage space, additional storage options can be added to the instance:
- Standard Persistent Disks
- SSD Persistent Disks
- Local SSDs
- Cloud Storage Buckets
Following this implementation and for performance reasons, "SSD Persistent Disks" are used.
For more information, see Google Cloud Platform documentation on Storage Options https://cloud.google.com/compute/docs/disks/
1.1.6. Cloud Storage
Google Cloud Platform provides object storage that is used by Red Hat OpenShift Container Platform to store container images through the Red Hat OpenShift Container Platform container registry.
Using the installation methods described in this document, the Red Hat OpenShift Container Platform container registry is deployed using a Google Cloud Storage (GCS) bucket.
For more information, see Google Cloud Platform documentation on Cloud Storage https://cloud.google.com/storage/docs/
1.2. Bastion Instance
Best practices recommend minimizing attack vectors into a system by exposing only those services required by consumers of the system. In the event of failure or a need for manual configuration, systems administrators require further access to internal components in the form of secure administrative back-doors.
In the case of Red Hat OpenShift Container Platform running in a cloud provider context, the entry points to the Red Hat OpenShift Container Platform infrastructure such as the API, Web Console and routers are the only services exposed to the outside. The systems administrators' access from the public network space to the private network is possible with the use of a bastion instance.
A bastion instance is a non-OpenShift instance accessible from outside of the Red Hat OpenShift Container Platform environment, configured to allow remote access via secure shell (ssh). To remotely access an instance, the systems administrator first accesses the bastion instance, then "jumps" via another ssh connection to the intended OpenShift instance. The bastion instance may be referred to as a "jump host".
As the bastion instance can access all internal instances, it is recommended to take extra measures to harden this instance’s security. For more information on hardening the bastion instance, see the official Guide to Securing Red Hat Enterprise Linux 7
Depending on the environment, the bastion instance may be an ideal candidate for running administrative tasks such as the Red Hat OpenShift Container Platform installation playbooks. This reference environment uses the bastion instance for the installation of the Red Hat OpenShift Container Platform.
1.3. Red Hat OpenShift Container Platform Components
Red Hat OpenShift Container Platform comprises multiple instances running on Google Cloud Platform that allow for scheduled and configured OpenShift services and supplementary containers. These containers can have persistent storage, if required by the application, and integrate with optional OpenShift services such as logging and metrics.
1.3.1. OpenShift Instances
Instances running the Red Hat OpenShift Container Platform environment run the atomic-openshift-node service that allows for the container orchestration of scheduling pods. The following sections describe the different instance types and their roles in a Red Hat OpenShift Container Platform solution.
The node instances run containers on behalf of their users and are separated into two functional classes: infrastructure and application nodes. The infrastructure (infra) nodes run the OpenShift router, OpenShift logging, OpenShift metrics, and the OpenShift registry, while the application (app) nodes host the user container processes.
The Red Hat OpenShift Container Platform SDN requires Master instances to be considered nodes, therefore, all nodes run the atomic-openshift-node service.
The routers are an important component of Red Hat OpenShift Container Platform as they are the entry point to applications running in Red Hat OpenShift Container Platform. Developers can expose their applications running in Red Hat OpenShift Container Platform to the outside of the Red Hat OpenShift Container Platform SDN by creating routes that are published in the routers. Once traffic reaches the router, the router forwards it to the containers on the app nodes using the Red Hat OpenShift Container Platform SDN.
In this reference architecture a set of three routers are deployed on the infra nodes for high availability purposes. In order to have a single entry point for applications, a Google Cloud Platform load balancer is created for load balancing the incoming traffic to the routers running in the infra nodes.
Table 1.2. Node Instance types
| Type | Node role |
|---|---|
| Master | Hosts the control plane components such as the API server, controller manager, and web console |
| Infrastructure node | Hosts Red Hat OpenShift Container Platform infrastructure pods (registry, router, logging and metrics) |
| Application node | Hosts applications deployed on Red Hat OpenShift Container Platform |
1.3.1.1. Master Instances
The master instances contain the following core Red Hat OpenShift Container Platform components:
Table 1.3. Master Components1
| Component | Description |
|---|---|
| API Server | The Kubernetes API server validates and configures the data for pods, services, and replication controllers. It also assigns pods to nodes and synchronizes pod information with service configuration. |
| etcd | etcd stores the persistent master state while other components watch etcd for changes to bring themselves into the desired state. |
| Controller Manager Server | The controller manager server watches etcd for changes to replication controller objects and then uses the API to enforce the desired state. |
When using the native high availability method, master components have the following availability.
Table 1.4. Availability Matrix1
| Role | Style | Notes |
|---|---|---|
| API Server | Active-Active | Managed by Google Cloud Platform load balancer |
| Controller Manager Server | Active-Passive | One instance is elected as a cluster leader at a time |
1: OpenShift Documentation - Kubernetes Infrastructure
The master instances are considered nodes as well and run the atomic-openshift-node service.
For optimal performance, the etcd service should run on the master instances. When co-locating etcd with the master nodes, at least three instances are required. In order to have a single entry point for the API, the master nodes should be deployed behind a load balancer.
In order to create master instances with labels, set the following in the inventory file as:
... [OUTPUT ABBREVIATED] ...
[etcd]
master1.example.com
master2.example.com
master3.example.com
[masters]
master1.example.com
master2.example.com
master3.example.com
[nodes]
master1.example.com openshift_node_labels="{'region': 'master', 'masterlabel2': 'value2'}"
master2.example.com openshift_node_labels="{'region': 'master', 'masterlabel2': 'value2'}"
master3.example.com openshift_node_labels="{'region': 'master', 'masterlabel2': 'value2'}"
Ensure the openshift_web_console_nodeselector Ansible variable value matches a master node label in the inventory file. By default, the web console is deployed to the masters.
See the official Red Hat OpenShift Container Platform documentation for a detailed explanation on master nodes.
1.3.1.2. Infrastructure Instances
The infrastructure instances run the atomic-openshift-node service and host the Red Hat OpenShift Container Platform components such as the registry, Prometheus, and Hawkular metrics. The infrastructure instances also run the Elasticsearch, Fluentd, and Kibana (EFK) containers for aggregated logging. Persistent storage should be available to the services running on these nodes.
Depending on environment requirements at least three infrastructure nodes are required to provide a sharded/highly available aggregated logging service and to ensure that service interruptions do not occur during a reboot.
For more infrastructure considerations, visit the official Red Hat OpenShift Container Platform documentation.
When creating infrastructure instances with labels, set the following in the inventory file as:
... [OUTPUT ABBREVIATED] ...
[nodes]
infra1.example.com openshift_node_labels="{'region': 'infra', 'infralabel1': 'value1'}"
infra2.example.com openshift_node_labels="{'region': 'infra', 'infralabel1': 'value1'}"
infra3.example.com openshift_node_labels="{'region': 'infra', 'infralabel1': 'value1'}"
The router and registry pods are automatically scheduled on nodes with the label 'region': 'infra'.
1.3.1.3. Application Instances
The Application (app) instances run the atomic-openshift-node service. These nodes should be used to run containers created by the end users of the OpenShift service.
When creating node instances with labels, set the following in the inventory file as:
... [OUTPUT ABBREVIATED] ...
[nodes]
node1.example.com openshift_node_labels="{'region': 'primary', 'nodelabel2': 'value2'}"
node2.example.com openshift_node_labels="{'region': 'primary', 'nodelabel2': 'value2'}"
node3.example.com openshift_node_labels="{'region': 'primary', 'nodelabel2': 'value2'}"
1.3.2. etcd
etcd is a consistent and highly-available key value store used as Red Hat OpenShift Container Platform’s backing store for all cluster data. etcd stores the persistent master state while other components watch etcd for changes to bring themselves into the desired state.
Since values stored in etcd are critical to the function of Red Hat OpenShift Container Platform, firewalls should be implemented to limit the communication with etcd nodes. Inter-cluster and client-cluster communication is secured by utilizing x509 Public Key Infrastructure (PKI) key and certificate pairs.
etcd uses the RAFT algorithm to gracefully handle leader elections during network partitions and the loss of the current leader. For a highly available Red Hat OpenShift Container Platform deployment, an odd number of etcd instances (starting with three) is required.
1.3.3. Labels
Labels are key/value pairs attached to objects such as pods. They are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users but do not directly imply semantics to the core system. Labels can also be used to organize and select subsets of objects. Each object can have a set of labels defined at creation time or subsequently added and modified at any time.
Each key must be unique for a given object.
"labels": {
"key1" : "value1",
"key2" : "value2"
}
Labels are indexed and reverse-indexed for efficient queries, watches, sorting, and grouping in UIs and CLIs. Labels should not be polluted with non-identifying, large, and/or structured data. Non-identifying information should instead be recorded using annotations.
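As a minimal illustration of this distinction (the object name, label keys, and annotation key below are hypothetical), identifying attributes are placed under labels while free-form notes go into annotations:
apiVersion: v1
kind: Pod
metadata:
  name: labels-example
  labels:
    region: primary
    tier: frontend
  annotations:
    example.com/notes: "Free-form, non-identifying information about this pod"
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/rhel7/rhel-tools
    command: ["sleep", "infinity"]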
1.3.3.1. Labels as Alternative Hierarchy
Service deployments and batch processing pipelines are often multi-dimensional entities (e.g., multiple partitions or deployments, multiple release tracks, multiple tiers, multiple micro-services per tier). Management of these deployments often requires cutting across the encapsulation of strictly hierarchical representations—especially those rigid hierarchies determined by the infrastructure rather than by users. Labels enable users to map their own organizational structures onto system objects in a loosely coupled fashion, without requiring clients to store these mappings.
Example labels:
{"release" : "stable", "release" : "canary"}
{"environment" : "dev", "environment" : "qa", "environment" : "production"}
{"tier" : "frontend", "tier" : "backend", "tier" : "cache"}
{"partition" : "customerA", "partition" : "customerB"}
{"track" : "daily", "track" : "weekly"}These are just examples of commonly used labels; the ability exists to develop specific conventions that best suit the deployed environment.
1.3.3.2. Labels as Node Selector
Node labels can be used as node selectors, where different nodes are labeled for different use cases. The typical use case is to have nodes running Red Hat OpenShift Container Platform infrastructure components such as the registry, routers, metrics, or logging components, named "infrastructure nodes", to differentiate them from nodes dedicated to running user applications. Following this use case, the administrator can label the "infrastructure nodes" with the label "region=infra" and the application nodes with "region=app". Other uses include classifying nodes by hardware, with labels such as "type=gold", "type=silver", or "type=bronze".
The scheduler can be configured to use node labels to assign pods to nodes. When it makes sense for certain pods to run only on certain types of nodes, a node selector can be set to specify which labels are used to assign pods to nodes, as sketched below.
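As a sketch of how a node selector is expressed on a pod, the hypothetical example below requests scheduling on nodes carrying the 'region': 'primary' label used for the application nodes in the inventory example in Section 1.3.1.3 (pod name and image are illustrative only):
apiVersion: v1
kind: Pod
metadata:
  name: nodeselector-example
spec:
  nodeSelector:
    region: primary
  containers:
  - name: app
    image: registry.access.redhat.com/rhel7/rhel-tools
    command: ["sleep", "infinity"]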
1.4. Software Defined Networking
Red Hat OpenShift Container Platform offers the ability to specify how pods communicate with each other. This could be through the use of Red Hat provided Software-defined networks (SDN) or a third-party SDN.
Deciding on the appropriate internal network for a Red Hat OpenShift Container Platform environment is a crucial step. Unfortunately, there is no single right answer regarding the appropriate pod network to choose, as this varies based upon the specific scenario requirements for how the Red Hat OpenShift Container Platform environment is to be used.
For the purposes of this reference environment, the Red Hat OpenShift Container Platform ovs-networkpolicy SDN plug-in is chosen due to its ability to provide pod isolation using Kubernetes NetworkPolicy. The following section, “OpenShift SDN Plugins”, discusses important details when deciding between the three popular options for the internal networks - ovs-multitenant, ovs-networkpolicy and ovs-subnet.
1.4.1. OpenShift SDN Plugins
This section focuses on multiple plugins for pod communication within Red Hat OpenShift Container Platform using OpenShift SDN. The three plugin options are listed below.
- ovs-subnet - the original plugin that provides an overlay network created to allow pod-to-pod communication and services. This pod network is created using Open vSwitch (OVS).
- ovs-multitenant - a plugin that provides an overlay network that is configured using OVS, similar to the ovs-subnet plugin, however, unlike ovs-subnet, it provides Red Hat OpenShift Container Platform project-level isolation for pods and services.
- ovs-networkpolicy - a plugin that provides an overlay network that is configured using OVS and gives Red Hat OpenShift Container Platform administrators the ability to configure specific isolation policies using NetworkPolicy objects1.
Network isolation is important: which OpenShift SDN should be chosen?
With the above, this leaves two OpenShift SDN options: ovs-multitenant and ovs-networkpolicy. The reason ovs-subnet is ruled out is that it does not provide network isolation.
While both ovs-multitenant and ovs-networkpolicy provide network isolation, the optimal choice comes down to what type of isolation is required. The ovs-multitenant plugin provides project-level isolation for pods and services. This means that pods and services from different projects cannot communicate with each other.
On the other hand, ovs-networkpolicy solves network isolation by providing project administrators the flexibility to create their own network policies using Kubernetes NetworkPolicy objects. This means that by default all pods in a project are accessible from other pods and network endpoints until NetworkPolicy objects are created. This in turn may allow pods from separate projects to communicate with each other assuming the appropriate NetworkPolicy is in place.
The level of isolation required should determine the appropriate choice when deciding between ovs-multitenant and ovs-networkpolicy.
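As a brief sketch of how ovs-networkpolicy isolation is expressed, the following hypothetical NetworkPolicy object restricts ingress so that pods in a project can only be reached by other pods in the same project; the object name is arbitrary and real policies depend on the isolation requirements:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}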
1.5. Container Storage
Container images are stored locally on the nodes running Red Hat OpenShift Container Platform pods. The container-storage-setup script uses the /etc/sysconfig/docker-storage-setup file to specify the storage configuration.
Using this reference architecture, container storage is configured at instance creation by creating a dedicated XFS filesystem on a dedicated disk attached to the instance. The prerequisites playbook executes container-storage-setup to configure the overlay2 storage driver on top of the dedicated XFS filesystem.
1.6. Persistent Storage
Containers by default offer ephemeral storage but some applications require the storage to persist between different container deployments or upon container migration. Persistent Volume Claims (PVC) are used to store the application data. These claims can either be added into the environment by hand or provisioned dynamically using a StorageClass object.
1.6.1. Storage Classes
The StorageClass resource object describes and classifies different types of storage that can be requested, as well as provides a means for passing parameters to the backend for dynamically provisioned storage on demand. StorageClass objects can also serve as a management mechanism for controlling different levels of storage and access to the storage. Cluster Administrators (cluster-admin) or Storage Administrators (storage-admin) define and create the StorageClass objects that users can use without needing any intimate knowledge about the underlying storage volume sources. Because of this, the name of the storage class defined in the StorageClass object should be useful in understanding the type of storage it maps to, whether that is storage from Google Cloud Platform or from glusterfs if deployed.
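For example, a cluster administrator could define a StorageClass backed by Google Cloud Platform SSD persistent disks using the GCE persistent disk provisioner; this is a sketch and the class name is arbitrary:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
Users can then request storage by referencing the class name, without needing to know that it maps to SSD persistent disks.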
1.6.1.1. Persistent Volumes
Persistent volumes (PV) provide pods with non-ephemeral storage by configuring and encapsulating underlying storage sources. A persistent volume claim (PVC) abstracts an underlying PV to provide provider agnostic storage to OpenShift resources. A PVC, when successfully fulfilled by the system, mounts the persistent storage to a specific directory (mountPath) within one or more pods. From the container point of view, the mountPath is connected to the underlying storage mount points by a bind-mount.
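A minimal sketch of a claim against such a class, together with a pod that mounts it at a mountPath, could look like the following (names, size, and image are illustrative, and the "ssd" class refers to the hypothetical StorageClass sketched in the previous section):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: ssd
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/rhel7/rhel-tools
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: data
      mountPath: /var/lib/app-data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: app-data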
1.7. Registry
OpenShift can build container images from source code, deploy them, and manage their lifecycle. To enable this, OpenShift provides an internal, integrated registry that can be deployed in the OpenShift environment to manage images.
The registry stores images and metadata. For production environments, persistent storage should be used for the registry; otherwise any images that were built or pushed into the registry would disappear if the pod were to restart.
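As a sketch of what object-storage backing looks like, the integrated registry follows the upstream Docker registry configuration format, where the storage section can point at a Google Cloud Storage bucket; the bucket name and root directory below are placeholders, and in this reference architecture the configuration is typically applied through the installer rather than edited by hand:
version: 0.1
storage:
  gcs:
    bucket: ocp-registry-bucket
    rootdirectory: /registry
http:
  addr: :5000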
1.8. Aggregated Logging
One of the optional Red Hat OpenShift Container Platform components, aggregated logging, collects and aggregates logs from the pods running in the Red Hat OpenShift Container Platform cluster as well as /var/log/messages on nodes, enabling Red Hat OpenShift Container Platform users to view the logs of projects to which they have view access using a web interface.
The Red Hat OpenShift Container Platform aggregated logging component is a modified version of the ELK stack, composed of a few pods running in the Red Hat OpenShift Container Platform environment:
- Elasticsearch: An object store where all logs are stored.
- Kibana: A web UI for Elasticsearch.
- Curator: Elasticsearch maintenance operations performed automatically on a per-project basis.
- Fluentd: Gathers logs from nodes and containers and feeds them to Elasticsearch.
Fluentd can be configured to send a copy of the logs to a different log aggregator and/or to a different Elasticsearch cluster, see Red Hat OpenShift Container Platform documentation for more information.
Once deployed in the cluster, Fluentd (deployed as a DaemonSet on any node with the right labels) gathers logs from all nodes and containers, enriches the log documents with useful metadata (e.g. namespace, container_name, node), and forwards them into Elasticsearch, where Kibana provides a web interface for users to view the logs. Cluster administrators can view all logs, but application developers can only view logs for projects they have permission to view. To prevent users from seeing logs from pods in other projects, the Search Guard plugin for Elasticsearch is used.
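As a schematic illustration (field names abbreviated and values invented, not the exact document schema), a log record stored in Elasticsearch after Fluentd enrichment carries the originating project, pod, container, and node alongside the message:
message: "GET /healthz HTTP/1.1 200"
hostname: "infra1.example.com"
kubernetes:
  namespace_name: "myproject"
  pod_name: "myapp-1-abcde"
  container_name: "myapp"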
A separate Elasticsearch cluster, Kibana, and Curator can be deployed to form an OPS cluster, where Fluentd sends logs from the default, openshift, and openshift-infra projects, as well as /var/log/messages on nodes, into this separate cluster. If the OPS cluster is not deployed, those logs are hosted in the regular aggregated logging cluster.
Red Hat OpenShift Container Platform aggregated logging components can be customized for longer data persistence, pod limits, replicas of individual components, custom certificates, and so on. The customization is provided by Ansible variables as part of the deployment process.
The OPS cluster can be customized as well using the same variables with the ops suffix, as in openshift_logging_es_ops_pvc_size.
For more information about different customization parameters, see Aggregating Container Logs documentation.
Basic concepts for aggregated logging
- Cluster: Set of Elasticsearch nodes distributing the workload
- Node: Container running an instance of Elasticsearch, part of the cluster.
- Index: Collection of documents (container logs)
- Shards and Replicas: Indices can be split into sets of data containing the primary copy of the stored documents (primary shards) or backups of those primary copies (replica shards). Sharding allows the application to horizontally scale the information and distribute/parallelize operations. Replication provides high availability and also better search throughput, as searches are also executed on replicas.
Using NFS storage as a volume or a persistent volume (or via NAS such as Gluster) is not supported for Elasticsearch storage, as Lucene relies on file system behavior that NFS does not supply. Data corruption and other problems can occur.
By default, every Elasticsearch pod of the Red Hat OpenShift Container Platform aggregated logging components has the role of Elasticsearch master and Elasticsearch data node. If only two Elasticsearch pods are deployed and one of the pods fails, all logging stops until the second master returns, so there is no availability advantage to deploying two Elasticsearch pods.
Elasticsearch shards require their own storage, but a Red Hat OpenShift Container Platform deploymentconfig shares storage volumes between all its pods, therefore every Elasticsearch pod is deployed using a different deploymentconfig and cannot be scaled using oc scale. In order to scale the aggregated logging Elasticsearch replicas after the first deployment, it is required to modify openshift_logging_es_cluster_size in the inventory file and re-run the openshift-logging.yml playbook.
Below is an example of some of the best practices when deploying Red Hat OpenShift Container Platform aggregated logging. Elasticsearch, Kibana, and Curator are deployed on nodes with the label of "region=infra". Specifying the node label ensures that the Elasticsearch and Kibana components are not competing with applications for resources. A highly available environment for Elasticsearch is deployed to avoid data loss; therefore, at least three Elasticsearch replicas are deployed and the openshift_logging_es_number_of_replicas parameter is configured to be at least 1. The settings below would be defined in a variable file or static inventory.
openshift_logging_install_logging=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=100Gi
openshift_logging_es_cluster_size=3
openshift_logging_es_nodeselector={"region":"infra"}
openshift_logging_kibana_nodeselector={"region":"infra"}
openshift_logging_curator_nodeselector={"region":"infra"}
openshift_logging_es_number_of_replicas=1
1.9. Aggregated Metrics
Red Hat OpenShift Container Platform has the ability to gather metrics from kubelet and store the values in Heapster. Red Hat OpenShift Container Platform Metrics provide the ability to view CPU, memory, and network-based metrics and display the values in the user interface. These metrics can allow for the horizontal autoscaling of pods based on parameters provided by a Red Hat OpenShift Container Platform user. It is important to understand capacity planning when deploying metrics into a Red Hat OpenShift Container Platform environment.
Red Hat OpenShift Container Platform metrics is composed of a few pods running in the Red Hat OpenShift Container Platform environment:
- Heapster: Heapster scrapes the metrics for CPU, memory and network usage on every pod, then exports them into Hawkular Metrics.
- Hawkular Metrics: A metrics engine that stores the data persistently in a Cassandra database.
- Cassandra: Database where the metrics data is stored.
Red Hat OpenShift Container Platform metrics components can be customized for longer data persistence, pod limits, replicas of individual components, custom certificates, and so on. The customization is provided by Ansible variables as part of the deployment process.
As a best practice when metrics are deployed, persistent storage should be used to allow for metrics to be preserved. Node selectors should be used to specify where the metrics components should run. In the reference architecture environment, the components are deployed on nodes with the label of "region=infra".
openshift_metrics_install_metrics=True
openshift_metrics_storage_volume_size=20Gi
openshift_metrics_cassandra_storage_type=dynamic
openshift_metrics_hawkular_nodeselector={"region":"infra"}
openshift_metrics_cassandra_nodeselector={"region":"infra"}
openshift_metrics_heapster_nodeselector={"region":"infra"}
1.10. Container-Native Storage (Optional)
Container-Native Storage (CNS) provides dynamically provisioned storage for containers on Red Hat OpenShift Container Platform across cloud providers, virtual and bare-metal deployments. CNS relies on block devices available on the OpenShift nodes and uses software-defined storage provided by Red Hat Gluster Storage. CNS runs Red Hat Gluster Storage containerized, allowing OpenShift storage pods to spread across the cluster and across different data centers if latency is low between them. CNS enables the requesting and mounting of Gluster storage across one or many containers with access modes of either ReadWriteMany(RWX), ReadOnlyMany(ROX) or ReadWriteOnce(RWO). CNS can also be used to host the OpenShift registry.
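As a sketch of how CNS-backed storage is typically consumed, a StorageClass can use the GlusterFS provisioner, pointing at the REST endpoint of the management service that provisions Gluster volumes; the URL, secret, and class name below are placeholders:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi-storage-project.example.com:8080"
  restuser: "admin"
  secretNamespace: "default"
  secretName: "heketi-secret"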
1.10.1. Prerequisites for Container-Native Storage
Deployment of Container-Native Storage (CNS) on OpenShift Container Platform (OCP) requires at least three OpenShift nodes with at least one 100GB unused block storage device attached on each of the nodes. Dedicating three OpenShift nodes to CNS allows for the configuration of one StorageClass object to be used for applications.
If the CNS instances serve dual roles such as hosting application pods and glusterfs pods, ensure the instances have enough resources to support both operations. CNS hardware requirements state that there must be 32GB of RAM per instance.
1.10.2. Firewall and Security Group Prerequisites
The following ports must be open to properly install and maintain CNS.
The nodes used for CNS also need all of the standard ports an OpenShift node would need.
Table 1.5. CNS - Inbound
| Port/Protocol | Services | Remote source | Purpose |
|---|---|---|---|
| 111/TCP | Gluster | Gluster Nodes | Portmap |
| 111/UDP | Gluster | Gluster Nodes | Portmap |
| 2222/TCP | Gluster | Gluster Nodes | CNS communication |
| 3260/TCP | Gluster | Gluster Nodes | Gluster Block |
| 24007/TCP | Gluster | Gluster Nodes | Gluster Daemon |
| 24008/TCP | Gluster | Gluster Nodes | Gluster Management |
| 24010/TCP | Gluster | Gluster Nodes | Gluster Block |
| 49152-49664/TCP | Gluster | Gluster Nodes | Gluster Client Ports |

Where did the comment section go?
Red Hat's documentation publication system recently went through an upgrade to enable speedier, more mobile-friendly content. We decided to re-evaluate our commenting platform to ensure that it meets your expectations and serves as an optimal feedback mechanism. During this redesign, we invite your input on providing feedback on Red Hat documentation via the discussion platform.