Chapter 4. Additional Concepts

4.1. Networking

Kubernetes ensures that pods are able to network with each other, and allocates each pod an IP address from an internal network. This ensures all containers within the pod behave as if they were on the same host. Giving each pod its own IP address means that pods can be treated like physical hosts or virtual machines in terms of port allocation, networking, naming, service discovery, load balancing, application configuration, and migration.

Creating links between pods is unnecessary. However, it is not recommended that pods talk to each other directly using their IP addresses. Instead, we recommend that you create a service, then interact with the service.

4.1.1. OpenShift DNS

If you are running multiple services, such as frontend and backend services used by multiple pods, environment variables are created for user names, service IPs, and more so that the frontend pods can communicate with the backend services. If the service is deleted and recreated, a new IP address can be assigned to the service, which requires the frontend pods to be recreated to pick up the updated value of the service IP environment variable. Additionally, the backend service must be created before any of the frontend pods to ensure that the service IP is generated properly and can be provided to the frontend pods as an environment variable.

For this reason, OpenShift has a built-in DNS so that services can be reached by the service DNS name as well as the service IP and port. OpenShift supports split DNS by running SkyDNS on the master, which answers DNS queries for services. The master listens on port 53 by default.

When the node starts, the following messages indicate that the Kubelet correctly resolved the master:

I0308 19:51:03.118430    4484 node.go:197] Started Kubelet for node
openshiftdev.local, server at 0.0.0.0:10250
I0308 19:51:03.118459    4484 node.go:199]   Kubelet is setting 10.0.2.15 as a
DNS nameserver for domain "local"

If the second message does not appear, the Kubernetes service may not be available.

On a node host, the master is added to the front of each Docker container’s nameserver list, and the default search domain for the container is .<pod_namespace>.cluster.local. The container then directs any nameserver queries to the master before any other nameservers on the node, which is the default Docker behavior. The master answers queries on the .cluster.local domain that have the following form:

Table 4.1. DNS Example Names

Object Type    Example

Default        <pod_namespace>.cluster.local
Services       <service>.<pod_namespace>.svc.cluster.local
Endpoints      <name>.<namespace>.endpoints.cluster.local

This prevents frontend pods from having to be restarted in order to pick up a new service IP when a service is recreated. It also removes the need to use environment variables, as pods can use the service DNS name instead. Additionally, because the DNS name does not change, you can reference database services as db.local in configuration files. Wildcard lookups are also supported, as any lookup resolves to the service IP. This removes the need to create the backend service before any of the frontend pods, since the service name (and hence its DNS name) is established up front.

This DNS structure also covers headless services, where a portal IP is not assigned to the service and the kube-proxy does not load-balance or provide routing for its endpoints. Service DNS can still be used and responds with multiple A records, one for each pod of the service, allowing the client to round-robin between each pod.
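
For example, assuming a service named db in a project named myproject (both names are illustrative), a frontend pod can resolve the service without knowing its IP, using any container image that includes a DNS client such as dig:

$ dig +short db.myproject.svc.cluster.local    # returns the current service IP

Because the name is stable, application configuration can keep referencing the service by name even if the service is deleted and recreated.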

4.1.2. OpenShift SDN

OpenShift deploys a software-defined networking (SDN) approach for connecting Docker containers in an OpenShift cluster. The OpenShift SDN connects all containers across all node hosts, providing a unified cluster network.

OpenShift SDN is automatically installed and configured as part of the Ansible-based installation procedure. Further administration should not be required; however, further details on the design and operation of OpenShift SDN are provided for those who are curious or need to troubleshoot problems.

4.2. OpenShift SDN

4.2.1. Overview

OpenShift uses a software-defined networking (SDN) approach to provide a unified cluster network that enables communication between containers across the OpenShift cluster. This cluster network is established and maintained by the OpenShift SDN, which configures an overlay network using Open vSwitch (OVS).

OpenShift SDN includes the ovssubnet SDN plug-in for configuring the network, which provides a "flat" pod network where every pod can communicate with every other pod and service.

Following is a detailed discussion of the design and operation of OpenShift SDN, which may be useful for troubleshooting.

4.2.2. Design on Masters

On an OpenShift master, OpenShift SDN maintains a registry of nodes, stored in etcd. When the system administrator registers a node, OpenShift SDN allocates an unused subnet from the cluster network and stores this subnet in the registry. When a node is deleted, OpenShift SDN deletes the subnet from the registry and considers the subnet available to be allocated again.

In the default configuration, the cluster network is the 10.1.0.0/16 network, and nodes are allocated /24 subnets (i.e., 10.1.0.0/24, 10.1.1.0/24, 10.1.2.0/24, and so on). This means that the cluster network has 256 subnets available to assign to nodes, and a given node is allocated 254 addresses that it can assign to the containers running on it. The size and address range of the cluster network are configurable, as is the host subnet size.
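
These defaults are typically set in the master configuration file. The following is only a sketch, assuming a release whose master configuration exposes a networkConfig stanza; the exact field names and file location can vary between releases:

networkConfig:
  clusterNetworkCIDR: 10.1.0.0/16   # address range that pod IPs are allocated from
  hostSubnetLength: 8               # 8 host bits per node, i.e., the /24 subnets described above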

Note that OpenShift SDN on a master does not configure the local (master) host to have access to any cluster network. Consequently, a master host does not have access to containers via the cluster network, unless it is also running as a node.

4.2.3. Design on Nodes

On a node, OpenShift SDN first registers the local host with the SDN master in the aforementioned registry so that the master allocates a subnet to the node.

Next, OpenShift SDN creates and configures six network devices:

  • br0, the OVS bridge device that containers will be attached to. OpenShift SDN also configures a set of non-subnet-specific flow rules on this bridge. The ovssubnet plug-in waits to do so until the SDN master announces the creation of the new node subnet.
  • lbr0, a Linux bridge device, which is configured as Docker’s bridge and given the cluster subnet gateway address (e.g., 10.1.x.1/24).
  • tun0, an OVS internal port (port 2 on br0). This also gets assigned the cluster subnet gateway address, and is used for external network access. OpenShift SDN configures netfilter and routing rules to enable access from the cluster subnet to the external network via NAT.
  • vlinuxbr and vovsbr, two Linux peer virtual Ethernet interfaces. vlinuxbr is added to lbr0 and vovsbr is added to br0 (port 9), to provide connectivity for containers created directly with Docker outside of OpenShift.
  • vxlan0, the OVS VXLAN device (port 1 on br0), which provides access to containers on remote nodes.

Each time a pod is started on the host, OpenShift SDN:

  1. moves the host side of the pod’s veth interface pair from the lbr0 bridge (where Docker placed it when starting the container) to the OVS bridge br0.
  2. adds OpenFlow rules to the OVS database to route traffic addressed to the new pod to the correct OVS port.

The pod is allocated an IP address in the cluster subnet by Docker itself, because Docker is told to use the lbr0 bridge, to which OpenShift SDN has assigned the cluster gateway address (e.g., 10.1.x.1/24). Note that tun0 is also assigned the cluster gateway IP address because it is the default gateway for all traffic destined for external networks, but these two interfaces do not conflict because the lbr0 interface is only used for IPAM and no OpenShift SDN pods are connected to it.

OpenShift SDN nodes also watch for subnet updates from the SDN master. When a new subnet is added, the node adds OpenFlow rules on br0 so that packets with a destination IP address in the remote subnet go to vxlan0 (port 1 on br0) and thus out onto the network.

4.2.3.1. Packet Flow

Suppose we have two containers A and B where the peer virtual Ethernet device for container A’s eth0 is named vethA and the peer for container B’s eth0 is named vethB.

Note

If Docker’s use of peer virtual Ethernet devices is not already familiar to you, review Docker’s advanced networking documentation.

Now suppose first that container A is on the local host and container B is also on the local host. Then the flow of packets from container A to container B is as follows:

eth0 (in A’s netns) → vethA → br0 → vethB → eth0 (in B’s netns)

Next, suppose instead that container A is on the local host and container B is on a remote host on the cluster network. Then the flow of packets from container A to container B is as follows:

eth0 (in A’s netns) → vethA → br0 → vxlan0 → network [1] → vxlan0 → br0 → vethB → eth0 (in B’s netns)

Finally, if container A connects to an external host, the traffic looks like:

eth0 (in A’s netns) → vethA → br0 → tun0 → (NAT) → eth0 (physical device) → Internet

Almost all packet delivery decisions are performed with OpenFlow rules in the OVS bridge br0, which simplifies the plug-in network architecture and provides flexible routing.
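
For troubleshooting, the standard Open vSwitch tools can be used on a node to confirm this wiring; depending on the OpenFlow version configured on br0, ovs-ofctl may need an explicit -O option:

$ ovs-vsctl list-ports br0     # lists vxlan0, tun0, vovsbr, and one veth port per pod
$ ovs-ofctl dump-flows br0     # shows the OpenFlow rules that implement the flows above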

4.2.3.2. External Access to the Cluster Network

If a host that is external to OpenShift requires access to the cluster network, you have two options:

  1. Configure the host as an OpenShift node but mark it unschedulable so that the master does not schedule containers on it.
  2. Create a tunnel between your host and a host that is on the cluster network.

Both options are presented as part of a practical use-case in the documentation for configuring routing from an edge load-balancer to containers within OpenShift SDN.

4.3. Authentication

4.3.1. Overview

The authentication layer identifies the user associated with requests to the OpenShift API. The authorization layer then uses information about the requesting user to determine if the request should be allowed.

As an administrator, you can configure authentication using a master configuration file.

4.3.2. Users and Groups

A user in OpenShift is an entity that can make requests to the OpenShift API. Typically, this represents the account of a developer or administrator that is interacting with OpenShift.

A user can be assigned to one or more groups, each of which represents a certain set of users. Groups are useful when managing authorization policies to grant permissions to multiple users at once, for example allowing access to objects within a project, versus granting them to users individually.

In addition to explicitly defined groups, there are also system groups, or virtual groups, that are automatically provisioned by OpenShift. These can be seen when viewing cluster bindings.

In the default set of virtual groups, note the following in particular:

Virtual Group             Description

system:authenticated      Automatically associated with any currently-authenticated users.
system:unauthenticated    Automatically associated with any currently-unauthenticated users.

4.3.3. API Authentication

Requests to the OpenShift API are authenticated using the following methods:

OAuth Access Tokens
  • Obtained from the OpenShift OAuth server using the <master>/oauth/authorize and <master>/oauth/token endpoints.
  • Sent as an Authorization: Bearer <token> header or an access_token=<token> query parameter.
X.509 Client Certificates
  • Requires an HTTPS connection to the API server.
  • Verified by the API server against a trusted certificate authority bundle.
  • The API server creates and distributes certificates to controllers to authenticate themselves.

Any request with an invalid access token or an invalid certificate is rejected by the authentication layer with a 401 error.

If no access token or certificate is presented, the authentication layer assigns the system:anonymous virtual user and the system:unauthenticated virtual group to the request. This allows the authorization layer to determine which requests, if any, an anonymous user is allowed to make.
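
As a minimal sketch, a user who is already logged in with the CLI can reuse their session token as a bearer token against the API. oc whoami -t is assumed to be available in your client version, and the API prefix (/osapi/v1, matching the selfLink values shown later in this chapter) and port 8443 may differ in your environment:

$ TOKEN=$(oc whoami -t)
$ curl -k -H "Authorization: Bearer $TOKEN" https://<master>:8443/osapi/v1/users/~

The same request with an invalid or missing token is handled as described above.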

4.3.4. OAuth

The OpenShift master includes a built-in OAuth server. Users obtain OAuth access tokens to authenticate themselves to the API.

When a person requests a new OAuth token, the OAuth server uses the configured identity provider to determine the identity of the person making the request.

It then determines what user that identity maps to, creates an access token for that user, and returns the token for use.

OAuth Clients

Every request for an OAuth token must specify the OAuth client that will receive and use the token. The following OAuth clients are automatically created when starting the OpenShift API:

OAuth Client                    Usage

openshift-web-console           Requests tokens for the web console.
openshift-browser-client        Requests tokens at <master>/oauth/token/request with a user-agent that can
                                handle interactive logins.
openshift-challenging-client    Requests tokens with a user-agent that can handle WWW-Authenticate challenges.

To register additional clients:

$ oc create -f <(echo '
{
  "kind": "OAuthClient",
  "apiVersion": "v1",
  "metadata": {
    "name": "demo" 1
  },
  "secret": "...", 2
  "redirectURIs": [
    "http://www.example.com/" 3
  ]
}')
1
The name of the OAuth client is used as the client_id parameter when making requests to <master>/oauth/authorize and <master>/oauth/token.
2
The secret is used as the client_secret parameter when making requests to <master>/oauth/token.
3
The redirect_uri parameter specified in requests to <master>/oauth/authorize and <master>/oauth/token must be equal to (or prefixed by) one of the URIs in redirectURIs.

Integrations

All requests for OAuth tokens involve a request to <master>/oauth/authorize. Most authentication integrations place an authenticating proxy in front of this endpoint, or configure OpenShift to validate credentials against a backing identity provider.

Requests to <master>/oauth/authorize can come from user-agents that cannot display interactive login pages, such as the CLI. Therefore, OpenShift supports authenticating using a WWW-Authenticate challenge in addition to interactive login flows.

If an authenticating proxy is placed in front of the <master>/oauth/authorize endpoint, it should send unauthenticated, non-browser user-agents WWW-Authenticate challenges, rather than displaying an interactive login page or redirecting to an interactive login flow.

Note

To prevent cross-site request forgery (CSRF) attacks against browser clients, Basic authentication challenges should only be sent if an X-CSRF-Token header is present on the request. Clients that expect to receive Basic WWW-Authenticate challenges should set this header to a non-empty value.

If the authenticating proxy cannot support WWW-Authenticate challenges, or if OpenShift is configured to use an identity provider that does not support WWW-Authenticate challenges, users can visit <master>/oauth/token/request using a browser to obtain an access token manually.

Obtaining OAuth Tokens

The OAuth server supports the standard authorization code grant and implicit grant OAuth authorization flows.

When requesting an OAuth token using the implicit grant flow (response_type=token) with a client_id configured to request WWW-Authenticate challenges (like openshift-challenging-client), these are the possible server responses from /oauth/authorize, and how they should be handled:

Status    Content                                            Client response

302       Location header containing an access_token         Use the access_token value as the OAuth token.
          parameter in the URL fragment (RFC 6749,
          section 4.2.2)

302       Location header containing an error query          Fail, optionally surfacing the error (and optional
          parameter (RFC 6749, section 4.1.2.1)              error_description) query values to the user.

302       Other Location header                              Follow the redirect, and process the result using
                                                             these rules.

401       WWW-Authenticate header present                    Respond to the challenge if the type is recognized
                                                             (e.g., Basic, Negotiate), resubmit the request, and
                                                             process the result using these rules.

401       WWW-Authenticate header missing                    No challenge authentication is possible. Fail and
                                                             show the response body, which might contain links
                                                             or details on alternate methods to obtain an OAuth
                                                             token.

Other     Other                                              Fail, optionally surfacing the response body to the
                                                             user.
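
As a sketch of the challenge flow, a non-browser client such as curl can drive the implicit grant directly; the user name, password, and master host below are placeholders, and the X-CSRF-Token header is set as described in the note above:

$ curl -u bob:password -k -H "X-CSRF-Token: 1" \
  'https://<master>:8443/oauth/authorize?client_id=openshift-challenging-client&response_type=token'

# A successful response is a 302 whose Location header carries access_token=...
# in the URL fragment; that value is then used as the bearer token.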

4.4. Authorization

4.4.1. Overview

Authorization policies determine whether a user is allowed to perform a given action within a project. This allows platform administrators to use the cluster policy to control who has various access levels to the OpenShift platform itself and all projects. It also allows developers to use local policy to control who has access to their projects. Note that authorization is a separate step from authentication, which is more about determining the identity of who is taking the action.

Authorization is managed using:

Rules

Sets of permitted verbs on a set of objects. For example, whether something can create pods.

Roles

Collections of rules. Users and groups can be associated with, or bound to, multiple roles at the same time.

Bindings

Associations between users and/or groups and a role.

Rules, roles, and bindings can be visualized using the CLI. For example, consider the following excerpt from viewing a policy, showing rule sets for the admin and basic-user default roles:

admin			Verbs					Resources															Resource Names	Extension
			[create delete get list update watch]	[projects resourcegroup:exposedkube resourcegroup:exposedopenshift resourcegroup:granter secrets]				[]
			[get list watch]			[resourcegroup:allkube resourcegroup:allkube-status resourcegroup:allopenshift-status resourcegroup:policy]			[]
basic-user		Verbs					Resources															Resource Names	Extension
			[get]					[users]																[~]
			[list]					[projectrequests]														[]
			[list]					[projects]															[]
			[create]				[subjectaccessreviews]														[]		IsPersonalSubjectAccessReview

The following excerpt from viewing policy bindings shows the above roles bound to various users and groups:

RoleBinding[admins]:
				Role:	admin
				Users:	[alice system:admin]
				Groups:	[]
RoleBinding[basic-user]:
				Role:	basic-user
				Users:	[joe]
				Groups:	[devel]

4.4.2. Evaluating Authorization

Several factors are combined to make the decision when OpenShift evaluates authorization:

Identity

In the context of authorization, both the user name and list of groups the user belongs to.

Action

The action being performed. In most cases, this consists of:

Project

The project being accessed.

Verb

Can be get, list, create, update, delete, or watch.

Resource Name

The API endpoint being accessed.

Bindings

The full list of bindings.

OpenShift evaluates authorizations using the following steps:

  1. The identity and the project-scoped action are used to find all bindings that apply to the user or their groups.
  2. Bindings are used to locate all the roles that apply.
  3. Roles are used to find all the rules that apply.
  4. The action is checked against each rule to find a match.
  5. If no matching rule is found, the action is then denied by default.

4.4.3. Cluster Policy and Local Policy

There are two levels of authorization policy:

Cluster policy

Roles and bindings that are applicable across all projects. Roles that exist in the cluster policy are considered cluster roles. Cluster bindings can only reference cluster roles.

Local policy

Roles and bindings that are scoped to a given project. Roles that exist only in a local policy are considered local roles. Local bindings can reference both cluster and local roles.

This two-level hierarchy allows re-usability over multiple projects through the cluster policy while allowing customization inside of individual projects through local policies.

During evaluation, both the cluster bindings and the local bindings are used. For example:

  1. Cluster-wide "allow" rules are checked.
  2. Locally-bound "allow" rules are checked.
  3. Deny by default.

4.4.4. Roles

Roles are collections of policy rules, which are sets of permitted verbs that can be performed on a set of resources. OpenShift includes a set of default roles that can be added to users and groups in the cluster policy or in a local policy.

Default Role        Description

admin               A project manager. If used in a local binding, an admin user will have rights to view any
                    resource in the project and modify any resource in the project except for role creation and
                    quota. If the cluster-admin wants to allow an admin to modify roles, the cluster-admin must
                    create a project-scoped Policy object using JSON.

basic-user          A user that can get basic information about projects and users.

cluster-admin       A super-user that can perform any action in any project. When granted to a user within a
                    local policy, they have full control over quota and roles and every action on every
                    resource in the project.

cluster-status      A user that can get basic cluster status information.

edit                A user that can modify most objects in a project, but does not have the power to view or
                    modify roles or bindings.

self-provisioner    A user that can create their own projects.

view                A user who cannot make any modifications, but can see most objects in a project. They
                    cannot view or modify roles or bindings.

Tip

Remember that users and groups can be associated with, or bound to, multiple roles at the same time.

These roles, including a matrix of the verbs and resources each is associated with, can be visualized in the cluster policy by using the CLI to view the cluster roles. Additional system: roles are listed as well, which are used for various OpenShift system and component operations.

By default in a local policy, only the binding for the admin role is immediately listed when using the CLI to view local bindings. However, if other default roles are added to users and groups within a local policy, they become listed in the CLI output, as well.
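
For example, default roles can be bound within a project’s local policy using the administrator CLI; the project, user, and group names below are placeholders:

$ oadm policy add-role-to-user edit alice -n myproject
$ oadm policy add-role-to-group view devel -n myproject

After running these commands, the edit and view bindings appear alongside the admin binding when viewing the project’s local bindings.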

If you find that these roles do not suit you, a cluster-admin user can create a policyBinding object named <projectname>:default with the CLI using a JSON file. This allows the project admin to bind users to roles that are defined only in the <projectname> local policy.

4.4.4.1. Updating Cluster Roles

After any OpenShift cluster upgrade, the recommended default roles may have been updated. See the Administrator Guide for instructions on updating the policy definitions to the new recommendations using:

$ oadm policy reconcile-cluster-roles

4.4.5. Security Context Constraints

In addition to authorization policies that control what a user can do, OpenShift provides security context constraints (SCC) that control the actions that a pod can perform and what it has the ability to access. Administrators can manage SCCs using the CLI.

SCCs are objects that define a set of conditions that a pod must run with in order to be accepted into the system. They allow an administrator to control the following:

  1. Running of privileged containers.
  2. Capabilities a container can request to be added.
  3. Use of host directories as volumes.
  4. The SELinux context of the container.
  5. The user ID.

Two SCCs are added to the cluster by default, privileged and restricted, which are viewable by cluster administrators using the CLI:

$ oc get scc
NAME         PRIV      CAPS      HOSTDIR   SELINUX     RUNASUSER
privileged   true      []        true      RunAsAny    RunAsAny
restricted   false     []        false     MustRunAs   MustRunAsRange

The definition for each SCC is also viewable by cluster administrators using the CLI. For example, for the privileged SCC:

# oc export scc/privileged
allowHostDirVolumePlugin: true
allowPrivilegedContainer: true
apiVersion: v1
groups: 1
- system:cluster-admins
- system:nodes
kind: SecurityContextConstraints
metadata:
  creationTimestamp: null
  name: privileged
runAsUser:
  type: RunAsAny 2
seLinuxContext:
  type: RunAsAny 3
users: 4
- system:serviceaccount:openshift-infra:build-controller
1
The groups that have access to this SCC
2
The run as user strategy type which dictates the allowable values for the Security Context
3
The SELinux context strategy type which dictates the allowable values for the Security Context
4
The users who have access to this SCC

The users and groups fields on the SCC control which users and groups can use the SCC. By default, cluster administrators, nodes, and the build controller are granted access to the privileged SCC. All authenticated users are granted access to the restricted SCC.

The privileged SCC:

  • allows privileged pods.
  • allows host directories to be mounted as volumes.
  • allows a pod to run as any user.
  • allows a pod to run with any MCS label.

The restricted SCC:

  • ensures pods cannot run as privileged.
  • ensures pods cannot use host directory volumes.
  • requires that a pod run as a user in a pre-allocated range of UIDs.
  • requires that a pod run with a pre-allocated MCS label.

SCCs comprise settings and strategies that control the security features a pod has access to. These settings fall into three categories:

Controlled by a boolean

Fields of this type default to the most restrictive value. For example, AllowPrivilegedContainer is always set to false if unspecified.

Controlled by an allowable set

Fields of this type are checked against the set to ensure their value is allowed.

Controlled by a strategy

Items that have a strategy to generate a value provide:

  • A mechanism to generate the value, and
  • A mechanism to ensure that a specified value falls into the set of allowable values.

4.4.5.1. Admission

Admission control with SCCs allows for control over the creation of resources based on the capabilities granted to a user.

In terms of the SCCs, this means that an admission controller can inspect the user information made available in the context to retrieve an appropriate set of SCCs. Doing so ensures the pod is authorized to make requests about its operating environment or to generate a set of constraints to apply to the pod.

The set of SCCs that admission uses to authorize a pod are determined by the user identity and groups that the user belongs to. Additionally, if the pod specifies a service account, the set of allowable SCCs includes any constraints accessible to the service account.

Admission uses the following approach to create the final security context for the pod:

  1. Retrieve all SCCs available for use.
  2. Generate field values for any security context setting that was not specified on the request.
  3. Validate the final settings against the available constraints.

If a matching set of constraints is found, then the pod is accepted. If the request cannot be matched to an SCC, the pod is rejected.
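
To illustrate, consider a sketch of a pod definition that explicitly requests a privileged container (the name and image are illustrative). Under the default SCCs above, admission accepts it only for an identity or service account with access to the privileged SCC, and rejects it for users limited to the restricted SCC:

apiVersion: v1
kind: Pod
metadata:
  name: privileged-example      # illustrative name
spec:
  containers:
  - name: shell
    image: centos:7             # illustrative image
    command: ["sleep", "3600"]
    securityContext:
      privileged: true          # only permitted by an SCC with allowPrivilegedContainer: true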

4.5. Persistent Storage

4.5.1. Overview

Managing storage is a distinct problem from managing compute resources. OpenShift leverages the Kubernetes PersistentVolume subsystem, which provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed. This subsystem uses the PersistentVolume and PersistentVolumeClaim API objects.

A PersistentVolume (PV) object represents a piece of existing networked storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plug-ins like Volumes, but have a lifecycle independent of any individual pod that uses the PV. PV objects capture the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.

Important

High-availability of storage in the infrastructure is left to the underlying storage provider.

A PersistentVolumeClaim (PVC) object represents a request for storage by a user. It is similar to a pod in that pods consume node resources and PVCs consume PV resources. For example, pods can request specific levels of resources (e.g., CPU and memory), while PVCs can request specific storage capacity and access modes (e.g., they can be mounted once read/write or many times read-only).

4.5.2. Lifecycle of a Volume and Claim

PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs has the following lifecycle.

4.5.2.1. Provisioning

A cluster administrator creates some number of PVs. They carry the details of the real storage that is available for use by cluster users. They exist in the API and are available for consumption.

4.5.2.2. Binding

A user creates a PersistentVolumeClaim with a specific amount of storage requested and with certain access modes. A control loop in the master watches for new PVCs, finds a matching PV (if possible), and binds them together. The user will always get at least what they asked for, but the volume may be in excess of what was requested.

Claims remain unbound indefinitely if a matching volume does not exist. Claims are bound as matching volumes become available. For example, a cluster provisioned with many 50Gi volumes would not match a PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to the cluster.

4.5.2.3. Using

Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for a pod. For those volumes that support multiple access modes, the user specifies which mode is desired when using their claim as a volume in a pod.

Once a user has a claim and that claim is bound, the bound PV belongs to the user for as long as they need it. Users schedule pods and access their claimed PVs by including a persistentVolumeClaim in their pod’s volumes block. See below for syntax details.

4.5.2.4. Releasing

When a user is done with a volume, they can delete the PVC object from the API which allows reclamation of the resource. The volume is considered "released" when the claim is deleted, but it is not yet available for another claim. The previous claimant’s data remains on the volume which must be handled according to policy.

4.5.2.5. Reclaiming

The reclaim policy of a PersistentVolume tells the cluster what to do with the volume after it is released. Currently, volumes can either be retained or recycled.

Retention allows for manual reclamation of the resource. For those volume plug-ins that support it, recycling performs a basic scrub on the volume (e.g., rm -rf /<volume>/*) and makes it available again for a new claim.

4.5.3. Persistent Volumes

Each PV contains a spec and status, which is the specification and status of the volume.

Example 4.1. Persistent Volume Object Definition

  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: pv0003
  spec:
    capacity:
      storage: 5Gi
    accessModes:
      - ReadWriteOnce
    persistentVolumeReclaimPolicy: Recycle
    nfs:
      path: /tmp
      server: 172.17.0.2
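
A cluster administrator could save this definition to a file and create it with the CLI; the file name is illustrative:

$ oc create -f pv0003.yaml
$ oc get pv      # the new volume is listed as Available until a claim binds to it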

4.5.3.1. Types of Persistent Volumes

OpenShift Enterprise currently supports a number of PersistentVolume plug-ins, including NFS (shown in the example above). More plug-ins are available but are currently in Technology Preview.

4.5.3.2. Capacity

Generally, a PV will have a specific storage capacity. This is set using the PV’s capacity attribute. See the Kubernetes Resource Model to understand the units expected by capacity.

Currently, storage capacity is the only resource that can be set or requested. Future attributes may include IOPS, throughput, etc.

4.5.3.3. Access Modes

A PersistentVolume can be mounted on a host in any way supported by the resource provider. Providers will have different capabilities and each PV’s access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV’s capabilities.

The access modes are:

Access Mode      CLI Abbreviation    Description

ReadWriteOnce    RWO                 The volume can be mounted as read-write by a single node.
ReadOnlyMany     ROX                 The volume can be mounted read-only by many nodes.
ReadWriteMany    RWX                 The volume can be mounted as read-write by many nodes.

Important

A volume can only be mounted using one access mode at a time, even if it supports many. For example, a GCE Persistent Disk can be mounted as ReadWriteOnce by a single node or ReadOnlyMany by many nodes, but not at the same time.

4.5.3.4. Recycling Policy

The current recycling policies are:

Recycling Policy    Description

Retain              Manual reclamation
Recycle             Basic scrub (e.g., rm -rf /<volume>/*)

Currently, NFS and HostPath support recycling.

4.5.3.5. Phase

A volume can be in one of the following phases:

Phase        Description

Available    A free resource that is not yet bound to a claim.
Bound        The volume is bound to a claim.
Released     The claim has been deleted, but the resource is not yet reclaimed by the cluster.
Failed       The volume has failed its automatic reclamation.

The CLI shows the name of the PVC bound to the PV.

4.5.4. Persistent Volume Claims

Each PVC contains a spec and status, which is the specification and status of the claim.

Example 4.2. Persistent Volume Claim Object Definition

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
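
Similarly, a user could create this claim from a file and then watch for it to bind; the file name is illustrative:

$ oc create -f myclaim.yaml
$ oc get pvc     # the claim shows as Bound once a matching volume is found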

4.5.4.1. Access Modes

Claims use the same conventions as volumes when requesting storage with specific access modes.

4.5.4.2. Resources

Claims, like pods, can request specific quantities of a resource. In this case, the request is for storage. The same resource model applies to both volumes and claims.

4.5.4.3. Claims As Volumes

Pods access storage by using the claim as a volume. Claims must exist in the same namespace as the pod using the claim. The cluster finds the claim in the pod’s namespace and uses it to get the PersistentVolume backing the claim. The volume is then mounted to the host and into the pod:

kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: dockerfile/nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim

4.6. Remote Commands

4.6.1. Overview

OpenShift takes advantage of a feature built into Kubernetes to support executing commands in containers. This is implemented using HTTP along with a multiplexed streaming protocol such as SPDY or HTTP/2.

Developers can use the CLI to execute remote commands in containers.
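
For example (the pod name and commands are placeholders; older clients require the -p flag before the pod name, while newer ones accept it positionally):

$ oc exec -p mypod date
$ oc exec -p mypod -- ls /tmp    # use -- before commands that take their own arguments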

4.6.2. Server Operation

The Kubelet handles remote execution requests from clients. Upon receiving a request, it upgrades the response, evaluates the request headers to determine what streams (stdin, stdout, and/or stderr) to expect to receive, and waits for the client to create the streams.

After the Kubelet has received all the streams, it executes the command in the container, copying between the streams and the command’s stdin, stdout, and stderr, as appropriate. When the command terminates, the Kubelet closes the upgraded connection, as well as the underlying one.

Architecturally, there are options for running a command in a container. The supported implementation currently in OpenShift invokes nsenter directly on the node host to enter the container’s namespaces prior to executing the command. However, custom implementations could include using docker exec, or running a "helper" container that then runs nsenter so that nsenter is not a required binary that must be installed on the host.

4.7. Port Forwarding

4.7.1. Overview

OpenShift takes advantage of a feature built into Kubernetes to support port forwarding to pods. This is implemented using HTTP along with a multiplexed streaming protocol such as SPDY or HTTP/2.

Developers can use the CLI to port forward to a pod. The CLI listens on each local port specified by the user, forwarding via the described protocol.
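
For example (the pod name and ports are placeholders; as with oc exec, older clients require the -p flag before the pod name):

$ oc port-forward -p mypod 5000:5000          # forward local port 5000 to port 5000 in the pod
$ oc port-forward -p mypod 8080:80 8443:443   # multiple port pairs can be forwarded at once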

4.7.2. Server Operation

The Kubelet handles port forward requests from clients. Upon receiving a request, it upgrades the response and waits for the client to create port forwarding streams. When it receives a new stream, it copies data between the stream and the pod’s port.

Architecturally, there are options for forwarding to a pod’s port. The supported implementation currently in OpenShift invokes nsenter directly on the node host to enter the pod’s network namespace, then invokes socat to copy data between the stream and the pod’s port. However, a custom implementation could include running a "helper" pod that then runs nsenter and socat, so that those binaries are not required to be installed on the host.

4.8. Throttling

4.8.1. Overview

OpenShift clusters will orchestrate many potentially large applications that could be co-located on a set of shared nodes. Throttling refers to the act of controlling pod start order and resource consumption to provide:

  1. Optimal start-up time when the system has to start large numbers of pods at once
  2. Resource control so that a single container cannot monopolize the resources of an entire node

4.9. Source Control Management

OpenShift takes advantage of preexisting source control management (SCM) systems hosted either internally (such as an in-house Git server) or externally (for example, on GitHub, Bitbucket, etc.). Currently, OpenShift only supports Git solutions.

SCM integration is tightly coupled with builds, the two points being:

  • Creating a BuildConfig using a repository, which allows building your application inside of OpenShift. You can create a BuildConfig manually or let OpenShift create it automatically by inspecting your repository (see the sketch after this list).
  • Triggering a build upon repository changes.
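
For example, pointing oc new-app at a Git repository lets OpenShift inspect the repository and create a BuildConfig, along with its supporting objects, automatically; the repository URL is a placeholder:

$ oc new-app https://github.com/example/my-app.git

Triggers on the resulting BuildConfig (such as a webhook) can then start a new build when the repository changes.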

4.10. Other API Objects

4.10.1. LimitRange

A limit range provides a mechanism to enforce min/max limits placed on resources in a Kubernetes namespace.

By adding a limit range to your namespace, you can enforce the minimum and maximum amount of CPU and Memory consumed by an individual pod or container.
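
A minimal sketch of a LimitRange, using the standard Kubernetes v1 schema; the name and values are illustrative:

apiVersion: v1
kind: LimitRange
metadata:
  name: limits              # illustrative name
spec:
  limits:
  - type: Container
    min:
      cpu: 100m             # smallest CPU value an individual container may specify
      memory: 4Mi
    max:
      cpu: "1"              # largest CPU value an individual container may specify
      memory: 1Gi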

See the Kubernetes documentation for more information.

4.10.2. ResourceQuota

Kubernetes can limit both the number of objects created in a namespace, and the total amount of resources requested across objects in a namespace. This facilitates sharing of a single Kubernetes cluster by several teams, each in a namespace, as a mechanism of preventing one team from starving another team of cluster resources.
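
A minimal sketch of a ResourceQuota, using the standard Kubernetes v1 schema; the name and values are illustrative:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota               # illustrative name
spec:
  hard:
    pods: "10"              # no more than 10 pods may exist in the namespace
    cpu: "4"                # total CPU across all pods in the namespace
    memory: 8Gi             # total memory across all pods in the namespace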

See the Developer’s Guide and Kubernetes documentation for more information on ResourceQuota.

4.10.3. Resource

A Kubernetes Resource is something that can be requested by, allocated to, or consumed by a pod or container. Examples include memory (RAM), CPU, disk-time, and network bandwidth.

See the Developer’s Guide and Kubernetes documentation for more information.

4.10.4. Secret

Secrets are storage for sensitive information, such as keys, passwords, and certificates. They are accessible by the intended pod(s), but held separately from their definitions.
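
A minimal sketch of a Secret, using the standard Kubernetes v1 schema; the name and key are illustrative, and values under data are base64-encoded:

apiVersion: v1
kind: Secret
metadata:
  name: mysecret                      # illustrative name
data:
  password: dmFsdWUtZ29lcy1oZXJl      # base64-encoded value ("value-goes-here")

A pod that is permitted to use this secret can then reference it by name, for example by mounting it as a volume.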

4.10.5. PersistentVolume

A persistent volume is an object (PersistentVolume) in the infrastructure provisioned by the cluster administrator. Persistent volumes provide durable storage for stateful applications.

See the Kubernetes documentation for more information.

4.10.6. PersistentVolumeClaim

A PersistentVolumeClaim object is a request for storage by a pod author. Kubernetes matches the claim against the pool of available volumes and binds them together. The claim is then used as a volume by a pod. Kubernetes makes sure the volume is available on the same node as the pod that requires it.

See the Kubernetes documentation for more information.

4.10.7. OAuth Objects

4.10.7.1. OAuthClient

An OAuthClient represents an OAuth client, as described in RFC 6749, section 2.

The following OAuthClient objects are automatically created:

openshift-web-console

Client used to request tokens for the web console

openshift-browser-client

Client used to request tokens at /oauth/token/request with a user-agent that can handle interactive logins

openshift-challenging-client

Client used to request tokens with a user-agent that can handle WWW-Authenticate challenges

Example 4.3. OAuthClient Object Definition

{
  "kind": "OAuthClient",
  "apiVersion": "v1",
  "metadata": {
    "name": "openshift-web-console", 1
    "selfLink": "/osapi/v1/oAuthClients/openshift-web-console",
    "resourceVersion": "1",
    "creationTimestamp": "2015-01-01T01:01:01Z"
  },
  "respondWithChallenges": false, 2
  "secret": "45e27750-a8aa-11e4-b2ea-3c970e4b7ffe", 3
  "redirectURIs": [
    "https://localhost:8443" 4
  ]
}
1
The name is used as the client_id parameter in OAuth requests.
2
When respondWithChallenges is set to true, unauthenticated requests to /oauth/authorize will result in WWW-Authenticate challenges, if supported by the configured authentication methods.
3
The value in the secret parameter is used as the client_secret parameter in an authorization code flow.
4
One or more absolute URIs can be placed in the redirectURIs section. The redirect_uri parameter sent with authorization requests must be prefixed by one of the specified redirectURIs.

4.10.7.2. OAuthClientAuthorization

An OAuthClientAuthorization represents an approval by a User for a particular OAuthClient to be given an OAuthAccessToken with particular scopes.

Creation of OAuthClientAuthorization objects is done during an authorization request to the OAuth server.

Example 4.4. OAuthClientAuthorization Object Definition

{
  "kind": "OAuthClientAuthorization",
  "apiVersion": "v1",
  "metadata": {
    "name": "bob:openshift-web-console",
    "resourceVersion": "1",
    "creationTimestamp": "2015-01-01T01:01:01-00:00"
  },
  "clientName": "openshift-web-console",
  "userName": "bob",
  "userUID": "9311ac33-0fde-11e5-97a1-3c970e4b7ffe"
  "scopes": []
}

4.10.7.3. OAuthAuthorizeToken

An OAuthAuthorizeToken represents an OAuth authorization code, as described in RFC 6749, section 1.3.1.

An OAuthAuthorizeToken is created by a request to the /oauth/authorize endpoint, as described in RFC 6749, section 4.1.1.

An OAuthAuthorizeToken can then be used to obtain an OAuthAccessToken with a request to the /oauth/token endpoint, as described in RFC 6749, section 4.1.3.

Example 4.5. OAuthAuthorizeToken Object Definition

{
  "kind": "OAuthAuthorizeToken",
  "apiVersion": "v1",
  "metadata": {
    "name": "MDAwYjM5YjMtMzM1MC00NDY4LTkxODItOTA2OTE2YzE0M2Fj", 1
    "resourceVersion": "1",
    "creationTimestamp": "2015-01-01T01:01:01-00:00"
  },
  "clientName": "openshift-web-console", 2
  "expiresIn": 300, 3
  "scopes": [],
  "redirectURI": "https://localhost:8443/console/oauth", 4
  "userName": "bob", 5
  "userUID": "9311ac33-0fde-11e5-97a1-3c970e4b7ffe" 6
}
1
name represents the token name, used as an authorization code to exchange for an OAuthAccessToken.
2
The clientName value is the OAuthClient that requested this token.
3
The expiresIn value is the expiration in seconds from the creationTimestamp.
4
The redirectURI value is the location where the user was redirected to during the authorization flow that resulted in this token.
5
userName represents the name of the User this token allows obtaining an OAuthAccessToken for.
6
userUID represents the UID of the User this token allows obtaining an OAuthAccessToken for.

4.10.7.4. OAuthAccessToken

An OAuthAccessToken represents an OAuth access token, as described in RFC 6749, section 1.4.

An OAuthAccessToken is created by a request to the /oauth/token endpoint, as described in RFC 6749, section 4.1.3.

Access tokens are used as bearer tokens to authenticate to the API.

Example 4.6. OAuthAccessToken Object Definition

{
  "kind": "OAuthAccessToken",
  "apiVersion": "v1",
  "metadata": {
    "name": "ODliOGE5ZmMtYzczYi00Nzk1LTg4MGEtNzQyZmUxZmUwY2Vh", 1
    "resourceVersion": "1",
    "creationTimestamp": "2015-01-01T01:01:02-00:00"
  },
  "clientName": "openshift-web-console", 2
  "expiresIn": 86400, 3
  "scopes": [],
  "redirectURI": "https://localhost:8443/console/oauth", 4
  "userName": "bob", 5
  "userUID": "9311ac33-0fde-11e5-97a1-3c970e4b7ffe", 6
  "authorizeToken": "MDAwYjM5YjMtMzM1MC00NDY4LTkxODItOTA2OTE2YzE0M2Fj" 7
}
1
name is the token name, which is used as a bearer token to authenticate to the API.
2
The clientName value is the OAuthClient that requested this token.
3
The expiresIn value is the expiration in seconds from the creationTimestamp.
4
The redirectURI is where the user was redirected to during the authorization flow that resulted in this token.
5
userName represents the User this token allows authentication as.
6
userUID represents the UID of the User this token allows authentication as.
7
authorizeToken is the name of the OAuthAuthorizeToken used to obtain this token, if any.

4.10.8. User Objects

4.10.8.1. Identity

When a user logs into OpenShift, they do so using a configured identity provider. This determines the user’s identity, and provides that information to OpenShift.

OpenShift then looks for a UserIdentityMapping for that Identity:

  • If the Identity already exists, but is not mapped to a User, login fails.
  • If the Identity already exists, and is mapped to a User, the user is given an OAuthAccessToken for the mapped User.
  • If the Identity does not exist, an Identity, User, and UserIdentityMapping are created, and the user is given an OAuthAccessToken for the mapped User.

Example 4.7. Identity Object Definition

{
    "kind": "Identity",
    "apiVersion": "v1",
    "metadata": {
        "name": "anypassword:bob", 1
        "uid": "9316ebad-0fde-11e5-97a1-3c970e4b7ffe",
        "resourceVersion": "1",
        "creationTimestamp": "2015-01-01T01:01:01-00:00"
    },
    "providerName": "anypassword", 2
    "providerUserName": "bob", 3
    "user": {
        "name": "bob", 4
        "uid": "9311ac33-0fde-11e5-97a1-3c970e4b7ffe" 5
    }
}
1
The identity name must be in the form providerName:providerUserName.
2
providerName is the name of the identity provider.
3
providerUserName is the name that uniquely represents this identity in the scope of the identity provider.
4
The name in the user parameter is the name of the user this identity maps to.
5
The uid represents the UID of the user this identity maps to.

4.10.8.2. User

A User represents an actor in the system. Users are granted permissions by adding roles to users or to their groups.

User objects are created automatically on first login, or can be created via the API.

Example 4.8. User Object Definition

{
  "kind": "User",
  "apiVersion": "v1",
  "metadata": {
    "name": "bob", 1
    "uid": "9311ac33-0fde-11e5-97a1-3c970e4b7ffe",
    "resourceVersion": "1",
    "creationTimestamp": "2015-01-01T01:01:01-00:00"
  },
  "identities": [
    "anypassword:bob" 2
  ],
  "fullName": "Bob User" 3
}
1
name is the user name used when adding roles to a user.
2
The values in identities are Identity objects that map to this user. May be null or empty for users that cannot log in.
3
The fullName value is an optional display name of the user.

4.10.8.3. UserIdentityMapping

A UserIdentityMapping maps an Identity to a User.

Creating, updating, or deleting a UserIdentityMapping modifies the corresponding fields in the Identity and User objects.

An Identity can only map to a single User, so logging in as a particular identity unambiguously determines the User.

A User can have multiple identities mapped to it. This allows multiple login methods to identify the same User.

Example 4.9. UserIdentityMapping Object Definition

{
    "kind": "UserIdentityMapping",
    "apiVersion": "v1",
    "metadata": {
        "name": "anypassword:bob", 1
        "uid": "9316ebad-0fde-11e5-97a1-3c970e4b7ffe",
        "resourceVersion": "1"
    },
    "identity": {
        "name": "anypassword:bob",
        "uid": "9316ebad-0fde-11e5-97a1-3c970e4b7ffe"
    },
    "user": {
        "name": "bob",
        "uid": "9311ac33-0fde-11e5-97a1-3c970e4b7ffe"
    }
}
1
UserIdentityMapping name matches the mapped Identity name

4.10.8.4. Group

A Group represents a list of users in the system. Groups are granted permissions by adding roles to users or to their groups.

Example 4.10. Group Object Definition

{
  "kind": "Group",
  "apiVersion": "v1",
  "metadata": {
    "name": "developers", 1
    "creationTimestamp": "2015-01-01T01:01:01-00:00"
  },
  "users": [
    "bob" 2
  ]
}
1
name is the group name used when adding roles to a group.
2
The values in users are the names of User objects that are members of this group.


[1] After this point, device names refer to devices on container B’s host.