Chapter 6. Working with containers

6.1. Understanding Containers

The basic units of OpenShift Container Platform applications are called containers. Linux container technologies are lightweight mechanisms for isolating running processes so that they are limited to interacting with only their designated resources.

Many application instances can be running in containers on a single host without visibility into each others' processes, files, network, and so on. Typically, each container provides a single service (often called a "micro-service"), such as a web server or a database, though containers can be used for arbitrary workloads.

The Linux kernel has been incorporating capabilities for container technologies for years. OpenShift Container Platform and Kubernetes add the ability to orchestrate containers across multi-host installations.

About containers and RHEL kernel memory

Due to Red Hat Enterprise Linux (RHEL) behavior, a container on a node with high CPU usage might seem to consume more memory than expected. The higher memory consumption could be caused by the kmem_cache in the RHEL kernel. The RHEL kernel creates a kmem_cache for each cgroup. For added performance, the kmem_cache contains a cpu_cache, and a node cache for any NUMA nodes. These caches all consume kernel memory.

The amount of memory stored in those caches is proportional to the number of CPUs that the system uses. As a result, a higher number of CPUs results in a greater amount of kernel memory being held in these caches. Higher amounts of kernel memory in these caches can cause OpenShift Container Platform containers to exceed the configured memory limits, resulting in the container being killed.

To avoid losing containers due to kernel memory issues, ensure that the containers request sufficient memory. You can use the following formula to estimate the amount of memory consumed by the kmem_cache, where nproc is the number of processing units available that are reported by the nproc command. The lower limit of container requests should be this value plus the container memory requirements:

$(nproc) X 1/2 MiB

6.2. Using Init Containers to perform tasks before a pod is deployed

OpenShift Container Platform provides init containers, which are specialized containers that run before application containers and can contain utilities or setup scripts not present in an app image.

6.2.1. Understanding Init Containers

You can use an Init Container resource to perform tasks before the rest of a pod is deployed.

A pod can have Init Containers in addition to application containers. Init containers allow you to reorganize setup scripts and binding code.

An Init Container can:

  • Contain and run utilities that are not desirable to include in the app Container image for security reasons.
  • Contain utilities or custom code for setup that is not present in an app image. For example, there is no requirement to make an image FROM another image just to use a tool like sed, awk, python, or dig during setup.
  • Use Linux namespaces so that they have different filesystem views from app containers, such as access to secrets that application containers are not able to access.

Each Init Container must complete successfully before the next one is started. So, Init Containers provide an easy way to block or delay the startup of app containers until some set of preconditions are met.

For example, the following are some ways you can use Init Containers:

  • Wait for a service to be created with a shell command like:

    for i in {1..100}; do sleep 1; if dig myservice; then exit 0; fi; done; exit 1
  • Register this pod with a remote server from the downward API with a command like:

    $ curl -X POST http://$MANAGEMENT_SERVICE_HOST:$MANAGEMENT_SERVICE_PORT/register -d ‘instance=$()&ip=$()’
  • Wait for some time before starting the app Container with a command like sleep 60.
  • Clone a git repository into a volume.
  • Place values into a configuration file and run a template tool to dynamically generate a configuration file for the main app Container. For example, place the POD_IP value in a configuration and generate the main app configuration file using Jinja.

See the Kubernetes documentation for more information.

6.2.2. Creating Init Containers

The following example outlines a simple pod which has two Init Containers. The first waits for myservice and the second waits for mydb. After both containers complete, the pod begins.

Procedure

  1. Create a YAML file for the Init Container:

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp-pod
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp-container
        image: registry.access.redhat.com/ubi8/ubi:latest
        command: ['sh', '-c', 'echo The app is running! && sleep 3600']
      initContainers:
      - name: init-myservice
        image: registry.access.redhat.com/ubi8/ubi:latest
        command: ['sh', '-c', 'until getent hosts myservice; do echo waiting for myservice; sleep 2; done;']
      - name: init-mydb
        image: registry.access.redhat.com/ubi8/ubi:latest
        command: ['sh', '-c', 'until getent hosts mydb; do echo waiting for mydb; sleep 2; done;']
  2. Create a YAML file for the myservice service.

    kind: Service
    apiVersion: v1
    metadata:
      name: myservice
    spec:
      ports:
      - protocol: TCP
        port: 80
        targetPort: 9376
  3. Create a YAML file for the mydb service.

    kind: Service
    apiVersion: v1
    metadata:
      name: mydb
    spec:
      ports:
      - protocol: TCP
        port: 80
        targetPort: 9377
  4. Run the following command to create the myapp-pod:

    $ oc create -f myapp.yaml

    Example output

    pod/myapp-pod created

  5. View the status of the pod:

    $ oc get pods

    Example output

    NAME                          READY     STATUS              RESTARTS   AGE
    myapp-pod                     0/1       Init:0/2            0          5s

    Note that the pod status indicates it is waiting

  6. Run the following commands to create the services:

    $ oc create -f mydb.yaml
    $ oc create -f myservice.yaml
  7. View the status of the pod:

    $ oc get pods

    Example output

    NAME                          READY     STATUS              RESTARTS   AGE
    myapp-pod                     1/1       Running             0          2m

6.3. Using volumes to persist container data

Files in a container are ephemeral. As such, when a container crashes or stops, the data is lost. You can use volumes to persist the data used by the containers in a pod. A volume is directory, accessible to the Containers in a pod, where data is stored for the life of the pod.

6.3.1. Understanding volumes

Volumes are mounted file systems available to pods and their containers which may be backed by a number of host-local or network attached storage endpoints. Containers are not persistent by default; on restart, their contents are cleared.

To ensure that the file system on the volume contains no errors and, if errors are present, to repair them when possible, OpenShift Container Platform invokes the fsck utility prior to the mount utility. This occurs when either adding a volume or updating an existing volume.

The simplest volume type is emptyDir, which is a temporary directory on a single machine. Administrators may also allow you to request a persistent volume that is automatically attached to your pods.

Note

emptyDir volume storage may be restricted by a quota based on the pod’s FSGroup, if the FSGroup parameter is enabled by your cluster administrator.

6.3.2. Working with volumes using the OpenShift Container Platform CLI

You can use the CLI command oc set volume to add and remove volumes and volume mounts for any object that has a pod template like replication controllers or deployment configs. You can also list volumes in pods or any object that has a pod template.

The oc set volume command uses the following general syntax:

$ oc set volume <object_selection> <operation> <mandatory_parameters> <options>
Object selection
Specify one of the following for the object_selection parameter in the oc set volume command:

Table 6.1. Object Selection

SyntaxDescriptionExample

<object_type> <name>

Selects <name> of type <object_type>.

deploymentConfig registry

<object_type>/<name>

Selects <name> of type <object_type>.

deploymentConfig/registry

<object_type>--selector=<object_label_selector>

Selects resources of type <object_type> that matched the given label selector.

deploymentConfig--selector="name=registry"

<object_type> --all

Selects all resources of type <object_type>.

deploymentConfig --all

-f or --filename=<file_name>

File name, directory, or URL to file to use to edit the resource.

-f registry-deployment-config.json

Operation
Specify --add or --remove for the operation parameter in the oc set volume command.
Mandatory parameters
Any mandatory parameters are specific to the selected operation and are discussed in later sections.
Options
Any options are specific to the selected operation and are discussed in later sections.

6.3.3. Listing volumes and volume mounts in a pod

You can list volumes and volume mounts in pods or pod templates:

Procedure

To list volumes:

$ oc set volume <object_type>/<name> [options]

List volume supported options:

OptionDescriptionDefault

--name

Name of the volume.

 

-c, --containers

Select containers by name. It can also take wildcard '*' that matches any character.

'*'

For example:

  • To list all volumes for pod p1:

    $ oc set volume pod/p1
  • To list volume v1 defined on all deployment configs:

    $ oc set volume dc --all --name=v1

6.3.4. Adding volumes to a pod

You can add volumes and volume mounts to a pod.

Procedure

To add a volume, a volume mount, or both to pod templates:

$ oc set volume <object_type>/<name> --add [options]

Table 6.2. Supported Options for Adding Volumes

OptionDescriptionDefault

--name

Name of the volume.

Automatically generated, if not specified.

-t, --type

Name of the volume source. Supported values: emptyDir, hostPath, secret, configmap, persistentVolumeClaim or projected.

emptyDir

-c, --containers

Select containers by name. It can also take wildcard '*' that matches any character.

'*'

-m, --mount-path

Mount path inside the selected containers. Do not mount to the container root, /, or any path that is the same in the host and the container. This can corrupt your host system if the container is sufficiently privileged, such as the host /dev/pts files. It is safe to mount the host by using /host.

 

--path

Host path. Mandatory parameter for --type=hostPath. Do not mount to the container root, /, or any path that is the same in the host and the container. This can corrupt your host system if the container is sufficiently privileged, such as the host /dev/pts files. It is safe to mount the host by using /host.

 

--secret-name

Name of the secret. Mandatory parameter for --type=secret.

 

--configmap-name

Name of the configmap. Mandatory parameter for --type=configmap.

 

--claim-name

Name of the persistent volume claim. Mandatory parameter for --type=persistentVolumeClaim.

 

--source

Details of volume source as a JSON string. Recommended if the desired volume source is not supported by --type.

 

-o, --output

Display the modified objects instead of updating them on the server. Supported values: json, yaml.

 

--output-version

Output the modified objects with the given version.

api-version

For example:

  • To add a new volume source emptyDir to the registry DeploymentConfig object:

    $ oc set volume dc/registry --add
    Tip

    You can alternatively apply the following YAML to add the volume:

    Example 6.1. Sample deployment config with an added volume

    kind: DeploymentConfig
    apiVersion: apps.openshift.io/v1
    metadata:
      name: registry
      namespace: registry
    spec:
      replicas: 3
      selector:
        app: httpd
      template:
        metadata:
          labels:
            app: httpd
        spec:
          volumes: 1
            - name: volume-pppsw
              emptyDir: {}
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              ports:
                - containerPort: 8080
                  protocol: TCP
    1
    Add the volume source emptyDir.
  • To add volume v1 with secret secret1 for replication controller r1 and mount inside the containers at /data:

    $ oc set volume rc/r1 --add --name=v1 --type=secret --secret-name='secret1' --mount-path=/data
    Tip

    You can alternatively apply the following YAML to add the volume:

    Example 6.2. Sample replication controller with added volume and secret

    kind: ReplicationController
    apiVersion: v1
    metadata:
      name: example-1
      namespace: example
    spec:
      replicas: 0
      selector:
        app: httpd
        deployment: example-1
        deploymentconfig: example
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: httpd
            deployment: example-1
            deploymentconfig: example
        spec:
          volumes: 1
            - name: v1
              secret:
                secretName: secret1
                defaultMode: 420
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              volumeMounts: 2
                - name: v1
                  mountPath: /data
    1
    Add the volume and secret.
    2
    Add the container mount path.
  • To add existing persistent volume v1 with claim name pvc1 to deployment configuration dc.json on disk, mount the volume on container c1 at /data, and update the DeploymentConfig object on the server:

    $ oc set volume -f dc.json --add --name=v1 --type=persistentVolumeClaim \
      --claim-name=pvc1 --mount-path=/data --containers=c1
    Tip

    You can alternatively apply the following YAML to add the volume:

    Example 6.3. Sample deployment config with persistent volume added

    kind: DeploymentConfig
    apiVersion: apps.openshift.io/v1
    metadata:
      name: example
      namespace: example
    spec:
      replicas: 3
      selector:
        app: httpd
      template:
        metadata:
          labels:
            app: httpd
        spec:
          volumes:
            - name: volume-pppsw
              emptyDir: {}
            - name: v1 1
              persistentVolumeClaim:
                claimName: pvc1
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              ports:
                - containerPort: 8080
                  protocol: TCP
              volumeMounts: 2
                - name: v1
                  mountPath: /data
    1
    Add the persistent volume claim named `pvc1.
    2
    Add the container mount path.
  • To add a volume v1 based on Git repository https://github.com/namespace1/project1 with revision 5125c45f9f563 for all replication controllers:

    $ oc set volume rc --all --add --name=v1 \
      --source='{"gitRepo": {
                    "repository": "https://github.com/namespace1/project1",
                    "revision": "5125c45f9f563"
                }}'

6.3.5. Updating volumes and volume mounts in a pod

You can modify the volumes and volume mounts in a pod.

Procedure

Updating existing volumes using the --overwrite option:

$ oc set volume <object_type>/<name> --add --overwrite [options]

For example:

  • To replace existing volume v1 for replication controller r1 with existing persistent volume claim pvc1:

    $ oc set volume rc/r1 --add --overwrite --name=v1 --type=persistentVolumeClaim --claim-name=pvc1
    Tip

    You can alternatively apply the following YAML to replace the volume:

    Example 6.4. Sample replication controller with persistent volume claim named pvc1

    kind: ReplicationController
    apiVersion: v1
    metadata:
      name: example-1
      namespace: example
    spec:
      replicas: 0
      selector:
        app: httpd
        deployment: example-1
        deploymentconfig: example
      template:
        metadata:
          labels:
            app: httpd
            deployment: example-1
            deploymentconfig: example
        spec:
          volumes:
            - name: v1 1
              persistentVolumeClaim:
                claimName: pvc1
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              ports:
                - containerPort: 8080
                  protocol: TCP
              volumeMounts:
                - name: v1
                  mountPath: /data
    1
    Set persistent volume claim to pvc1.
  • To change the DeploymentConfig object d1 mount point to /opt for volume v1:

    $ oc set volume dc/d1 --add --overwrite --name=v1 --mount-path=/opt
    Tip

    You can alternatively apply the following YAML to change the mount point:

    Example 6.5. Sample deployment config with mount point set to opt.

    kind: DeploymentConfig
    apiVersion: apps.openshift.io/v1
    metadata:
      name: example
      namespace: example
    spec:
      replicas: 3
      selector:
        app: httpd
      template:
        metadata:
          labels:
            app: httpd
        spec:
          volumes:
            - name: volume-pppsw
              emptyDir: {}
            - name: v2
              persistentVolumeClaim:
                claimName: pvc1
            - name: v1
              persistentVolumeClaim:
                claimName: pvc1
          containers:
            - name: httpd
              image: >-
                image-registry.openshift-image-registry.svc:5000/openshift/httpd:latest
              ports:
                - containerPort: 8080
                  protocol: TCP
              volumeMounts: 1
                - name: v1
                  mountPath: /opt
    1
    Set the mount point to /opt.

6.3.6. Removing volumes and volume mounts from a pod

You can remove a volume or volume mount from a pod.

Procedure

To remove a volume from pod templates:

$ oc set volume <object_type>/<name> --remove [options]

Table 6.3. Supported options for removing volumes

OptionDescriptionDefault

--name

Name of the volume.

 

-c, --containers

Select containers by name. It can also take wildcard '*' that matches any character.

'*'

--confirm

Indicate that you want to remove multiple volumes at once.

 

-o, --output

Display the modified objects instead of updating them on the server. Supported values: json, yaml.

 

--output-version

Output the modified objects with the given version.

api-version

For example:

  • To remove a volume v1 from the DeploymentConfig object d1:

    $ oc set volume dc/d1 --remove --name=v1
  • To unmount volume v1 from container c1 for the DeploymentConfig object d1 and remove the volume v1 if it is not referenced by any containers on d1:

    $ oc set volume dc/d1 --remove --name=v1 --containers=c1
  • To remove all volumes for replication controller r1:

    $ oc set volume rc/r1 --remove --confirm

6.3.7. Configuring volumes for multiple uses in a pod

You can configure a volume to allows you to share one volume for multiple uses in a single pod using the volumeMounts.subPath property to specify a subPath value inside a volume instead of the volume’s root.

Procedure

  1. View the list of files in the volume, run the oc rsh command:

    $ oc rsh <pod>

    Example output

    sh-4.2$ ls /path/to/volume/subpath/mount
    example_file1 example_file2 example_file3

  2. Specify the subPath:

    Example Pod spec with subPath parameter

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-site
    spec:
        containers:
        - name: mysql
          image: mysql
          volumeMounts:
          - mountPath: /var/lib/mysql
            name: site-data
            subPath: mysql 1
        - name: php
          image: php
          volumeMounts:
          - mountPath: /var/www/html
            name: site-data
            subPath: html 2
        volumes:
        - name: site-data
          persistentVolumeClaim:
            claimName: my-site-data

    1
    Databases are stored in the mysql folder.
    2
    HTML content is stored in the html folder.

6.4. Mapping volumes using projected volumes

A projected volume maps several existing volume sources into the same directory.

The following types of volume sources can be projected:

  • Secrets
  • Config Maps
  • Downward API
Note

All sources are required to be in the same namespace as the pod.

6.4.1. Understanding projected volumes

Projected volumes can map any combination of these volume sources into a single directory, allowing the user to:

  • automatically populate a single volume with the keys from multiple secrets, config maps, and with downward API information, so that I can synthesize a single directory with various sources of information;
  • populate a single volume with the keys from multiple secrets, config maps, and with downward API information, explicitly specifying paths for each item, so that I can have full control over the contents of that volume.
Important

When the RunAsUser permission is set in the security context of a Linux-based pod, the projected files have the correct permissions set, including container user ownership. However, when the Windows equivalent RunAsUsername permission is set in a Windows pod, the kubelet is unable to correctly set ownership on the files in the projected volume.

Therefore, the RunAsUsername permission set in the security context of a Windows pod is not honored for Windows projected volumes running in OpenShift Container Platform.

The following general scenarios show how you can use projected volumes.

Config map, secrets, Downward API.
Projected volumes allow you to deploy containers with configuration data that includes passwords. An application using these resources could be deploying Red Hat OpenStack Platform (RHOSP) on Kubernetes. The configuration data might have to be assembled differently depending on if the services are going to be used for production or for testing. If a pod is labeled with production or testing, the downward API selector metadata.labels can be used to produce the correct RHOSP configs.
Config map + secrets.
Projected volumes allow you to deploy containers involving configuration data and passwords. For example, you might execute a config map with some sensitive encrypted tasks that are decrypted using a vault password file.
ConfigMap + Downward API.
Projected volumes allow you to generate a config including the pod name (available via the metadata.name selector). This application can then pass the pod name along with requests to easily determine the source without using IP tracking.
Secrets + Downward API.
Projected volumes allow you to use a secret as a public key to encrypt the namespace of the pod (available via the metadata.namespace selector). This example allows the Operator to use the application to deliver the namespace information securely without using an encrypted transport.

6.4.1.1. Example Pod specs

The following are examples of Pod specs for creating projected volumes.

Pod with a secret, a Downward API, and a config map

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts: 1
    - name: all-in-one
      mountPath: "/projected-volume"2
      readOnly: true 3
  volumes: 4
  - name: all-in-one 5
    projected:
      defaultMode: 0400 6
      sources:
      - secret:
          name: mysecret 7
          items:
            - key: username
              path: my-group/my-username 8
      - downwardAPI: 9
          items:
            - path: "labels"
              fieldRef:
                fieldPath: metadata.labels
            - path: "cpu_limit"
              resourceFieldRef:
                containerName: container-test
                resource: limits.cpu
      - configMap: 10
          name: myconfigmap
          items:
            - key: config
              path: my-group/my-config
              mode: 0777 11

1
Add a volumeMounts section for each container that needs the secret.
2
Specify a path to an unused directory where the secret will appear.
3
Set readOnly to true.
4
Add a volumes block to list each projected volume source.
5
Specify any name for the volume.
6
Set the execute permission on the files.
7
Add a secret. Enter the name of the secret object. Each secret you want to use must be listed.
8
Specify the path to the secrets file under the mountPath. Here, the secrets file is in /projected-volume/my-group/my-username.
9
Add a Downward API source.
10
Add a ConfigMap source.
11
Set the mode for the specific projection
Note

If there are multiple containers in the pod, each container needs a volumeMounts section, but only one volumes section is needed.

Pod with multiple secrets with a non-default permission mode set

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      defaultMode: 0755
      sources:
      - secret:
          name: mysecret
          items:
            - key: username
              path: my-group/my-username
      - secret:
          name: mysecret2
          items:
            - key: password
              path: my-group/my-password
              mode: 511

Note

The defaultMode can only be specified at the projected level and not for each volume source. However, as illustrated above, you can explicitly set the mode for each individual projection.

6.4.1.2. Pathing Considerations

Collisions Between Keys when Configured Paths are Identical

If you configure any keys with the same path, the pod spec will not be accepted as valid. In the following example, the specified path for mysecret and myconfigmap are the same:

apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: mysecret
          items:
            - key: username
              path: my-group/data
      - configMap:
          name: myconfigmap
          items:
            - key: config
              path: my-group/data

Consider the following situations related to the volume file paths.

Collisions Between Keys without Configured Paths
The only run-time validation that can occur is when all the paths are known at pod creation, similar to the above scenario. Otherwise, when a conflict occurs the most recent specified resource will overwrite anything preceding it (this is true for resources that are updated after pod creation as well).
Collisions when One Path is Explicit and the Other is Automatically Projected
In the event that there is a collision due to a user specified path matching data that is automatically projected, the latter resource will overwrite anything preceding it as before

6.4.2. Configuring a Projected Volume for a Pod

When creating projected volumes, consider the volume file path situations described in Understanding projected volumes.

The following example shows how to use a projected volume to mount an existing secret volume source. The steps can be used to create a user name and password secrets from local files. You then create a pod that runs one container, using a projected volume to mount the secrets into the same shared directory.

Procedure

To use a projected volume to mount an existing secret volume source.

  1. Create files containing the secrets, entering the following, replacing the password and user information as appropriate:

    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    type: Opaque
    data:
      pass: MWYyZDFlMmU2N2Rm
      user: YWRtaW4=

    The user and pass values can be any valid string that is base64 encoded.

    The following example shows admin in base64:

    $ echo -n "admin" | base64

    Example output

    YWRtaW4=

    The following example shows the password 1f2d1e2e67df in base64:.

    $ echo -n "1f2d1e2e67df" | base64

    Example output

    MWYyZDFlMmU2N2Rm

  2. Use the following command to create the secrets:

    $ oc create -f <secrets-filename>

    For example:

    $ oc create -f secret.yaml

    Example output

    secret "mysecret" created

  3. You can check that the secret was created using the following commands:

    $ oc get secret <secret-name>

    For example:

    $ oc get secret mysecret

    Example output

    NAME       TYPE      DATA      AGE
    mysecret   Opaque    2         17h

    $ oc get secret <secret-name> -o yaml

    For example:

    $ oc get secret mysecret -o yaml
    apiVersion: v1
    data:
      pass: MWYyZDFlMmU2N2Rm
      user: YWRtaW4=
    kind: Secret
    metadata:
      creationTimestamp: 2017-05-30T20:21:38Z
      name: mysecret
      namespace: default
      resourceVersion: "2107"
      selfLink: /api/v1/namespaces/default/secrets/mysecret
      uid: 959e0424-4575-11e7-9f97-fa163e4bd54c
    type: Opaque
  4. Create a pod configuration file similar to the following that includes a volumes section:

    apiVersion: v1
    kind: Pod
    metadata:
      name: test-projected-volume
    spec:
      containers:
      - name: test-projected-volume
        image: busybox
        args:
        - sleep
        - "86400"
        volumeMounts:
        - name: all-in-one
          mountPath: "/projected-volume"
          readOnly: true
      volumes:
      - name: all-in-one
        projected:
          sources:
          - secret:      1
              name: user
          - secret:      2
              name: pass
    1 2
    The name of the secret you created.
  5. Create the pod from the configuration file:

    $ oc create -f <your_yaml_file>.yaml

    For example:

    $ oc create -f secret-pod.yaml

    Example output

    pod "test-projected-volume" created

  6. Verify that the pod container is running, and then watch for changes to the pod:

    $ oc get pod <name>

    For example:

    $ oc get pod test-projected-volume

    The output should appear similar to the following:

    Example output

    NAME                    READY     STATUS    RESTARTS   AGE
    test-projected-volume   1/1       Running   0          14s

  7. In another terminal, use the oc exec command to open a shell to the running container:

    $ oc exec -it <pod> <command>

    For example:

    $ oc exec -it test-projected-volume -- /bin/sh
  8. In your shell, verify that the projected-volumes directory contains your projected sources:

    / # ls

    Example output

    bin               home              root              tmp
    dev               proc              run               usr
    etc               projected-volume  sys               var

6.5. Allowing containers to consume API objects

The Downward API is a mechanism that allows containers to consume information about API objects without coupling to OpenShift Container Platform. Such information includes the pod’s name, namespace, and resource values. Containers can consume information from the downward API using environment variables or a volume plugin.

6.5.1. Expose pod information to Containers using the Downward API

The Downward API contains such information as the pod’s name, project, and resource values. Containers can consume information from the downward API using environment variables or a volume plugin.

Fields within the pod are selected using the FieldRef API type. FieldRef has two fields:

FieldDescription

fieldPath

The path of the field to select, relative to the pod.

apiVersion

The API version to interpret the fieldPath selector within.

Currently, the valid selectors in the v1 API include:

SelectorDescription

metadata.name

The pod’s name. This is supported in both environment variables and volumes.

metadata.namespace

The pod’s namespace.This is supported in both environment variables and volumes.

metadata.labels

The pod’s labels. This is only supported in volumes and not in environment variables.

metadata.annotations

The pod’s annotations. This is only supported in volumes and not in environment variables.

status.podIP

The pod’s IP. This is only supported in environment variables and not volumes.

The apiVersion field, if not specified, defaults to the API version of the enclosing pod template.

6.5.2. Understanding how to consume container values using the downward API

You containers can consume API values using environment variables or a volume plugin. Depending on the method you choose, containers can consume:

  • Pod name
  • Pod project/namespace
  • Pod annotations
  • Pod labels

Annotations and labels are available using only a volume plugin.

6.5.2.1. Consuming container values using environment variables

When using a container’s environment variables, use the EnvVar type’s valueFrom field (of type EnvVarSource) to specify that the variable’s value should come from a FieldRef source instead of the literal value specified by the value field.

Only constant attributes of the pod can be consumed this way, as environment variables cannot be updated once a process is started in a way that allows the process to be notified that the value of a variable has changed. The fields supported using environment variables are:

  • Pod name
  • Pod project/namespace

Procedure

To use environment variables

  1. Create a pod.yaml file:

    apiVersion: v1
    kind: Pod
    metadata:
      name: dapi-env-test-pod
    spec:
      containers:
        - name: env-test-container
          image: gcr.io/google_containers/busybox
          command: [ "/bin/sh", "-c", "env" ]
          env:
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: MY_POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
      restartPolicy: Never
  2. Create the pod from the pod.yaml file:

    $ oc create -f pod.yaml
  3. Check the container’s logs for the MY_POD_NAME and MY_POD_NAMESPACE values:

    $ oc logs -p dapi-env-test-pod

6.5.2.2. Consuming container values using a volume plugin

You containers can consume API values using a volume plugin.

Containers can consume:

  • Pod name
  • Pod project/namespace
  • Pod annotations
  • Pod labels

Procedure

To use the volume plugin:

  1. Create a volume-pod.yaml file:

    kind: Pod
    apiVersion: v1
    metadata:
      labels:
        zone: us-east-coast
        cluster: downward-api-test-cluster1
        rack: rack-123
      name: dapi-volume-test-pod
      annotations:
        annotation1: "345"
        annotation2: "456"
    spec:
      containers:
        - name: volume-test-container
          image: gcr.io/google_containers/busybox
          command: ["sh", "-c", "cat /tmp/etc/pod_labels /tmp/etc/pod_annotations"]
          volumeMounts:
            - name: podinfo
              mountPath: /tmp/etc
              readOnly: false
      volumes:
      - name: podinfo
        downwardAPI:
          defaultMode: 420
          items:
          - fieldRef:
              fieldPath: metadata.name
            path: pod_name
          - fieldRef:
              fieldPath: metadata.namespace
            path: pod_namespace
          - fieldRef:
              fieldPath: metadata.labels
            path: pod_labels
          - fieldRef:
              fieldPath: metadata.annotations
            path: pod_annotations
      restartPolicy: Never
  2. Create the pod from the volume-pod.yaml file:

    $ oc create -f volume-pod.yaml
  3. Check the container’s logs and verify the presence of the configured fields:

    $ oc logs -p dapi-volume-test-pod

    Example output

    cluster=downward-api-test-cluster1
    rack=rack-123
    zone=us-east-coast
    annotation1=345
    annotation2=456
    kubernetes.io/config.source=api

6.5.3. Understanding how to consume container resources using the Downward API

When creating pods, you can use the Downward API to inject information about computing resource requests and limits so that image and application authors can correctly create an image for specific environments.

You can do this using environment variable or a volume plugin.

6.5.3.1. Consuming container resources using environment variables

When creating pods, you can use the Downward API to inject information about computing resource requests and limits using environment variables.

Procedure

To use environment variables:

  1. When creating a pod configuration, specify environment variables that correspond to the contents of the resources field in the spec.container field:

    ....
    spec:
      containers:
        - name: test-container
          image: gcr.io/google_containers/busybox:1.24
          command: [ "/bin/sh", "-c", "env" ]
          resources:
            requests:
              memory: "32Mi"
              cpu: "125m"
            limits:
              memory: "64Mi"
              cpu: "250m"
          env:
            - name: MY_CPU_REQUEST
              valueFrom:
                resourceFieldRef:
                  resource: requests.cpu
            - name: MY_CPU_LIMIT
              valueFrom:
                resourceFieldRef:
                  resource: limits.cpu
            - name: MY_MEM_REQUEST
              valueFrom:
                resourceFieldRef:
                  resource: requests.memory
            - name: MY_MEM_LIMIT
              valueFrom:
                resourceFieldRef:
                  resource: limits.memory
    ....

    If the resource limits are not included in the container configuration, the downward API defaults to the node’s CPU and memory allocatable values.

  2. Create the pod from the pod.yaml file:

    $ oc create -f pod.yaml

6.5.3.2. Consuming container resources using a volume plugin

When creating pods, you can use the Downward API to inject information about computing resource requests and limits using a volume plugin.

Procedure

To use the Volume Plugin:

  1. When creating a pod configuration, use the spec.volumes.downwardAPI.items field to describe the desired resources that correspond to the spec.resources field:

    ....
    spec:
      containers:
        - name: client-container
          image: gcr.io/google_containers/busybox:1.24
          command: ["sh", "-c", "while true; do echo; if [[ -e /etc/cpu_limit ]]; then cat /etc/cpu_limit; fi; if [[ -e /etc/cpu_request ]]; then cat /etc/cpu_request; fi; if [[ -e /etc/mem_limit ]]; then cat /etc/mem_limit; fi; if [[ -e /etc/mem_request ]]; then cat /etc/mem_request; fi; sleep 5; done"]
          resources:
            requests:
              memory: "32Mi"
              cpu: "125m"
            limits:
              memory: "64Mi"
              cpu: "250m"
          volumeMounts:
            - name: podinfo
              mountPath: /etc
              readOnly: false
      volumes:
        - name: podinfo
          downwardAPI:
            items:
              - path: "cpu_limit"
                resourceFieldRef:
                  containerName: client-container
                  resource: limits.cpu
              - path: "cpu_request"
                resourceFieldRef:
                  containerName: client-container
                  resource: requests.cpu
              - path: "mem_limit"
                resourceFieldRef:
                  containerName: client-container
                  resource: limits.memory
              - path: "mem_request"
                resourceFieldRef:
                  containerName: client-container
                  resource: requests.memory
    ....

    If the resource limits are not included in the container configuration, the Downward API defaults to the node’s CPU and memory allocatable values.

  2. Create the pod from the volume-pod.yaml file:

    $ oc create -f volume-pod.yaml

6.5.4. Consuming secrets using the Downward API

When creating pods, you can use the downward API to inject secrets so image and application authors can create an image for specific environments.

Procedure

  1. Create a secret.yaml file:

    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    data:
      password: cGFzc3dvcmQ=
      username: ZGV2ZWxvcGVy
    type: kubernetes.io/basic-auth
  2. Create a Secret object from the secret.yaml file:

    $ oc create -f secret.yaml
  3. Create a pod.yaml file that references the username field from the above Secret object:

    apiVersion: v1
    kind: Pod
    metadata:
      name: dapi-env-test-pod
    spec:
      containers:
        - name: env-test-container
          image: gcr.io/google_containers/busybox
          command: [ "/bin/sh", "-c", "env" ]
          env:
            - name: MY_SECRET_USERNAME
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: username
      restartPolicy: Never
  4. Create the pod from the pod.yaml file:

    $ oc create -f pod.yaml
  5. Check the container’s logs for the MY_SECRET_USERNAME value:

    $ oc logs -p dapi-env-test-pod

6.5.5. Consuming configuration maps using the Downward API

When creating pods, you can use the Downward API to inject configuration map values so image and application authors can create an image for specific environments.

Procedure

  1. Create a configmap.yaml file:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: myconfigmap
    data:
      mykey: myvalue
  2. Create a ConfigMap object from the configmap.yaml file:

    $ oc create -f configmap.yaml
  3. Create a pod.yaml file that references the above ConfigMap object:

    apiVersion: v1
    kind: Pod
    metadata:
      name: dapi-env-test-pod
    spec:
      containers:
        - name: env-test-container
          image: gcr.io/google_containers/busybox
          command: [ "/bin/sh", "-c", "env" ]
          env:
            - name: MY_CONFIGMAP_VALUE
              valueFrom:
                configMapKeyRef:
                  name: myconfigmap
                  key: mykey
      restartPolicy: Always
  4. Create the pod from the pod.yaml file:

    $ oc create -f pod.yaml
  5. Check the container’s logs for the MY_CONFIGMAP_VALUE value:

    $ oc logs -p dapi-env-test-pod

6.5.6. Referencing environment variables

When creating pods, you can reference the value of a previously defined environment variable by using the $() syntax. If the environment variable reference can not be resolved, the value will be left as the provided string.

Procedure

  1. Create a pod.yaml file that references an existing environment variable:

    apiVersion: v1
    kind: Pod
    metadata:
      name: dapi-env-test-pod
    spec:
      containers:
        - name: env-test-container
          image: gcr.io/google_containers/busybox
          command: [ "/bin/sh", "-c", "env" ]
          env:
            - name: MY_EXISTING_ENV
              value: my_value
            - name: MY_ENV_VAR_REF_ENV
              value: $(MY_EXISTING_ENV)
      restartPolicy: Never
  2. Create the pod from the pod.yaml file:

    $ oc create -f pod.yaml
  3. Check the container’s logs for the MY_ENV_VAR_REF_ENV value:

    $ oc logs -p dapi-env-test-pod

6.5.7. Escaping environment variable references

When creating a pod, you can escape an environment variable reference by using a double dollar sign. The value will then be set to a single dollar sign version of the provided value.

Procedure

  1. Create a pod.yaml file that references an existing environment variable:

    apiVersion: v1
    kind: Pod
    metadata:
      name: dapi-env-test-pod
    spec:
      containers:
        - name: env-test-container
          image: gcr.io/google_containers/busybox
          command: [ "/bin/sh", "-c", "env" ]
          env:
            - name: MY_NEW_ENV
              value: $$(SOME_OTHER_ENV)
      restartPolicy: Never
  2. Create the pod from the pod.yaml file:

    $ oc create -f pod.yaml
  3. Check the container’s logs for the MY_NEW_ENV value:

    $ oc logs -p dapi-env-test-pod

6.6. Copying files to or from an OpenShift Container Platform container

You can use the CLI to copy local files to or from a remote directory in a container using the rsync command.

6.6.1. Understanding how to copy files

The oc rsync command, or remote sync, is a useful tool for copying database archives to and from your pods for backup and restore purposes. You can also use oc rsync to copy source code changes into a running pod for development debugging, when the running pod supports hot reload of source files.

$ oc rsync <source> <destination> [-c <container>]

6.6.1.1. Requirements

Specifying the Copy Source
The source argument of the oc rsync command must point to either a local directory or a pod directory. Individual files are not supported.

When specifying a pod directory the directory name must be prefixed with the pod name:

<pod name>:<dir>

If the directory name ends in a path separator (/), only the contents of the directory are copied to the destination. Otherwise, the directory and its contents are copied to the destination.

Specifying the Copy Destination
The destination argument of the oc rsync command must point to a directory. If the directory does not exist, but rsync is used for copy, the directory is created for you.
Deleting Files at the Destination
The --delete flag may be used to delete any files in the remote directory that are not in the local directory.
Continuous Syncing on File Change
Using the --watch option causes the command to monitor the source path for any file system changes, and synchronizes changes when they occur. With this argument, the command runs forever.

Synchronization occurs after short quiet periods to ensure a rapidly changing file system does not result in continuous synchronization calls.

When using the --watch option, the behavior is effectively the same as manually invoking oc rsync repeatedly, including any arguments normally passed to oc rsync. Therefore, you can control the behavior via the same flags used with manual invocations of oc rsync, such as --delete.

6.6.2. Copying files to and from containers

Support for copying local files to or from a container is built into the CLI.

Prerequisites

When working with oc rsync, note the following:

rsync must be installed
The oc rsync command uses the local rsync tool if present on the client machine and the remote container.

If rsync is not found locally or in the remote container, a tar archive is created locally and sent to the container where the tar utility is used to extract the files. If tar is not available in the remote container, the copy will fail.

The tar copy method does not provide the same functionality as oc rsync. For example, oc rsync creates the destination directory if it does not exist and only sends files that are different between the source and the destination.

Note

In Windows, the cwRsync client should be installed and added to the PATH for use with the oc rsync command.

Procedure

  • To copy a local directory to a pod directory:

    $ oc rsync <local-dir> <pod-name>:/<remote-dir> -c <container-name>

    For example:

    $ oc rsync /home/user/source devpod1234:/src -c user-container
  • To copy a pod directory to a local directory:

    $ oc rsync devpod1234:/src /home/user/source

    Example output

    $ oc rsync devpod1234:/src/status.txt /home/user/

6.6.3. Using advanced Rsync features

The oc rsync command exposes fewer command line options than standard rsync. In the case that you want to use a standard rsync command line option that is not available in oc rsync, for example the --exclude-from=FILE option, it might be possible to use standard rsync 's --rsh (-e) option or RSYNC_RSH environment variable as a workaround, as follows:

$ rsync --rsh='oc rsh' --exclude-from=FILE SRC POD:DEST

or:

Export the RSYNC_RSH variable:

$ export RSYNC_RSH='oc rsh'

Then, run the rsync command:

$ rsync --exclude-from=FILE SRC POD:DEST

Both of the above examples configure standard rsync to use oc rsh as its remote shell program to enable it to connect to the remote pod, and are an alternative to running oc rsync.

6.7. Executing remote commands in an OpenShift Container Platform container

You can use the CLI to execute remote commands in an OpenShift Container Platform container.

6.7.1. Executing remote commands in containers

Support for remote container command execution is built into the CLI.

Procedure

To run a command in a container:

$ oc exec <pod> [-c <container>] <command> [<arg_1> ... <arg_n>]

For example:

$ oc exec mypod date

Example output

Thu Apr  9 02:21:53 UTC 2015

Important

For security purposes, the oc exec command does not work when accessing privileged containers except when the command is executed by a cluster-admin user.

6.7.2. Protocol for initiating a remote command from a client

Clients initiate the execution of a remote command in a container by issuing a request to the Kubernetes API server:

/proxy/nodes/<node_name>/exec/<namespace>/<pod>/<container>?command=<command>

In the above URL:

  • <node_name> is the FQDN of the node.
  • <namespace> is the project of the target pod.
  • <pod> is the name of the target pod.
  • <container> is the name of the target container.
  • <command> is the desired command to be executed.

For example:

/proxy/nodes/node123.openshift.com/exec/myns/mypod/mycontainer?command=date

Additionally, the client can add parameters to the request to indicate if:

  • the client should send input to the remote container’s command (stdin).
  • the client’s terminal is a TTY.
  • the remote container’s command should send output from stdout to the client.
  • the remote container’s command should send output from stderr to the client.

After sending an exec request to the API server, the client upgrades the connection to one that supports multiplexed streams; the current implementation uses HTTP/2.

The client creates one stream each for stdin, stdout, and stderr. To distinguish among the streams, the client sets the streamType header on the stream to one of stdin, stdout, or stderr.

The client closes all streams, the upgraded connection, and the underlying connection when it is finished with the remote command execution request.

6.8. Using port forwarding to access applications in a container

OpenShift Container Platform supports port forwarding to pods.

6.8.1. Understanding port forwarding

You can use the CLI to forward one or more local ports to a pod. This allows you to listen on a given or random port locally, and have data forwarded to and from given ports in the pod.

Support for port forwarding is built into the CLI:

$ oc port-forward <pod> [<local_port>:]<remote_port> [...[<local_port_n>:]<remote_port_n>]

The CLI listens on each local port specified by the user, forwarding using the protocol described below.

Ports may be specified using the following formats:

5000

The client listens on port 5000 locally and forwards to 5000 in the pod.

6000:5000

The client listens on port 6000 locally and forwards to 5000 in the pod.

:5000 or 0:5000

The client selects a free local port and forwards to 5000 in the pod.

OpenShift Container Platform handles port-forward requests from clients. Upon receiving a request, OpenShift Container Platform upgrades the response and waits for the client to create port-forwarding streams. When OpenShift Container Platform receives a new stream, it copies data between the stream and the pod’s port.

Architecturally, there are options for forwarding to a pod’s port. The supported OpenShift Container Platform implementation invokes nsenter directly on the node host to enter the pod’s network namespace, then invokes socat to copy data between the stream and the pod’s port. However, a custom implementation could include running a helper pod that then runs nsenter and socat, so that those binaries are not required to be installed on the host.

6.8.2. Using port forwarding

You can use the CLI to port-forward one or more local ports to a pod.

Procedure

Use the following command to listen on the specified port in a pod:

$ oc port-forward <pod> [<local_port>:]<remote_port> [...[<local_port_n>:]<remote_port_n>]

For example:

  • Use the following command to listen on ports 5000 and 6000 locally and forward data to and from ports 5000 and 6000 in the pod:

    $ oc port-forward <pod> 5000 6000

    Example output

    Forwarding from 127.0.0.1:5000 -> 5000
    Forwarding from [::1]:5000 -> 5000
    Forwarding from 127.0.0.1:6000 -> 6000
    Forwarding from [::1]:6000 -> 6000

  • Use the following command to listen on port 8888 locally and forward to 5000 in the pod:

    $ oc port-forward <pod> 8888:5000

    Example output

    Forwarding from 127.0.0.1:8888 -> 5000
    Forwarding from [::1]:8888 -> 5000

  • Use the following command to listen on a free port locally and forward to 5000 in the pod:

    $ oc port-forward <pod> :5000

    Example output

    Forwarding from 127.0.0.1:42390 -> 5000
    Forwarding from [::1]:42390 -> 5000

    Or:

    $ oc port-forward <pod> 0:5000

6.8.3. Protocol for initiating port forwarding from a client

Clients initiate port forwarding to a pod by issuing a request to the Kubernetes API server:

/proxy/nodes/<node_name>/portForward/<namespace>/<pod>

In the above URL:

  • <node_name> is the FQDN of the node.
  • <namespace> is the namespace of the target pod.
  • <pod> is the name of the target pod.

For example:

/proxy/nodes/node123.openshift.com/portForward/myns/mypod

After sending a port forward request to the API server, the client upgrades the connection to one that supports multiplexed streams; the current implementation uses Hyptertext Transfer Protocol Version 2 (HTTP/2).

The client creates a stream with the port header containing the target port in the pod. All data written to the stream is delivered via the kubelet to the target pod and port. Similarly, all data sent from the pod for that forwarded connection is delivered back to the same stream in the client.

The client closes all streams, the upgraded connection, and the underlying connection when it is finished with the port forwarding request.

6.9. Using sysctls in containers

Sysctl settings are exposed via Kubernetes, allowing users to modify certain kernel parameters at runtime for namespaces within a container. Only sysctls that are namespaced can be set independently on pods. If a sysctl is not namespaced, called node-level, you must use another method of setting the sysctl, such as the Node Tuning Operator. Moreover, only those sysctls considered safe are whitelisted by default; you can manually enable other unsafe sysctls on the node to be available to the user.

6.9.1. About sysctls

In Linux, the sysctl interface allows an administrator to modify kernel parameters at runtime. Parameters are available via the /proc/sys/ virtual process file system. The parameters cover various subsystems, such as:

  • kernel (common prefix: kernel.)
  • networking (common prefix: net.)
  • virtual memory (common prefix: vm.)
  • MDADM (common prefix: dev.)

More subsystems are described in Kernel documentation. To get a list of all parameters, run:

$ sudo sysctl -a

6.9.1.1. Namespaced versus node-level sysctls

A number of sysctls are namespaced in the Linux kernels. This means that you can set them independently for each pod on a node. Being namespaced is a requirement for sysctls to be accessible in a pod context within Kubernetes.

The following sysctls are known to be namespaced:

  • kernel.shm*
  • kernel.msg*
  • kernel.sem
  • fs.mqueue.*

Additionally, most of the sysctls in the net.* group are known to be namespaced. Their namespace adoption differs based on the kernel version and distributor.

Sysctls that are not namespaced are called node-level and must be set manually by the cluster administrator, either by means of the underlying Linux distribution of the nodes, such as by modifying the /etc/sysctls.conf file, or by using a daemon set with privileged containers. You can use the Node Tuning Operator to set node-level sysctls.

Note

Consider marking nodes with special sysctls as tainted. Only schedule pods onto them that need those sysctl settings. Use the taints and toleration feature to mark the nodes.

6.9.1.2. Safe versus unsafe sysctls

Sysctls are grouped into safe and unsafe sysctls.

For a sysctl to be considered safe, it must use proper namespacing and must be properly isolated between pods on the same node. This means that if you set a sysctl for one pod it must not:

  • Influence any other pod on the node
  • Harm the node’s health
  • Gain CPU or memory resources outside of the resource limits of a pod

OpenShift Container Platform supports, or whitelists, the following sysctls in the safe set:

  • kernel.shm_rmid_forced
  • net.ipv4.ip_local_port_range
  • net.ipv4.tcp_syncookies
  • net.ipv4.ping_group_range

All safe sysctls are enabled by default. You can use a sysctl in a pod by modifying the Pod spec.

Any sysctl not whitelisted by OpenShift Container Platform is considered unsafe for OpenShift Container Platform. Note that being namespaced alone is not sufficient for the sysctl to be considered safe.

All unsafe sysctls are disabled by default, and the cluster administrator must manually enable them on a per-node basis. Pods with disabled unsafe sysctls are scheduled but do not launch.

$ oc get pod

Example output

NAME        READY   STATUS            RESTARTS   AGE
hello-pod   0/1     SysctlForbidden   0          14s

6.9.2. Setting sysctls for a pod

You can set sysctls on pods using the pod’s securityContext. The securityContext applies to all containers in the same pod.

Safe sysctls are allowed by default. A pod with unsafe sysctls fails to launch on any node unless the cluster administrator explicitly enables unsafe sysctls for that node. As with node-level sysctls, use the taints and toleration feature or labels on nodes to schedule those pods onto the right nodes.

The following example uses the pod securityContext to set a safe sysctl kernel.shm_rmid_forced and two unsafe sysctls, net.core.somaxconn and kernel.msgmax. There is no distinction between safe and unsafe sysctls in the specification.

Warning

To avoid destabilizing your operating system, modify sysctl parameters only after you understand their effects.

Procedure

To use safe and unsafe sysctls:

  1. Modify the YAML file that defines the pod and add the securityContext spec, as shown in the following example:

    apiVersion: v1
    kind: Pod
    metadata:
      name: sysctl-example
    spec:
      securityContext:
        sysctls:
        - name: kernel.shm_rmid_forced
          value: "0"
        - name: net.core.somaxconn
          value: "1024"
        - name: kernel.msgmax
          value: "65536"
      ...
  2. Create the pod:

    $ oc apply -f <file-name>.yaml

    If the unsafe sysctls are not allowed for the node, the pod is scheduled, but does not deploy:

    $ oc get pod

    Example output

    NAME        READY   STATUS            RESTARTS   AGE
    hello-pod   0/1     SysctlForbidden   0          14s

6.9.3. Enabling unsafe sysctls

A cluster administrator can allow certain unsafe sysctls for very special situations such as high performance or real-time application tuning.

If you want to use unsafe sysctls, a cluster administrator must enable them individually for a specific type of node. The sysctls must be namespaced.

You can further control which sysctls are set in pods by specifying lists of sysctls or sysctl patterns in the allowedUnsafeSysctls field of the Security Context Constraints.

  • The allowedUnsafeSysctls option controls specific needs such as high performance or real-time application tuning.
Warning

Due to their nature of being unsafe, the use of unsafe sysctls is at-your-own-risk and can lead to severe problems, such as improper behavior of containers, resource shortage, or breaking a node.

Procedure

  1. Add a label to the machine config pool where the containers where containers with the unsafe sysctls will run:

    $ oc edit machineconfigpool worker
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfigPool
    metadata:
      creationTimestamp: 2019-02-08T14:52:39Z
      generation: 1
      labels:
        custom-kubelet: sysctl 1
    1
    Add a key: pair label.
  2. Create a KubeletConfig custom resource (CR):

    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: custom-kubelet
    spec:
      machineConfigPoolSelector:
        matchLabels:
          custom-kubelet: sysctl 1
      kubeletConfig:
        allowedUnsafeSysctls: 2
          - "kernel.msg*"
          - "net.core.somaxconn"
    1
    Specify the label from the machine config pool.
    2
    List the unsafe sysctls you want to allow.
  3. Create the object:

    $ oc apply -f set-sysctl-worker.yaml

    A new MachineConfig object named in the 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet format is created.

  4. Wait for the cluster to reboot using the machineconfigpool object status fields:

    For example:

    status:
      conditions:
        - lastTransitionTime: '2019-08-11T15:32:00Z'
          message: >-
            All nodes are updating to
            rendered-worker-ccbfb5d2838d65013ab36300b7b3dc13
          reason: ''
          status: 'True'
          type: Updating

    A message similar to the following appears when the cluster is ready:

       - lastTransitionTime: '2019-08-11T16:00:00Z'
          message: >-
            All nodes are updated with
            rendered-worker-ccbfb5d2838d65013ab36300b7b3dc13
          reason: ''
          status: 'True'
          type: Updated
  5. When the cluster is ready, check for the merged KubeletConfig object in the new MachineConfig object:

    $ oc get machineconfig 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet -o json | grep ownerReference -A7
            "ownerReferences": [
                {
                    "apiVersion": "machineconfiguration.openshift.io/v1",
                    "blockOwnerDeletion": true,
                    "controller": true,
                    "kind": "KubeletConfig",
                    "name": "custom-kubelet",
                    "uid": "3f64a766-bae8-11e9-abe8-0a1a2a4813f2"
                }
            ]

    You can now add unsafe sysctls to pods as needed.