Chapter 3. Special Resource Operator
Learn about the Special Resource Operator (SRO) and how you can use it to build and manage driver containers for loading kernel modules and device drivers on nodes in an OpenShift Container Platform cluster.
The Special Resource Operator is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
3.1. About the Special Resource Operator
The Special Resource Operator (SRO) helps you manage the deployment of kernel modules and drivers on an existing OpenShift Container Platform cluster. The SRO can be used for a case as simple as building and loading a single kernel module, or as complex as deploying the driver, device plugin, and monitoring stack for a hardware accelerator.
For loading kernel modules, the SRO is designed around the use of driver containers. Driver containers are increasingly being used in cloud-native environments, especially when run on pure container operating systems, to deliver hardware drivers to the host. Driver containers extend the kernel stack beyond the out-of-the-box software and hardware features of a specific kernel. Driver containers work on various container-capable Linux distributions. With driver containers, the host operating system stays clean and there is no clash between different library versions or binaries on the host.
3.2. Installing the Special Resource Operator
As a cluster administrator, you can install the Special Resource Operator (SRO) by using the OpenShift CLI or the web console.
3.2.1. Installing the Special Resource Operator by using the CLI
As a cluster administrator, you can install the Special Resource Operator (SRO) by using the OpenShift CLI.
Prerequisites
- You have a running OpenShift Container Platform cluster.
- You installed the OpenShift CLI (oc).
- You are logged in to the OpenShift CLI as a user with cluster-admin privileges.
- You installed the Node Feature Discovery (NFD) Operator.
Procedure
Create a namespace for the Special Resource Operator:
Create the following Namespace custom resource (CR) that defines the openshift-special-resource-operator namespace, and then save the YAML in the sro-namespace.yaml file:
    apiVersion: v1
    kind: Namespace
    metadata:
      name: openshift-special-resource-operator
Create the namespace by running the following command:
$ oc create -f sro-namespace.yaml
Install the SRO in the namespace you created in the previous step:
Create the following OperatorGroup CR and save the YAML in the sro-operatorgroup.yaml file:
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      generateName: openshift-special-resource-operator-
      name: openshift-special-resource-operator
      namespace: openshift-special-resource-operator
    spec:
      targetNamespaces:
      - openshift-special-resource-operator
Create the operator group by running the following command:
$ oc create -f sro-operatorgroup.yaml
Create the following Subscription CR and save the YAML in the sro-sub.yaml file:
Example Subscription CR
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: openshift-special-resource-operator
      namespace: openshift-special-resource-operator
    spec:
      channel: "stable"
      installPlanApproval: Automatic
      name: openshift-special-resource-operator
      source: redhat-operators
      sourceNamespace: openshift-marketplace
Create the subscription object by running the following command:
$ oc create -f sro-sub.yaml
Switch to the openshift-special-resource-operator project:
$ oc project openshift-special-resource-operator
Verification
To verify that the Operator deployment is successful, run:
$ oc get pods
Example output
NAME                                                   READY   STATUS    RESTARTS   AGE
nfd-controller-manager-7f4c5f5778-4lvvk                2/2     Running   0          89s
special-resource-controller-manager-6dbf7d4f6f-9kl8h   2/2     Running   0          81s
A successful deployment shows a Running status.
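The Running check above can also be scripted. The following is a minimal sketch, not part of the SRO or of oc: the function name pods_ok is hypothetical, and it assumes the default table layout of oc get pods (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# pods_ok: read `oc get pods` table output on stdin.
# Succeeds only if at least one pod is listed and every pod is either
# Completed, or Running with all of its containers ready (for example 2/2).
pods_ok() {
  awk '
    NR == 1 { next }                      # skip the header row
    {
      seen = 1
      if ($3 == "Completed") next         # finished pods are acceptable
      split($2, ready, "/")               # READY column, e.g. 2/2
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { exit ((bad || !seen) ? 1 : 0) }
  '
}

# Example usage against a live cluster (assumes you are logged in with oc):
#   oc get pods -n openshift-special-resource-operator | pods_ok && echo "SRO pods are ready"
```

Because the function only parses text, it can be tried against saved output before it is pointed at a cluster.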
3.2.2. Installing the Special Resource Operator by using the web console
As a cluster administrator, you can install the Special Resource Operator (SRO) by using the OpenShift Container Platform web console.
Prerequisites
- You installed the Node Feature Discovery (NFD) Operator.
Procedure
- Log in to the OpenShift Container Platform web console.
- Create the required namespace for the Special Resource Operator:
  - Navigate to Administration → Namespaces and click Create Namespace.
  - Enter openshift-special-resource-operator in the Name field and click Create.
- Install the Special Resource Operator:
  - In the OpenShift Container Platform web console, click Operators → OperatorHub.
  - Choose Special Resource Operator from the list of available Operators, and then click Install.
  - On the Install Operator page, select A specific namespace on the cluster, select the namespace that you created in the previous step, and then click Install.
Verification
To verify that the Special Resource Operator installed successfully:
- Navigate to the Operators → Installed Operators page.
- Ensure that Special Resource Operator is listed in the openshift-special-resource-operator project with a Status of InstallSucceeded.
Note: During installation, an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
If the Operator does not appear as installed, to troubleshoot further:
- Navigate to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failures or errors under Status.
- Navigate to the Workloads → Pods page and check the logs for pods in the openshift-special-resource-operator project.
Note: The Node Feature Discovery (NFD) Operator is a dependency of the Special Resource Operator (SRO). If the NFD Operator is not installed before installing the SRO, the Operator Lifecycle Manager will automatically install the NFD Operator. However, the required Node Feature Discovery operand will not be deployed automatically. The Node Feature Discovery Operator documentation provides details about how to deploy NFD by using the NFD Operator.
3.3. Using the Special Resource Operator
The Special Resource Operator (SRO) is used to manage the build and deployment of a driver container. The objects required to build and deploy the container can be defined in a Helm chart.
The example in this section uses the simple-kmod SpecialResource object to point to a ConfigMap object that is created to store the Helm charts.
3.3.1. Building and running the simple-kmod SpecialResource by using a config map
In this example, the simple-kmod kernel module is used to show how the SRO can manage a driver container which is defined in Helm chart templates stored in a config map.
Prerequisites
- You have a running OpenShift Container Platform cluster.
- You set the Image Registry Operator state to Managed for your cluster.
- You installed the OpenShift CLI (oc).
- You are logged in to the OpenShift CLI as a user with cluster-admin privileges.
- You installed the Node Feature Discovery (NFD) Operator.
- You installed the Special Resource Operator.
- You installed the Helm CLI (helm).
Procedure
To create a simple-kmod SpecialResource object, define an image stream and build config to build the image, and a service account, role, role binding, and daemon set to run the container. The service account, role, and role binding are required to run the daemon set with the privileged security context so that the kernel module can be loaded.
Create a templates directory, and change into it:
$ mkdir -p chart/simple-kmod-0.0.1/templates
$ cd chart/simple-kmod-0.0.1/templates
Save this YAML template for the image stream and build config in the templates directory as 0000-buildconfig.yaml:
    apiVersion: image.openshift.io/v1
    kind: ImageStream
    metadata:
      labels:
        app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
    spec: {}
    ---
    apiVersion: build.openshift.io/v1
    kind: BuildConfig
    metadata:
      labels:
        app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverBuild}}
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverBuild}}
      annotations:
        specialresource.openshift.io/wait: "true"
        specialresource.openshift.io/driver-container-vendor: simple-kmod
        specialresource.openshift.io/kernel-affine: "true"
    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      runPolicy: "Serial"
      triggers:
        - type: "ConfigChange"
        - type: "ImageChange"
      source:
        git:
          ref: {{.Values.specialresource.spec.driverContainer.source.git.ref}}
          uri: {{.Values.specialresource.spec.driverContainer.source.git.uri}}
        type: Git
      strategy:
        dockerStrategy:
          dockerfilePath: Dockerfile.SRO
          buildArgs:
            - name: "IMAGE"
              value: {{ .Values.driverToolkitImage }}
            {{- range $arg := .Values.buildArgs }}
            - name: {{ $arg.name }}
              value: {{ $arg.value }}
            {{- end }}
            - name: KVER
              value: {{ .Values.kernelFullVersion }}
      output:
        to:
          kind: ImageStreamTag
          name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}:v{{.Values.kernelFullVersion}}
Save the following YAML template for the RBAC resources and daemon set in the templates directory as 1000-driver-container.yaml:
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
    rules:
    - apiGroups:
      - security.openshift.io
      resources:
      - securitycontextconstraints
      verbs:
      - use
      resourceNames:
      - privileged
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
    subjects:
    - kind: ServiceAccount
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
      namespace: {{.Values.specialresource.spec.namespace}}
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      labels:
        app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
      name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
      annotations:
        specialresource.openshift.io/wait: "true"
        specialresource.openshift.io/state: "driver-container"
        specialresource.openshift.io/driver-container-vendor: simple-kmod
        specialresource.openshift.io/kernel-affine: "true"
        specialresource.openshift.io/from-configmap: "true"
    spec:
      updateStrategy:
        type: OnDelete
      selector:
        matchLabels:
          app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
      template:
        metadata:
          labels:
            app: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
        spec:
          priorityClassName: system-node-critical
          serviceAccount: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
          serviceAccountName: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
          containers:
          - image: image-registry.openshift-image-registry.svc:5000/{{.Values.specialresource.spec.namespace}}/{{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}:v{{.Values.kernelFullVersion}}
            name: {{.Values.specialresource.metadata.name}}-{{.Values.groupName.driverContainer}}
            imagePullPolicy: Always
            command: ["/sbin/init"]
            lifecycle:
              preStop:
                exec:
                  command: ["/bin/sh", "-c", "systemctl stop kmods-via-containers@{{.Values.specialresource.metadata.name}}"]
            securityContext:
              privileged: true
          nodeSelector:
            node-role.kubernetes.io/worker: ""
            feature.node.kubernetes.io/kernel-version.full: "{{.Values.kernelFullVersion}}"
Change into the chart/simple-kmod-0.0.1 directory:
$ cd ..
Save the following YAML for the chart as Chart.yaml in the chart/simple-kmod-0.0.1 directory:
    apiVersion: v2
    name: simple-kmod
    description: Simple kmod will deploy a simple kmod driver-container
    icon: https://avatars.githubusercontent.com/u/55542927
    type: application
    version: 0.0.1
    appVersion: 1.0.0
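Before packaging, it can help to confirm that the chart directory contains the three files saved in the previous steps. The following is a small sketch under that assumption; check_chart_layout is a hypothetical helper name, not a Helm or oc command.

```shell
# check_chart_layout DIR: verify that DIR holds Chart.yaml and the two
# template files from this procedure. Prints each missing path to stderr
# and returns non-zero if anything is absent.
check_chart_layout() {
  dir=$1
  missing=0
  for f in Chart.yaml \
           templates/0000-buildconfig.yaml \
           templates/1000-driver-container.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage from the chart/simple-kmod-0.0.1 directory:
#   check_chart_layout . && echo "chart layout looks complete"
```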
From the chart directory (the parent of chart/simple-kmod-0.0.1), create the chart by using the helm package command:
$ helm package simple-kmod-0.0.1/
Example output
Successfully packaged chart and saved it to: /data/<username>/git/<github_username>/special-resource-operator/yaml-for-docs/chart/simple-kmod-0.0.1/simple-kmod-0.0.1.tgz
Create a config map to store the chart files:
Create a directory for the config map files:
$ mkdir cm
Copy the Helm chart into the cm directory:
$ cp simple-kmod-0.0.1.tgz cm/simple-kmod-0.0.1.tgz
Create an index file specifying the Helm repo that contains the Helm chart:
$ helm repo index cm --url=cm://simple-kmod/simple-kmod-chart
Create a namespace for the objects defined in the Helm chart:
$ oc create namespace simple-kmod
Create the config map object:
$ oc create cm simple-kmod-chart --from-file=cm/index.yaml --from-file=cm/simple-kmod-0.0.1.tgz -n simple-kmod
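In the commands above, the index URL cm://simple-kmod/simple-kmod-chart lines up with the namespace (simple-kmod) and the config map name (simple-kmod-chart). The sketch below assumes the URL has the form cm://<namespace>/<config-map-name>, which is inferred from this example rather than stated in this section. The function name parse_cm_url is hypothetical.

```shell
# parse_cm_url URL: split a cm:// chart URL into "<namespace> <config-map-name>".
# Assumption: the URL follows cm://<namespace>/<config-map-name>.
parse_cm_url() {
  url=$1
  case $url in
    cm://?*/?*) ;;                                   # both parts must be non-empty
    *) echo "not a cm:// chart URL: $url" >&2; return 1 ;;
  esac
  rest=${url#cm://}                                  # strip the scheme
  printf '%s %s\n' "${rest%%/*}" "${rest#*/}"        # namespace, then config map name
}

# Example usage: confirm that the config map named by the URL exists.
#   set -- $(parse_cm_url cm://simple-kmod/simple-kmod-chart)
#   oc get configmap "$2" -n "$1"
```

Checking the URL against the namespace and name passed to oc create cm is a quick way to catch a mismatch before the SpecialResource object references the chart.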
Use the following SpecialResource manifest to deploy the simple-kmod object using the Helm chart that you created in the config map. Save this YAML as simple-kmod-configmap.yaml:
    apiVersion: sro.openshift.io/v1beta1
    kind: SpecialResource
    metadata:
      name: simple-kmod
    spec:
      #debug: true
      namespace: simple-kmod
      chart:
        name: simple-kmod
        version: 0.0.1
        repository:
          name: example
          url: cm://simple-kmod/simple-kmod-chart
      set:
        kind: Values
        apiVersion: sro.openshift.io/v1beta1
        kmodNames: ["simple-kmod", "simple-procfs-kmod"]
        buildArgs:
        - name: "KMODVER"
          value: "SRO"
      driverContainer:
        source:
          git:
            ref: "master"
            uri: "https://github.com/openshift-psap/kvc-simple-kmod.git"
From a command line, create the SpecialResource object from the file:
$ oc create -f simple-kmod-configmap.yaml
The simple-kmod resources are deployed in the simple-kmod namespace as specified in the object manifest. After a short time, the build pod for the simple-kmod driver container starts running. The build completes after a few minutes, and then the driver container pods start running.
Use the oc get pods command to display the status of the build pods:
$ oc get pods -n simple-kmod
Example output
NAME                                                  READY   STATUS      RESTARTS   AGE
simple-kmod-driver-build-12813789169ac0ee-1-build     0/1     Completed   0          7m12s
simple-kmod-driver-container-12813789169ac0ee-mjsnh   1/1     Running     0          8m2s
simple-kmod-driver-container-12813789169ac0ee-qtkff   1/1     Running     0          8m2s
Use the oc logs command, along with the build pod name obtained from the oc get pods command above, to display the logs of the simple-kmod driver container image build:
$ oc logs pod/simple-kmod-driver-build-12813789169ac0ee-1-build -n simple-kmod
To verify that the simple-kmod kernel modules are loaded, execute the lsmod command in one of the driver container pods that was returned from the oc get pods command above:
$ oc exec -n simple-kmod -it pod/simple-kmod-driver-container-12813789169ac0ee-mjsnh -- lsmod | grep simple
Example output
simple_procfs_kmod     16384  0
simple_kmod            16384  0
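Note that the modules listed in kmodNames use hyphens (simple-kmod), while lsmod reports them with underscores (simple_kmod). The following sketch automates the check with that mapping. The function name modules_loaded is hypothetical, and it assumes lsmod-style output with the module name in the first column.

```shell
# modules_loaded NAME...: read lsmod output on stdin and check that every
# named module is listed. Hyphens in NAME are mapped to underscores because
# that is how the kernel reports module names.
modules_loaded() {
  listing=$(cat)
  for mod in "$@"; do
    name=$(printf '%s' "$mod" | sed 's/-/_/g')
    printf '%s\n' "$listing" |
      awk -v m="$name" '$1 == m { found = 1 } END { exit (found ? 0 : 1) }' || {
        echo "module not loaded: $mod" >&2
        return 1
      }
  done
}

# Example usage (replace <driver-container-pod> with a pod name from oc get pods):
#   oc exec -n simple-kmod pod/<driver-container-pod> -- lsmod \
#     | modules_loaded simple-kmod simple-procfs-kmod && echo "all modules loaded"
```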
If you want to remove the simple-kmod kernel module from the node, delete the simple-kmod SpecialResource API object by using the oc delete command. The kernel module is unloaded when the driver container pod is deleted.
3.4. Additional resources
- For information about restoring the Image Registry Operator state before using the Special Resource Operator, see Image registry removed during installation.
- For details about installing the NFD Operator, see Node Feature Discovery (NFD) Operator.