Chapter 4. Administrator tasks

4.1. Adding Operators to a cluster

Using Operator Lifecycle Manager (OLM), cluster administrators can install OLM-based Operators to an OpenShift Container Platform cluster.

Note

For information on how OLM handles updates for installed Operators colocated in the same namespace, as well as an alternative method for installing Operators with custom global Operator groups, see Multitenancy and Operator colocation.

4.1.1. About Operator installation with OperatorHub

OperatorHub is a user interface for discovering Operators; it works in conjunction with Operator Lifecycle Manager (OLM), which installs and manages Operators on a cluster.

As a user with the proper permissions, you can install an Operator from OperatorHub by using the OpenShift Container Platform web console or CLI.

During installation, you must determine the following initial settings for the Operator:

Installation Mode
Choose a specific namespace in which to install the Operator.
Update Channel
If an Operator is available through multiple channels, you can choose which channel you want to subscribe to. For example, to deploy from the stable channel, if available, select it from the list.
Approval Strategy

You can choose automatic or manual updates.

If you choose automatic updates for an installed Operator, when a new version of that Operator is available in the selected channel, Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator without human intervention.

If you select manual updates, when a newer version of an Operator is available, OLM creates an update request. As a cluster administrator, you must then manually approve that update request to have the Operator updated to the new version.

Additional resources

4.1.2. Installing from OperatorHub using the web console

You can install and subscribe to an Operator from OperatorHub by using the OpenShift Container Platform web console.

Prerequisites

  • Access to an OpenShift Container Platform cluster using an account with cluster-admin permissions.
  • Access to an OpenShift Container Platform cluster using an account with Operator installation permissions.

Procedure

  1. Navigate in the web console to the Operators → OperatorHub page.
  2. Scroll or type a keyword into the Filter by keyword box to find the Operator you want. For example, type advanced to find the Advanced Cluster Management for Kubernetes Operator.

    You can also filter options by Infrastructure Features. For example, select Disconnected if you want to see Operators that work in disconnected environments, also known as restricted network environments.

  3. Select the Operator to display additional information.

    Note

    Choosing a Community Operator warns that Red Hat does not certify Community Operators; you must acknowledge the warning before continuing.

  4. Read the information about the Operator and click Install.
  5. On the Install Operator page:

    1. Select one of the following:

      • All namespaces on the cluster (default) installs the Operator in the default openshift-operators namespace to watch and be made available to all namespaces in the cluster. This option is not always available.
      • A specific namespace on the cluster allows you to choose a specific, single namespace in which to install the Operator. The Operator will only watch and be made available for use in this single namespace.
    2. Choose a specific, single namespace in which to install the Operator. The Operator will only watch and be made available for use in this single namespace.
    3. Select an Update channel (if more than one is available).
    4. Select Automatic or Manual approval strategy, as described earlier.
  6. Click Install to make the Operator available to the selected namespaces on this OpenShift Container Platform cluster.

    1. If you selected a Manual approval strategy, the upgrade status of the subscription remains Upgrading until you review and approve the install plan.

      After approving on the Install Plan page, the subscription upgrade status moves to Up to date.

    2. If you selected an Automatic approval strategy, the upgrade status should resolve to Up to date without intervention.
  7. After the upgrade status of the subscription is Up to date, select Operators → Installed Operators to verify that the cluster service version (CSV) of the installed Operator eventually shows up. The Status should ultimately resolve to InstallSucceeded in the relevant namespace.

    Note

    For the All namespaces…​ installation mode, the status resolves to InstallSucceeded in the openshift-operators namespace, but the status is Copied if you check in other namespaces.

    If it does not:

    1. Check the logs in any pods in the openshift-operators project (or other relevant namespace if A specific namespace…​ installation mode was selected) on the Workloads → Pods page that are reporting issues to troubleshoot further.

4.1.3. Installing from OperatorHub using the CLI

Instead of using the OpenShift Container Platform web console, you can install an Operator from OperatorHub by using the CLI. Use the oc command to create or update a Subscription object.

Prerequisites

  • Access to an OpenShift Container Platform cluster using an account with Operator installation permissions.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. View the list of Operators available to the cluster from OperatorHub:

    $ oc get packagemanifests -n openshift-marketplace

    Example output

    NAME                               CATALOG               AGE
    3scale-operator                    Red Hat Operators     91m
    advanced-cluster-management        Red Hat Operators     91m
    amq7-cert-manager                  Red Hat Operators     91m
    ...
    couchbase-enterprise-certified     Certified Operators   91m
    crunchy-postgres-operator          Certified Operators   91m
    mongodb-enterprise                 Certified Operators   91m
    ...
    etcd                               Community Operators   91m
    jaeger                             Community Operators   91m
    kubefed                            Community Operators   91m
    ...

    Note the catalog for your desired Operator.

  2. Inspect your desired Operator to verify its supported install modes and available channels:

    $ oc describe packagemanifests <operator_name> -n openshift-marketplace
  3. An Operator group, defined by an OperatorGroup object, selects target namespaces in which to generate required RBAC access for all Operators in the same namespace as the Operator group.

    The namespace to which you subscribe the Operator must have an Operator group that matches the install mode of the Operator, either the AllNamespaces or SingleNamespace mode. If the Operator you intend to install uses the AllNamespaces, then the openshift-operators namespace already has an appropriate Operator group in place.

    However, if the Operator uses the SingleNamespace mode and you do not already have an appropriate Operator group in place, you must create one.

    Note

    The web console version of this procedure handles the creation of the OperatorGroup and Subscription objects automatically behind the scenes for you when choosing SingleNamespace mode.

    1. Create an OperatorGroup object YAML file, for example operatorgroup.yaml:

      Example OperatorGroup object

      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: <operatorgroup_name>
        namespace: <namespace>
      spec:
        targetNamespaces:
        - <namespace>

      Warning

      Operator Lifecycle Manager (OLM) creates the following cluster roles for each Operator group:

      • <operatorgroup_name>-admin
      • <operatorgroup_name>-edit
      • <operatorgroup_name>-view

      When you manually create an Operator group, you must specify a unique name that does not conflict with the existing cluster roles or other Operator groups on the cluster.

    2. Create the OperatorGroup object:

      $ oc apply -f operatorgroup.yaml
  4. Create a Subscription object YAML file to subscribe a namespace to an Operator, for example sub.yaml:

    Example Subscription object

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: <subscription_name>
      namespace: openshift-operators 1
    spec:
      channel: <channel_name> 2
      name: <operator_name> 3
      source: redhat-operators 4
      sourceNamespace: openshift-marketplace 5
      config:
        env: 6
        - name: ARGS
          value: "-v=10"
        envFrom: 7
        - secretRef:
            name: license-secret
        volumes: 8
        - name: <volume_name>
          configMap:
            name: <configmap_name>
        volumeMounts: 9
        - mountPath: <directory_name>
          name: <volume_name>
        tolerations: 10
        - operator: "Exists"
        resources: 11
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
        nodeSelector: 12
          foo: bar

    1
    For default AllNamespaces install mode usage, specify the openshift-operators namespace. Alternatively, you can specify a custom global namespace, if you have created one. Otherwise, specify the relevant single namespace for SingleNamespace install mode usage.
    2
    Name of the channel to subscribe to.
    3
    Name of the Operator to subscribe to.
    4
    Name of the catalog source that provides the Operator.
    5
    Namespace of the catalog source. Use openshift-marketplace for the default OperatorHub catalog sources.
    6
    The env parameter defines a list of Environment Variables that must exist in all containers in the pod created by OLM.
    7
    The envFrom parameter defines a list of sources to populate Environment Variables in the container.
    8
    The volumes parameter defines a list of Volumes that must exist on the pod created by OLM.
    9
    The volumeMounts parameter defines a list of VolumeMounts that must exist in all containers in the pod created by OLM. If a volumeMount references a volume that does not exist, OLM fails to deploy the Operator.
    10
    The tolerations parameter defines a list of Tolerations for the pod created by OLM.
    11
    The resources parameter defines resource constraints for all the containers in the pod created by OLM.
    12
    The nodeSelector parameter defines a NodeSelector for the pod created by OLM.
  5. Create the Subscription object:

    $ oc apply -f sub.yaml

    At this point, OLM is now aware of the selected Operator. A cluster service version (CSV) for the Operator should appear in the target namespace, and APIs provided by the Operator should be available for creation.

Additional resources

4.1.4. Installing a specific version of an Operator

You can install a specific version of an Operator by setting the cluster service version (CSV) in a Subscription object.

Prerequisites

  • Access to an OpenShift Container Platform cluster using an account with Operator installation permissions.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Look up the available versions and channels of the Operator you want to install by running the following command:

    Command syntax

    $ oc describe packagemanifests <operator_name> -n <catalog_namespace>

    For example, the following command prints the available channels and versions of the Red Hat Quay Operator from OperatorHub:

    Example command

    $ oc describe packagemanifests quay-operator -n openshift-marketplace

    Example 4.1. Example output

    Name:         quay-operator
    Namespace:    operator-marketplace
    Labels:       catalog=redhat-operators
                  catalog-namespace=openshift-marketplace
                  hypershift.openshift.io/managed=true
                  operatorframework.io/arch.amd64=supported
                  operatorframework.io/os.linux=supported
                  provider=Red Hat
                  provider-url=
    Annotations:  <none>
    API Version:  packages.operators.coreos.com/v1
    Kind:         PackageManifest
    ...
        Current CSV:  quay-operator.v3.7.11
    ...
        Entries:
          Name:       quay-operator.v3.7.11
          Version:    3.7.11
          Name:       quay-operator.v3.7.10
          Version:    3.7.10
          Name:       quay-operator.v3.7.9
          Version:    3.7.9
          Name:       quay-operator.v3.7.8
          Version:    3.7.8
          Name:       quay-operator.v3.7.7
          Version:    3.7.7
          Name:       quay-operator.v3.7.6
          Version:    3.7.6
          Name:       quay-operator.v3.7.5
          Version:    3.7.5
          Name:       quay-operator.v3.7.4
          Version:    3.7.4
          Name:       quay-operator.v3.7.3
          Version:    3.7.3
          Name:       quay-operator.v3.7.2
          Version:    3.7.2
          Name:       quay-operator.v3.7.1
          Version:    3.7.1
          Name:       quay-operator.v3.7.0
          Version:    3.7.0
        Name:         stable-3.7
    ...
       Current CSV:  quay-operator.v3.8.5
    ...
       Entries:
          Name:         quay-operator.v3.8.5
          Version:      3.8.5
          Name:         quay-operator.v3.8.4
          Version:      3.8.4
          Name:         quay-operator.v3.8.3
          Version:      3.8.3
          Name:         quay-operator.v3.8.2
          Version:      3.8.2
          Name:         quay-operator.v3.8.1
          Version:      3.8.1
          Name:         quay-operator.v3.8.0
          Version:      3.8.0
        Name:           stable-3.8
      Default Channel:  stable-3.8
      Package Name:     quay-operator
    Tip

    You can print an Operator’s version and channel information in the YAML format by running the following command:

    $ oc get packagemanifests <operator_name> -n <catalog_namespace> -o yaml
    • If more than one catalog is installed in a namespace, run the following command to look up the available versions and channels of an Operator from a specific catalog:

      $ oc get packagemanifest \
         --selector=catalog=<catalogsource_name> \
         --field-selector metadata.name=<operator_name> \
         -n <catalog_namespace> -o yaml
      Important

      If you do not specify the Operator’s catalog, running the oc get packagemanifest and oc describe packagemanifest commands might return a package from an unexpected catalog if the following conditions are met:

      • Multiple catalogs are installed in the same namespace.
      • The catalogs contain the same Operators or Operators with the same name.
  2. An Operator group, defined by an OperatorGroup object, selects target namespaces in which to generate required role-based access control (RBAC) access for all Operators in the same namespace as the Operator group.

    The namespace to which you subscribe the Operator must have an Operator group that matches the install mode of the Operator, either the AllNamespaces or SingleNamespace mode. If the Operator you intend to install uses the AllNamespaces mode, then the openshift-operators namespace already has an appropriate Operator group in place.

    However, if the Operator uses the SingleNamespace mode and you do not already have an appropriate Operator group in place, you must create one:

    1. Create an OperatorGroup object YAML file, for example operatorgroup.yaml:

      Example OperatorGroup object

      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: <operatorgroup_name>
        namespace: <namespace>
      spec:
        targetNamespaces:
        - <namespace>

      Warning

      Operator Lifecycle Manager (OLM) creates the following cluster roles for each Operator group:

      • <operatorgroup_name>-admin
      • <operatorgroup_name>-edit
      • <operatorgroup_name>-view

      When you manually create an Operator group, you must specify a unique name that does not conflict with the existing cluster roles or other Operator groups on the cluster.

    2. Create the OperatorGroup object:

      $ oc apply -f operatorgroup.yaml
  3. Create a Subscription object YAML file that subscribes a namespace to an Operator with a specific version by setting the startingCSV field. Set the installPlanApproval field to Manual to prevent the Operator from automatically upgrading if a later version exists in the catalog.

    For example, the following sub.yaml file can be used to install the Red Hat Quay Operator specifically to version 3.7.10:

    Subscription with a specific starting Operator version

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: quay-operator
      namespace: quay
    spec:
      channel: quay-operator.v3.7.10
      installPlanApproval: Manual 1
      name: quay-operator
      source: redhat-operators
      sourceNamespace: openshift-marketplace
      startingCSV: quay-operator.v3.7.10 2

    1
    Set the approval strategy to Manual in case your specified version is superseded by a later version in the catalog. This plan prevents an automatic upgrade to a later version and requires manual approval before the starting CSV can complete the installation.
    2
    Set a specific version of an Operator CSV.
  4. Create the Subscription object:

    $ oc apply -f sub.yaml
  5. Manually approve the pending install plan to complete the Operator installation.

4.1.5. Preparing for multiple instances of an Operator for multitenant clusters

As a cluster administrator, you can add multiple instances of an Operator for use in multitenant clusters. This is an alternative solution to either using the standard All namespaces install mode, which can be considered to violate the principle of least privilege, or the Multinamespace mode, which is not widely adopted. For more information, see "Operators in multitenant clusters".

In the following procedure, the tenant is a user or group of users that share common access and privileges for a set of deployed workloads. The tenant Operator is the instance of an Operator that is intended for use by only that tenant.

Prerequisites

  • All instances of the Operator you want to install must be the same version across a given cluster.

    Important

    For more information on this and other limitations, see "Operators in multitenant clusters".

Procedure

  1. Before installing the Operator, create a namespace for the tenant Operator that is separate from the tenant’s namespace. For example, if the tenant’s namespace is team1, you might create a team1-operator namespace:

    1. Define a Namespace resource and save the YAML file, for example, team1-operator.yaml:

      apiVersion: v1
      kind: Namespace
      metadata:
        name: team1-operator
    2. Create the namespace by running the following command:

      $ oc create -f team1-operator.yaml
  2. Create an Operator group for the tenant Operator scoped to the tenant’s namespace, with only that one namespace entry in the spec.targetNamespaces list:

    1. Define an OperatorGroup resource and save the YAML file, for example, team1-operatorgroup.yaml:

      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: team1-operatorgroup
        namespace: team1-operator
      spec:
        targetNamespaces:
        - team1 1
      1
      Define only the tenant’s namespace in the spec.targetNamespaces list.
    2. Create the Operator group by running the following command:

      $ oc create -f team1-operatorgroup.yaml

Next steps

  • Install the Operator in the tenant Operator namespace. This task is more easily performed by using the OperatorHub in the web console instead of the CLI; for a detailed procedure, see Installing from OperatorHub using the web console.

    Note

    After completing the Operator installation, the Operator resides in the tenant Operator namespace and watches the tenant namespace, but neither the Operator’s pod nor its service account are visible or usable by the tenant.

Additional resources

4.1.6. Installing global Operators in custom namespaces

When installing Operators with the OpenShift Container Platform web console, the default behavior installs Operators that support the All namespaces install mode into the default openshift-operators global namespace. This can cause issues related to shared install plans and update policies between all Operators in the namespace. For more details on these limitations, see "Multitenancy and Operator colocation".

As a cluster administrator, you can bypass this default behavior manually by creating a custom global namespace and using that namespace to install your individual or scoped set of Operators and their dependencies.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. Before installing the Operator, create a namespace for the installation of your desired Operator. This installation namespace will become the custom global namespace:

    1. Define a Namespace resource and save the YAML file, for example, global-operators.yaml:

      apiVersion: v1
      kind: Namespace
      metadata:
        name: global-operators
    2. Create the namespace by running the following command:

      $ oc create -f global-operators.yaml
  2. Create a custom global Operator group, which is an Operator group that watches all namespaces:

    1. Define an OperatorGroup resource and save the YAML file, for example, global-operatorgroup.yaml. Omit both the spec.selector and spec.targetNamespaces fields to make it a global Operator group, which selects all namespaces:

      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: global-operatorgroup
        namespace: global-operators
      Note

      The status.namespaces of a created global Operator group contains the empty string (""), which signals to a consuming Operator that it should watch all namespaces.

    2. Create the Operator group by running the following command:

      $ oc create -f global-operatorgroup.yaml

Next steps

  • Install the desired Operator in your custom global namespace. Because the web console does not populate the Installed Namespace menu during Operator installation with custom global namespaces, this task can only be performed with the OpenShift CLI (oc). For a detailed procedure, see Installing from OperatorHub using the CLI.

    Note

    When you initiate the Operator installation, if the Operator has dependencies, the dependencies are also automatically installed in the custom global namespace. As a result, it is then valid for the dependency Operators to have the same update policy and shared install plans.

4.1.7. Pod placement of Operator workloads

By default, Operator Lifecycle Manager (OLM) places pods on arbitrary worker nodes when installing an Operator or deploying Operand workloads. As an administrator, you can use projects with a combination of node selectors, taints, and tolerations to control the placement of Operators and Operands to specific nodes.

Controlling pod placement of Operator and Operand workloads has the following prerequisites:

  1. Determine a node or set of nodes to target for the pods per your requirements. If available, note an existing label, such as node-role.kubernetes.io/app, that identifies the node or nodes. Otherwise, add a label, such as myoperator, by using a compute machine set or editing the node directly. You will use this label in a later step as the node selector on your project.
  2. If you want to ensure that only pods with a certain label are allowed to run on the nodes, while steering unrelated workloads to other nodes, add a taint to the node or nodes by using a compute machine set or editing the node directly. Use an effect that ensures that new pods that do not match the taint cannot be scheduled on the nodes. For example, a myoperator:NoSchedule taint ensures that new pods that do not match the taint are not scheduled onto that node, but existing pods on the node are allowed to remain.
  3. Create a project that is configured with a default node selector and, if you added a taint, a matching toleration.

At this point, the project you created can be used to steer pods towards the specified nodes in the following scenarios:

For Operator pods
Administrators can create a Subscription object in the project as described in the following section. As a result, the Operator pods are placed on the specified nodes.
For Operand pods
Using an installed Operator, users can create an application in the project, which places the custom resource (CR) owned by the Operator in the project. As a result, the Operand pods are placed on the specified nodes, unless the Operator is deploying cluster-wide objects or resources in other namespaces, in which case this customized pod placement does not apply.

4.1.8. Controlling where an Operator is installed

By default, when you install an Operator, OpenShift Container Platform installs the Operator pod to one of your worker nodes randomly. However, there might be situations where you want that pod scheduled on a specific node or set of nodes.

The following examples describe situations where you might want to schedule an Operator pod to a specific node or set of nodes:

  • If an Operator requires a particular platform, such as amd64 or arm64
  • If an Operator requires a particular operating system, such as Linux or Windows
  • If you want Operators that work together scheduled on the same host or on hosts located on the same rack
  • If you want Operators dispersed throughout the infrastructure to avoid downtime due to network or hardware issues

You can control where an Operator pod is installed by adding node affinity, pod affinity, or pod anti-affinity constraints to the Operator’s Subscription object. Node affinity is a set of rules used by the scheduler to determine where a pod can be placed. Pod affinity enables you to ensure that related pods are scheduled to the same node. Pod anti-affinity allows you to prevent a pod from being scheduled on a node.

The following examples show how to use node affinity or pod anti-affinity to install an instance of the Custom Metrics Autoscaler Operator to a specific node in the cluster:

Node affinity example that places the Operator pod on a specific node

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-custom-metrics-autoscaler-operator
  namespace: openshift-keda
spec:
  name: my-package
  source: my-operators
  sourceNamespace: operator-registries
  config:
    affinity:
      nodeAffinity: 1
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - ip-10-0-163-94.us-west-2.compute.internal
#...

1
A node affinity that requires the Operator’s pod to be scheduled on a node named ip-10-0-163-94.us-west-2.compute.internal.

Node affinity example that places the Operator pod on a node with a specific platform

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-custom-metrics-autoscaler-operator
  namespace: openshift-keda
spec:
  name: my-package
  source: my-operators
  sourceNamespace: operator-registries
  config:
    affinity:
      nodeAffinity: 1
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
              - arm64
            - key: kubernetes.io/os
              operator: In
              values:
              - linux
#...

1
A node affinity that requires the Operator’s pod to be scheduled on a node with the kubernetes.io/arch=arm64 and kubernetes.io/os=linux labels.

Pod affinity example that places the Operator pod on one or more specific nodes

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-custom-metrics-autoscaler-operator
  namespace: openshift-keda
spec:
  name: my-package
  source: my-operators
  sourceNamespace: operator-registries
  config:
    affinity:
      podAffinity: 1
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - test
          topologyKey: kubernetes.io/hostname
#...

1
A pod affinity that places the Operator’s pod on a node that has pods with the app=test label.

Pod anti-affinity example that prevents the Operator pod from one or more specific nodes

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-custom-metrics-autoscaler-operator
  namespace: openshift-keda
spec:
  name: my-package
  source: my-operators
  sourceNamespace: operator-registries
  config:
    affinity:
      podAntiAffinity: 1
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: cpu
              operator: In
              values:
              - high
          topologyKey: kubernetes.io/hostname
#...

1
A pod anti-affinity that prevents the Operator’s pod from being scheduled on a node that has pods with the cpu=high label.

Procedure

To control the placement of an Operator pod, complete the following steps:

  1. Install the Operator as usual.
  2. If needed, ensure that your nodes are labeled to properly respond to the affinity.
  3. Edit the Operator Subscription object to add an affinity:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: openshift-custom-metrics-autoscaler-operator
      namespace: openshift-keda
    spec:
      name: my-package
      source: my-operators
      sourceNamespace: operator-registries
      config:
        affinity: 1
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                  - ip-10-0-185-229.ec2.internal
    #...
    1
    Add a nodeAffinity, podAffinity, or podAntiAffinity. See the Additional resources section that follows for information about creating the affinity.

Verification

  • To ensure that the pod is deployed on the specific node, run the following command:

    $ oc get pods -o wide

    Example output

    NAME                                                  READY   STATUS    RESTARTS   AGE   IP            NODE                           NOMINATED NODE   READINESS GATES
    custom-metrics-autoscaler-operator-5dcc45d656-bhshg   1/1     Running   0          50s   10.131.0.20   ip-10-0-185-229.ec2.internal   <none>           <none>

4.2. Updating installed Operators

As a cluster administrator, you can update Operators that have been previously installed using Operator Lifecycle Manager (OLM) on your OpenShift Container Platform cluster.

Note

For information on how OLM handles updates for installed Operators colocated in the same namespace, as well as an alternative method for installing Operators with custom global Operator groups, see Multitenancy and Operator colocation.

4.2.1. Preparing for an Operator update

The subscription of an installed Operator specifies an update channel that tracks and receives updates for the Operator. You can change the update channel to start tracking and receiving updates from a newer channel.

The names of update channels in a subscription can differ between Operators, but the naming scheme typically follows a common convention within a given Operator. For example, channel names might follow a minor release update stream for the application provided by the Operator (1.2, 1.3) or a release frequency (stable, fast).

Note

You cannot change installed Operators to a channel that is older than the current channel.

Red Hat Customer Portal Labs include the following application that helps administrators prepare to update their Operators:

You can use the application to search for Operator Lifecycle Manager-based Operators and verify the available Operator version per update channel across different versions of OpenShift Container Platform. Cluster Version Operator-based Operators are not included.

4.2.2. Changing the update channel for an Operator

You can change the update channel for an Operator by using the OpenShift Container Platform web console.

Tip

If the approval strategy in the subscription is set to Automatic, the update process initiates as soon as a new Operator version is available in the selected channel. If the approval strategy is set to Manual, you must manually approve pending updates.

Prerequisites

  • An Operator previously installed using Operator Lifecycle Manager (OLM).

Procedure

  1. In the Administrator perspective of the web console, navigate to Operators → Installed Operators.
  2. Click the name of the Operator you want to change the update channel for.
  3. Click the Subscription tab.
  4. Click the name of the update channel under Update channel.
  5. Click the newer update channel that you want to change to, then click Save.
  6. For subscriptions with an Automatic approval strategy, the update begins automatically. Navigate back to the Operators → Installed Operators page to monitor the progress of the update. When complete, the status changes to Succeeded and Up to date.

    For subscriptions with a Manual approval strategy, you can manually approve the update from the Subscription tab.

4.2.3. Manually approving a pending Operator update

If an installed Operator has the approval strategy in its subscription set to Manual, when new updates are released in its current update channel, the update must be manually approved before installation can begin.

Prerequisites

  • An Operator previously installed using Operator Lifecycle Manager (OLM).

Procedure

  1. In the Administrator perspective of the OpenShift Container Platform web console, navigate to Operators → Installed Operators.
  2. Operators that have a pending update display a status with Upgrade available. Click the name of the Operator you want to update.
  3. Click the Subscription tab. Any updates requiring approval are displayed next to Upgrade status. For example, it might display 1 requires approval.
  4. Click 1 requires approval, then click Preview Install Plan.
  5. Review the resources that are listed as available for update. When satisfied, click Approve.
  6. Navigate back to the Operators → Installed Operators page to monitor the progress of the update. When complete, the status changes to Succeeded and Up to date.

4.3. Deleting Operators from a cluster

The following describes how to delete, or uninstall, Operators that were previously installed using Operator Lifecycle Manager (OLM) on your OpenShift Container Platform cluster.

Important

You must successfully and completely uninstall an Operator prior to attempting to reinstall the same Operator. Failure to fully uninstall the Operator properly can leave resources, such as a project or namespace, stuck in a "Terminating" state and cause "error resolving resource" messages to be observed when trying to reinstall the Operator. For more information, see Reinstalling Operators after failed uninstallation.

4.3.1. Deleting Operators from a cluster using the web console

Cluster administrators can delete installed Operators from a selected namespace by using the web console.

Prerequisites

  • You have access to an OpenShift Container Platform cluster web console using an account with cluster-admin permissions.

Procedure

  1. Navigate to the OperatorsInstalled Operators page.
  2. Scroll or enter a keyword into the Filter by name field to find the Operator that you want to remove. Then, click on it.
  3. On the right side of the Operator Details page, select Uninstall Operator from the Actions list.

    An Uninstall Operator? dialog box is displayed.

  4. Select Uninstall to remove the Operator, Operator deployments, and pods. Following this action, the Operator stops running and no longer receives updates.

    Note

    This action does not remove resources managed by the Operator, including custom resource definitions (CRDs) and custom resources (CRs). Dashboards and navigation items enabled by the web console and off-cluster resources that continue to run might need manual clean up. To remove these after uninstalling the Operator, you might need to manually delete the Operator CRDs.

4.3.2. Deleting Operators from a cluster using the CLI

Cluster administrators can delete installed Operators from a selected namespace by using the CLI.

Prerequisites

  • You have access to an OpenShift Container Platform cluster using an account with cluster-admin permissions.
  • The OpenShift CLI (oc) is installed on your workstation.

Procedure

  1. Ensure the latest version of the subscribed operator (for example, serverless-operator) is identified in the currentCSV field.

    $ oc get subscription.operators.coreos.com serverless-operator -n openshift-serverless -o yaml | grep currentCSV

    Example output

      currentCSV: serverless-operator.v1.28.0

  2. Delete the subscription (for example, serverless-operator):

    $ oc delete subscription.operators.coreos.com serverless-operator -n openshift-serverless

    Example output

    subscription.operators.coreos.com "serverless-operator" deleted

  3. Delete the CSV for the Operator in the target namespace using the currentCSV value from the previous step:

    $ oc delete clusterserviceversion serverless-operator.v1.28.0 -n openshift-serverless

    Example output

    clusterserviceversion.operators.coreos.com "serverless-operator.v1.28.0" deleted

4.3.3. Refreshing failing subscriptions

In Operator Lifecycle Manager (OLM), if you subscribe to an Operator that references images that are not accessible on your network, you can find jobs in the openshift-marketplace namespace that are failing with the following errors:

Example output

ImagePullBackOff for
Back-off pulling image "example.com/openshift4/ose-elasticsearch-operator-bundle@sha256:6d2587129c846ec28d384540322b40b05833e7e00b25cca584e004af9a1d292e"

Example output

rpc error: code = Unknown desc = error pinging docker registry example.com: Get "https://example.com/v2/": dial tcp: lookup example.com on 10.0.0.1:53: no such host

As a result, the subscription is stuck in this failing state and the Operator is unable to install or upgrade.

You can refresh a failing subscription by deleting the subscription, cluster service version (CSV), and other related objects. After recreating the subscription, OLM then reinstalls the correct version of the Operator.

Prerequisites

  • You have a failing subscription that is unable to pull an inaccessible bundle image.
  • You have confirmed that the correct bundle image is accessible.

Procedure

  1. Get the names of the Subscription and ClusterServiceVersion objects from the namespace where the Operator is installed:

    $ oc get sub,csv -n <namespace>

    Example output

    NAME                                                       PACKAGE                  SOURCE             CHANNEL
    subscription.operators.coreos.com/elasticsearch-operator   elasticsearch-operator   redhat-operators   5.0
    
    NAME                                                                         DISPLAY                            VERSION    REPLACES   PHASE
    clusterserviceversion.operators.coreos.com/elasticsearch-operator.5.0.0-65   OpenShift Elasticsearch Operator   5.0.0-65              Succeeded

  2. Delete the subscription:

    $ oc delete subscription <subscription_name> -n <namespace>
  3. Delete the cluster service version:

    $ oc delete csv <csv_name> -n <namespace>
  4. Get the names of any failing jobs and related config maps in the openshift-marketplace namespace:

    $ oc get job,configmap -n openshift-marketplace

    Example output

    NAME                                                                        COMPLETIONS   DURATION   AGE
    job.batch/1de9443b6324e629ddf31fed0a853a121275806170e34c926d69e53a7fcbccb   1/1           26s        9m30s
    
    NAME                                                                        DATA   AGE
    configmap/1de9443b6324e629ddf31fed0a853a121275806170e34c926d69e53a7fcbccb   3      9m30s

  5. Delete the job:

    $ oc delete job <job_name> -n openshift-marketplace

    This ensures pods that try to pull the inaccessible image are not recreated.

  6. Delete the config map:

    $ oc delete configmap <configmap_name> -n openshift-marketplace
  7. Reinstall the Operator using OperatorHub in the web console.

Verification

  • Check that the Operator has been reinstalled successfully:

    $ oc get sub,csv,installplan -n <namespace>

4.4. Configuring Operator Lifecycle Manager features

The Operator Lifecycle Manager (OLM) controller is configured by an OLMConfig custom resource (CR) named cluster. Cluster administrators can modify this resource to enable or disable certain features.

This document outlines the features currently supported by OLM that are configured by the OLMConfig resource.

4.4.1. Disabling copied CSVs

When an Operator is installed by Operator Lifecycle Manager (OLM), a simplified copy of its cluster service version (CSV) is created by default in every namespace that the Operator is configured to watch. These CSVs are known as copied CSVs and communicate to users which controllers are actively reconciling resource events in a given namespace.

When an Operator is configured to use the AllNamespaces install mode, versus targeting a single or specified set of namespaces, a copied CSV for the Operator is created in every namespace on the cluster. On especially large clusters, with namespaces and installed Operators potentially in the hundreds or thousands, copied CSVs consume an untenable amount of resources, such as OLM’s memory usage, cluster etcd limits, and networking.

To support these larger clusters, cluster administrators can disable copied CSVs for Operators globally installed with the AllNamespaces mode.

Note

If you disable copied CSVs, an Operator installed in AllNamespaces mode has their CSV copied only to the openshift namespace, instead of every namespace on the cluster. In disabled copied CSVs mode, the behavior differs between the web console and CLI:

  • In the web console, the default behavior is modified to show copied CSVs from the openshift namespace in every namespace, even though the CSVs are not actually copied to every namespace. This allows regular users to still be able to view the details of these Operators in their namespaces and create related custom resources (CRs).
  • In the OpenShift CLI (oc), regular users can view Operators installed directly in their namespaces by using the oc get csvs command, but the copied CSVs from the openshift namespace are not visible in their namespaces. Operators affected by this limitation are still available and continue to reconcile events in the user’s namespace.

    To view a full list of installed global Operators, similar to the web console behavior, all authenticated users can run the following command:

    $ oc get csvs -n openshift

Procedure

  • Edit the OLMConfig object named cluster and set the spec.features.disableCopiedCSVs field to true:

    $ oc apply -f - <<EOF
    apiVersion: operators.coreos.com/v1
    kind: OLMConfig
    metadata:
      name: cluster
    spec:
      features:
        disableCopiedCSVs: true 1
    EOF
    1
    Disabled copied CSVs for AllNamespaces install mode Operators

Verification

  • When copied CSVs are disabled, OLM captures this information in an event in the Operator’s namespace:

    $ oc get events

    Example output

    LAST SEEN   TYPE      REASON               OBJECT                                MESSAGE
    85s         Warning   DisabledCopiedCSVs   clusterserviceversion/my-csv.v1.0.0   CSV copying disabled for operators/my-csv.v1.0.0

    When the spec.features.disableCopiedCSVs field is missing or set to false, OLM recreates the copied CSVs for all Operators installed with the AllNamespaces mode and deletes the previously mentioned events.

Additional resources

4.5. Configuring proxy support in Operator Lifecycle Manager

If a global proxy is configured on the OpenShift Container Platform cluster, Operator Lifecycle Manager (OLM) automatically configures Operators that it manages with the cluster-wide proxy. However, you can also configure installed Operators to override the global proxy or inject a custom CA certificate.

4.5.1. Overriding proxy settings of an Operator

If a cluster-wide egress proxy is configured, Operators running with Operator Lifecycle Manager (OLM) inherit the cluster-wide proxy settings on their deployments. Cluster administrators can also override these proxy settings by configuring the subscription of an Operator.

Important

Operators must handle setting environment variables for proxy settings in the pods for any managed Operands.

Prerequisites

  • Access to an OpenShift Container Platform cluster using an account with cluster-admin permissions.

Procedure

  1. Navigate in the web console to the Operators → OperatorHub page.
  2. Select the Operator and click Install.
  3. On the Install Operator page, modify the Subscription object to include one or more of the following environment variables in the spec section:

    • HTTP_PROXY
    • HTTPS_PROXY
    • NO_PROXY

    For example:

    Subscription object with proxy setting overrides

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: etcd-config-test
      namespace: openshift-operators
    spec:
      config:
        env:
        - name: HTTP_PROXY
          value: test_http
        - name: HTTPS_PROXY
          value: test_https
        - name: NO_PROXY
          value: test
      channel: clusterwide-alpha
      installPlanApproval: Automatic
      name: etcd
      source: community-operators
      sourceNamespace: openshift-marketplace
      startingCSV: etcdoperator.v0.9.4-clusterwide

    Note

    These environment variables can also be unset using an empty value to remove any previously set cluster-wide or custom proxy settings.

    OLM handles these environment variables as a unit; if at least one of them is set, all three are considered overridden and the cluster-wide defaults are not used for the deployments of the subscribed Operator.

  4. Click Install to make the Operator available to the selected namespaces.
  5. After the CSV for the Operator appears in the relevant namespace, you can verify that custom proxy environment variables are set in the deployment. For example, using the CLI:

    $ oc get deployment -n openshift-operators \
        etcd-operator -o yaml \
        | grep -i "PROXY" -A 2

    Example output

            - name: HTTP_PROXY
              value: test_http
            - name: HTTPS_PROXY
              value: test_https
            - name: NO_PROXY
              value: test
            image: quay.io/coreos/etcd-operator@sha256:66a37fd61a06a43969854ee6d3e21088a98b93838e284a6086b13917f96b0d9c
    ...

4.5.2. Injecting a custom CA certificate

When a cluster administrator adds a custom CA certificate to a cluster using a config map, the Cluster Network Operator merges the user-provided certificates and system CA certificates into a single bundle. You can inject this merged bundle into your Operator running on Operator Lifecycle Manager (OLM), which is useful if you have a man-in-the-middle HTTPS proxy.

Prerequisites

  • Access to an OpenShift Container Platform cluster using an account with cluster-admin permissions.
  • Custom CA certificate added to the cluster using a config map.
  • Desired Operator installed and running on OLM.

Procedure

  1. Create an empty config map in the namespace where the subscription for your Operator exists and include the following label:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: trusted-ca 1
      labels:
        config.openshift.io/inject-trusted-cabundle: "true" 2
    1
    Name of the config map.
    2
    Requests the Cluster Network Operator to inject the merged bundle.

    After creating this config map, it is immediately populated with the certificate contents of the merged bundle.

  2. Update the Subscription object to include a spec.config section that mounts the trusted-ca config map as a volume to each container within a pod that requires a custom CA:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: my-operator
    spec:
      package: etcd
      channel: alpha
      config: 1
        selector:
          matchLabels:
            <labels_for_pods> 2
        volumes: 3
        - name: trusted-ca
          configMap:
            name: trusted-ca
            items:
              - key: ca-bundle.crt 4
                path: tls-ca-bundle.pem 5
        volumeMounts: 6
        - name: trusted-ca
          mountPath: /etc/pki/ca-trust/extracted/pem
          readOnly: true
    1
    Add a config section if it does not exist.
    2
    Specify labels to match pods that are owned by the Operator.
    3
    Create a trusted-ca volume.
    4
    ca-bundle.crt is required as the config map key.
    5
    tls-ca-bundle.pem is required as the config map path.
    6
    Create a trusted-ca volume mount.
    Note

    Deployments of an Operator can fail to validate the authority and display a x509 certificate signed by unknown authority error. This error can occur even after injecting a custom CA when using the subscription of an Operator. In this case, you can set the mountPath as /etc/ssl/certs for trusted-ca by using the subscription of an Operator.

4.6. Viewing Operator status

Understanding the state of the system in Operator Lifecycle Manager (OLM) is important for making decisions about and debugging problems with installed Operators. OLM provides insight into subscriptions and related catalog sources regarding their state and actions performed. This helps users better understand the healthiness of their Operators.

4.6.1. Operator subscription condition types

Subscriptions can report the following condition types:

Table 4.1. Subscription condition types

ConditionDescription

CatalogSourcesUnhealthy

Some or all of the catalog sources to be used in resolution are unhealthy.

InstallPlanMissing

An install plan for a subscription is missing.

InstallPlanPending

An install plan for a subscription is pending installation.

InstallPlanFailed

An install plan for a subscription has failed.

ResolutionFailed

The dependency resolution for a subscription has failed.

Note

Default OpenShift Container Platform cluster Operators are managed by the Cluster Version Operator (CVO) and they do not have a Subscription object. Application Operators are managed by Operator Lifecycle Manager (OLM) and they have a Subscription object.

Additional resources

4.6.2. Viewing Operator subscription status by using the CLI

You can view Operator subscription status by using the CLI.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. List Operator subscriptions:

    $ oc get subs -n <operator_namespace>
  2. Use the oc describe command to inspect a Subscription resource:

    $ oc describe sub <subscription_name> -n <operator_namespace>
  3. In the command output, find the Conditions section for the status of Operator subscription condition types. In the following example, the CatalogSourcesUnhealthy condition type has a status of false because all available catalog sources are healthy:

    Example output

    Name:         cluster-logging
    Namespace:    openshift-logging
    Labels:       operators.coreos.com/cluster-logging.openshift-logging=
    Annotations:  <none>
    API Version:  operators.coreos.com/v1alpha1
    Kind:         Subscription
    # ...
    Conditions:
       Last Transition Time:  2019-07-29T13:42:57Z
       Message:               all available catalogsources are healthy
       Reason:                AllCatalogSourcesHealthy
       Status:                False
       Type:                  CatalogSourcesUnhealthy
    # ...

Note

Default OpenShift Container Platform cluster Operators are managed by the Cluster Version Operator (CVO) and they do not have a Subscription object. Application Operators are managed by Operator Lifecycle Manager (OLM) and they have a Subscription object.

4.6.3. Viewing Operator catalog source status by using the CLI

You can view the status of an Operator catalog source by using the CLI.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. List the catalog sources in a namespace. For example, you can check the openshift-marketplace namespace, which is used for cluster-wide catalog sources:

    $ oc get catalogsources -n openshift-marketplace

    Example output

    NAME                  DISPLAY               TYPE   PUBLISHER   AGE
    certified-operators   Certified Operators   grpc   Red Hat     55m
    community-operators   Community Operators   grpc   Red Hat     55m
    example-catalog       Example Catalog       grpc   Example Org 2m25s
    redhat-marketplace    Red Hat Marketplace   grpc   Red Hat     55m
    redhat-operators      Red Hat Operators     grpc   Red Hat     55m

  2. Use the oc describe command to get more details and status about a catalog source:

    $ oc describe catalogsource example-catalog -n openshift-marketplace

    Example output

    Name:         example-catalog
    Namespace:    openshift-marketplace
    Labels:       <none>
    Annotations:  operatorframework.io/managed-by: marketplace-operator
                  target.workload.openshift.io/management: {"effect": "PreferredDuringScheduling"}
    API Version:  operators.coreos.com/v1alpha1
    Kind:         CatalogSource
    # ...
    Status:
      Connection State:
        Address:              example-catalog.openshift-marketplace.svc:50051
        Last Connect:         2021-09-09T17:07:35Z
        Last Observed State:  TRANSIENT_FAILURE
      Registry Service:
        Created At:         2021-09-09T17:05:45Z
        Port:               50051
        Protocol:           grpc
        Service Name:       example-catalog
        Service Namespace:  openshift-marketplace
    # ...

    In the preceding example output, the last observed state is TRANSIENT_FAILURE. This state indicates that there is a problem establishing a connection for the catalog source.

  3. List the pods in the namespace where your catalog source was created:

    $ oc get pods -n openshift-marketplace

    Example output

    NAME                                    READY   STATUS             RESTARTS   AGE
    certified-operators-cv9nn               1/1     Running            0          36m
    community-operators-6v8lp               1/1     Running            0          36m
    marketplace-operator-86bfc75f9b-jkgbc   1/1     Running            0          42m
    example-catalog-bwt8z                   0/1     ImagePullBackOff   0          3m55s
    redhat-marketplace-57p8c                1/1     Running            0          36m
    redhat-operators-smxx8                  1/1     Running            0          36m

    When a catalog source is created in a namespace, a pod for the catalog source is created in that namespace. In the preceding example output, the status for the example-catalog-bwt8z pod is ImagePullBackOff. This status indicates that there is an issue pulling the catalog source’s index image.

  4. Use the oc describe command to inspect a pod for more detailed information:

    $ oc describe pod example-catalog-bwt8z -n openshift-marketplace

    Example output

    Name:         example-catalog-bwt8z
    Namespace:    openshift-marketplace
    Priority:     0
    Node:         ci-ln-jyryyg2-f76d1-ggdbq-worker-b-vsxjd/10.0.128.2
    ...
    Events:
      Type     Reason          Age                From               Message
      ----     ------          ----               ----               -------
      Normal   Scheduled       48s                default-scheduler  Successfully assigned openshift-marketplace/example-catalog-bwt8z to ci-ln-jyryyf2-f76d1-fgdbq-worker-b-vsxjd
      Normal   AddedInterface  47s                multus             Add eth0 [10.131.0.40/23] from openshift-sdn
      Normal   BackOff         20s (x2 over 46s)  kubelet            Back-off pulling image "quay.io/example-org/example-catalog:v1"
      Warning  Failed          20s (x2 over 46s)  kubelet            Error: ImagePullBackOff
      Normal   Pulling         8s (x3 over 47s)   kubelet            Pulling image "quay.io/example-org/example-catalog:v1"
      Warning  Failed          8s (x3 over 47s)   kubelet            Failed to pull image "quay.io/example-org/example-catalog:v1": rpc error: code = Unknown desc = reading manifest v1 in quay.io/example-org/example-catalog: unauthorized: access to the requested resource is not authorized
      Warning  Failed          8s (x3 over 47s)   kubelet            Error: ErrImagePull

    In the preceding example output, the error messages indicate that the catalog source’s index image is failing to pull successfully because of an authorization issue. For example, the index image might be stored in a registry that requires login credentials.

4.7. Managing Operator conditions

As a cluster administrator, you can manage Operator conditions by using Operator Lifecycle Manager (OLM).

4.7.1. Overriding Operator conditions

As a cluster administrator, you might want to ignore a supported Operator condition reported by an Operator. When present, Operator conditions in the Spec.Overrides array override the conditions in the Spec.Conditions array, allowing cluster administrators to deal with situations where an Operator is incorrectly reporting a state to Operator Lifecycle Manager (OLM).

Note

By default, the Spec.Overrides array is not present in an OperatorCondition object until it is added by a cluster administrator . The Spec.Conditions array is also not present until it is either added by a user or as a result of custom Operator logic.

For example, consider a known version of an Operator that always communicates that it is not upgradeable. In this instance, you might want to upgrade the Operator despite the Operator communicating that it is not upgradeable. This could be accomplished by overriding the Operator condition by adding the condition type and status to the Spec.Overrides array in the OperatorCondition object.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • An Operator with an OperatorCondition object, installed using OLM.

Procedure

  1. Edit the OperatorCondition object for the Operator:

    $ oc edit operatorcondition <name>
  2. Add a Spec.Overrides array to the object:

    Example Operator condition override

    apiVersion: operators.coreos.com/v1
    kind: OperatorCondition
    metadata:
      name: my-operator
      namespace: operators
    spec:
      overrides:
      - type: Upgradeable 1
        status: "True"
        reason: "upgradeIsSafe"
        message: "This is a known issue with the Operator where it always reports that it cannot be upgraded."
      conditions:
      - type: Upgradeable
        status: "False"
        reason: "migration"
        message: "The operator is performing a migration."
        lastTransitionTime: "2020-08-24T23:15:55Z"

    1
    Allows the cluster administrator to change the upgrade readiness to True.

4.7.2. Updating your Operator to use Operator conditions

Operator Lifecycle Manager (OLM) automatically creates an OperatorCondition resource for each ClusterServiceVersion resource that it reconciles. All service accounts in the CSV are granted the RBAC to interact with the OperatorCondition owned by the Operator.

An Operator author can develop their Operator to use the operator-lib library such that, after the Operator has been deployed by OLM, it can set its own conditions. For more resources about setting Operator conditions as an Operator author, see the Enabling Operator conditions page.

4.7.2.1. Setting defaults

In an effort to remain backwards compatible, OLM treats the absence of an OperatorCondition resource as opting out of the condition. Therefore, an Operator that opts in to using Operator conditions should set default conditions before the ready probe for the pod is set to true. This provides the Operator with a grace period to update the condition to the correct state.

4.7.3. Additional resources

4.8. Allowing non-cluster administrators to install Operators

Cluster administrators can use Operator groups to allow regular users to install Operators.

Additional resources

4.8.1. Understanding Operator installation policy

Operators can require wide privileges to run, and the required privileges can change between versions. Operator Lifecycle Manager (OLM) runs with cluster-admin privileges. By default, Operator authors can specify any set of permissions in the cluster service version (CSV), and OLM consequently grants it to the Operator.

To ensure that an Operator cannot achieve cluster-scoped privileges and that users cannot escalate privileges using OLM, Cluster administrators can manually audit Operators before they are added to the cluster. Cluster administrators are also provided tools for determining and constraining which actions are allowed during an Operator installation or upgrade using service accounts.

Cluster administrators can associate an Operator group with a service account that has a set of privileges granted to it. The service account sets policy on Operators to ensure they only run within predetermined boundaries by using role-based access control (RBAC) rules. As a result, the Operator is unable to do anything that is not explicitly permitted by those rules.

By employing Operator groups, users with enough privileges can install Operators with a limited scope. As a result, more of the Operator Framework tools can safely be made available to more users, providing a richer experience for building applications with Operators.

Note

Role-based access control (RBAC) for Subscription objects is automatically granted to every user with the edit or admin role in a namespace. However, RBAC does not exist on OperatorGroup objects; this absence is what prevents regular users from installing Operators. Preinstalling Operator groups is effectively what gives installation privileges.

Keep the following points in mind when associating an Operator group with a service account:

  • The APIService and CustomResourceDefinition resources are always created by OLM using the cluster-admin role. A service account associated with an Operator group should never be granted privileges to write these resources.
  • Any Operator tied to this Operator group is now confined to the permissions granted to the specified service account. If the Operator asks for permissions that are outside the scope of the service account, the install fails with appropriate errors so the cluster administrator can troubleshoot and resolve the issue.

4.8.1.1. Installation scenarios

When determining whether an Operator can be installed or upgraded on a cluster, Operator Lifecycle Manager (OLM) considers the following scenarios:

  • A cluster administrator creates a new Operator group and specifies a service account. All Operator(s) associated with this Operator group are installed and run against the privileges granted to the service account.
  • A cluster administrator creates a new Operator group and does not specify any service account. OpenShift Container Platform maintains backward compatibility, so the default behavior remains and Operator installs and upgrades are permitted.
  • For existing Operator groups that do not specify a service account, the default behavior remains and Operator installs and upgrades are permitted.
  • A cluster administrator updates an existing Operator group and specifies a service account. OLM allows the existing Operator to continue to run with their current privileges. When such an existing Operator is going through an upgrade, it is reinstalled and run against the privileges granted to the service account like any new Operator.
  • A service account specified by an Operator group changes by adding or removing permissions, or the existing service account is swapped with a new one. When existing Operators go through an upgrade, it is reinstalled and run against the privileges granted to the updated service account like any new Operator.
  • A cluster administrator removes the service account from an Operator group. The default behavior remains and Operator installs and upgrades are permitted.

4.8.1.2. Installation workflow

When an Operator group is tied to a service account and an Operator is installed or upgraded, Operator Lifecycle Manager (OLM) uses the following workflow:

  1. The given Subscription object is picked up by OLM.
  2. OLM fetches the Operator group tied to this subscription.
  3. OLM determines that the Operator group has a service account specified.
  4. OLM creates a client scoped to the service account and uses the scoped client to install the Operator. This ensures that any permission requested by the Operator is always confined to that of the service account in the Operator group.
  5. OLM creates a new service account with the set of permissions specified in the CSV and assigns it to the Operator. The Operator runs as the assigned service account.

4.8.2. Scoping Operator installations

To provide scoping rules to Operator installations and upgrades on Operator Lifecycle Manager (OLM), associate a service account with an Operator group.

Using this example, a cluster administrator can confine a set of Operators to a designated namespace.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Create a new namespace:

    $ cat <<EOF | oc create -f -
    apiVersion: v1
    kind: Namespace
    metadata:
      name: scoped
    EOF
  2. Allocate permissions that you want the Operator(s) to be confined to. This involves creating a new service account, relevant role(s), and role binding(s).

    $ cat <<EOF | oc create -f -
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: scoped
      namespace: scoped
    EOF

    The following example grants the service account permissions to do anything in the designated namespace for simplicity. In a production environment, you should create a more fine-grained set of permissions:

    $ cat <<EOF | oc create -f -
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: scoped
      namespace: scoped
    rules:
    - apiGroups: ["*"]
      resources: ["*"]
      verbs: ["*"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: scoped-bindings
      namespace: scoped
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: scoped
    subjects:
    - kind: ServiceAccount
      name: scoped
      namespace: scoped
    EOF
  3. Create an OperatorGroup object in the designated namespace. This Operator group targets the designated namespace to ensure that its tenancy is confined to it.

    In addition, Operator groups allow a user to specify a service account. Specify the service account created in the previous step:

    $ cat <<EOF | oc create -f -
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: scoped
      namespace: scoped
    spec:
      serviceAccountName: scoped
      targetNamespaces:
      - scoped
    EOF

    Any Operator installed in the designated namespace is tied to this Operator group and therefore to the service account specified.

    Warning

    Operator Lifecycle Manager (OLM) creates the following cluster roles for each Operator group:

    • <operatorgroup_name>-admin
    • <operatorgroup_name>-edit
    • <operatorgroup_name>-view

    When you manually create an Operator group, you must specify a unique name that does not conflict with the existing cluster roles or other Operator groups on the cluster.

  4. Create a Subscription object in the designated namespace to install an Operator:

    $ cat <<EOF | oc create -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: etcd
      namespace: scoped
    spec:
      channel: singlenamespace-alpha
      name: etcd
      source: <catalog_source_name> 1
      sourceNamespace: <catalog_source_namespace> 2
    EOF
    1
    Specify a catalog source that already exists in the designated namespace or one that is in the global catalog namespace.
    2
    Specify a namespace where the catalog source was created.

    Any Operator tied to this Operator group is confined to the permissions granted to the specified service account. If the Operator requests permissions that are outside the scope of the service account, the installation fails with relevant errors.

4.8.2.1. Fine-grained permissions

Operator Lifecycle Manager (OLM) uses the service account specified in an Operator group to create or update the following resources related to the Operator being installed:

  • ClusterServiceVersion
  • Subscription
  • Secret
  • ServiceAccount
  • Service
  • ClusterRole and ClusterRoleBinding
  • Role and RoleBinding

To confine Operators to a designated namespace, cluster administrators can start by granting the following permissions to the service account:

Note

The following role is a generic example and additional rules might be required based on the specific Operator.

kind: Role
rules:
- apiGroups: ["operators.coreos.com"]
  resources: ["subscriptions", "clusterserviceversions"]
  verbs: ["get", "create", "update", "patch"]
- apiGroups: [""]
  resources: ["services", "serviceaccounts"]
  verbs: ["get", "create", "update", "patch"]
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["roles", "rolebindings"]
  verbs: ["get", "create", "update", "patch"]
- apiGroups: ["apps"] 1
  resources: ["deployments"]
  verbs: ["list", "watch", "get", "create", "update", "patch", "delete"]
- apiGroups: [""] 2
  resources: ["pods"]
  verbs: ["list", "watch", "get", "create", "update", "patch", "delete"]
1 2
Add permissions to create other resources, such as deployments and pods shown here.

In addition, if any Operator specifies a pull secret, the following permissions must also be added:

kind: ClusterRole 1
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
---
kind: Role
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create", "update", "patch"]
1
Required to get the secret from the OLM namespace.

4.8.3. Operator catalog access control

When an Operator catalog is created in the global catalog namespace openshift-marketplace, the catalog’s Operators are made available cluster-wide to all namespaces. A catalog created in other namespaces only makes its Operators available in that same namespace of the catalog.

On clusters where non-cluster administrator users have been delegated Operator installation privileges, cluster administrators might want to further control or restrict the set of Operators those users are allowed to install. This can be achieved with the following actions:

  1. Disable all of the default global catalogs.
  2. Enable custom, curated catalogs in the same namespace where the relevant Operator groups have been preinstalled.

4.8.4. Troubleshooting permission failures

If an Operator installation fails due to lack of permissions, identify the errors using the following procedure.

Procedure

  1. Review the Subscription object. Its status has an object reference installPlanRef that points to the InstallPlan object that attempted to create the necessary [Cluster]Role[Binding] object(s) for the Operator:

    apiVersion: operators.coreos.com/v1
    kind: Subscription
    metadata:
      name: etcd
      namespace: scoped
    status:
      installPlanRef:
        apiVersion: operators.coreos.com/v1
        kind: InstallPlan
        name: install-4plp8
        namespace: scoped
        resourceVersion: "117359"
        uid: 2c1df80e-afea-11e9-bce3-5254009c9c23
  2. Check the status of the InstallPlan object for any errors:

    apiVersion: operators.coreos.com/v1
    kind: InstallPlan
    status:
      conditions:
      - lastTransitionTime: "2019-07-26T21:13:10Z"
        lastUpdateTime: "2019-07-26T21:13:10Z"
        message: 'error creating clusterrole etcdoperator.v0.9.4-clusterwide-dsfx4: clusterroles.rbac.authorization.k8s.io
          is forbidden: User "system:serviceaccount:scoped:scoped" cannot create resource
          "clusterroles" in API group "rbac.authorization.k8s.io" at the cluster scope'
        reason: InstallComponentFailed
        status: "False"
        type: Installed
      phase: Failed

    The error message tells you:

    • The type of resource it failed to create, including the API group of the resource. In this case, it was clusterroles in the rbac.authorization.k8s.io group.
    • The name of the resource.
    • The type of error: is forbidden tells you that the user does not have enough permission to do the operation.
    • The name of the user who attempted to create or update the resource. In this case, it refers to the service account specified in the Operator group.
    • The scope of the operation: cluster scope or not.

      The user can add the missing permission to the service account and then iterate.

      Note

      Operator Lifecycle Manager (OLM) does not currently provide the complete list of errors on the first try.

4.9. Managing custom catalogs

Cluster administrators and Operator catalog maintainers can create and manage custom catalogs packaged using the bundle format on Operator Lifecycle Manager (OLM) in OpenShift Container Platform.

Important

Kubernetes periodically deprecates certain APIs that are removed in subsequent releases. As a result, Operators are unable to use removed APIs starting with the version of OpenShift Container Platform that uses the Kubernetes version that removed the API.

If your cluster is using custom catalogs, see Controlling Operator compatibility with OpenShift Container Platform versions for more details about how Operator authors can update their projects to help avoid workload issues and prevent incompatible upgrades.

4.9.1. Prerequisites

4.9.2. File-based catalogs

File-based catalogs are the latest iteration of the catalog format in Operator Lifecycle Manager (OLM). It is a plain text-based (JSON or YAML) and declarative config evolution of the earlier SQLite database format, and it is fully backwards compatible.

Important

As of OpenShift Container Platform 4.11, the default Red Hat-provided Operator catalog releases in the file-based catalog format. The default Red Hat-provided Operator catalogs for OpenShift Container Platform 4.6 through 4.10 released in the deprecated SQLite database format.

The opm subcommands, flags, and functionality related to the SQLite database format are also deprecated and will be removed in a future release. The features are still supported and must be used for catalogs that use the deprecated SQLite database format.

Many of the opm subcommands and flags for working with the SQLite database format, such as opm index prune, do not work with the file-based catalog format. For more information about working with file-based catalogs, see Operator Framework packaging format and Mirroring images for a disconnected installation using the oc-mirror plugin.

4.9.2.1. Creating a file-based catalog image

You can use the opm CLI to create a catalog image that uses the plain text file-based catalog format (JSON or YAML), which replaces the deprecated SQLite database format.

Prerequisites

  • You have installed the opm CLI.
  • You have podman version 1.9.3+.
  • A bundle image is built and pushed to a registry that supports Docker v2-2.

Procedure

  1. Initialize the catalog:

    1. Create a directory for the catalog by running the following command:

      $ mkdir <catalog_dir>
    2. Generate a Dockerfile that can build a catalog image by running the opm generate dockerfile command:

      $ opm generate dockerfile <catalog_dir> \
          -i registry.redhat.io/openshift4/ose-operator-registry:v4.13 1
      1
      Specify the official Red Hat base image by using the -i flag, otherwise the Dockerfile uses the default upstream image.

      The Dockerfile must be in the same parent directory as the catalog directory that you created in the previous step:

      Example directory structure

      . 1
      ├── <catalog_dir> 2
      └── <catalog_dir>.Dockerfile 3

      1
      Parent directory
      2
      Catalog directory
      3
      Dockerfile generated by the opm generate dockerfile command
    3. Populate the catalog with the package definition for your Operator by running the opm init command:

      $ opm init <operator_name> \ 1
          --default-channel=preview \ 2
          --description=./README.md \ 3
          --icon=./operator-icon.svg \ 4
          --output yaml \ 5
          > <catalog_dir>/index.yaml 6
      1
      Operator, or package, name
      2
      Channel that subscriptions default to if unspecified
      3
      Path to the Operator’s README.md or other documentation
      4
      Path to the Operator’s icon
      5
      Output format: JSON or YAML
      6
      Path for creating the catalog configuration file

      This command generates an olm.package declarative config blob in the specified catalog configuration file.

  2. Add a bundle to the catalog by running the opm render command:

    $ opm render <registry>/<namespace>/<bundle_image_name>:<tag> \ 1
        --output=yaml \
        >> <catalog_dir>/index.yaml 2
    1
    Pull spec for the bundle image
    2
    Path to the catalog configuration file
    Note

    Channels must contain at least one bundle.

  3. Add a channel entry for the bundle. For example, modify the following example to your specifications, and add it to your <catalog_dir>/index.yaml file:

    Example channel entry

    ---
    schema: olm.channel
    package: <operator_name>
    name: preview
    entries:
      - name: <operator_name>.v0.1.0 1

    1
    Ensure that you include the period (.) after <operator_name> but before the v in the version. Otherwise, the entry fails to pass the opm validate command.
  4. Validate the file-based catalog:

    1. Run the opm validate command against the catalog directory:

      $ opm validate <catalog_dir>
    2. Check that the error code is 0:

      $ echo $?

      Example output

      0

  5. Build the catalog image by running the podman build command:

    $ podman build . \
        -f <catalog_dir>.Dockerfile \
        -t <registry>/<namespace>/<catalog_image_name>:<tag>
  6. Push the catalog image to a registry:

    1. If required, authenticate with your target registry by running the podman login command:

      $ podman login <registry>
    2. Push the catalog image by running the podman push command:

      $ podman push <registry>/<namespace>/<catalog_image_name>:<tag>

Additional resources

4.9.2.2. Updating or filtering a file-based catalog image

You can use the opm CLI to update or filter (also known as prune) a catalog image that uses the file-based catalog format. By extracting and modifying the contents of an existing catalog image, you can update, add, or remove one or more Operator package entries from the catalog. You can then rebuild the image as an updated version of the catalog.

Note

Alternatively, if you already have a catalog image on a mirror registry, you can use the oc-mirror CLI plugin to automatically prune any removed images from an updated source version of that catalog image while mirroring it to the target registry.

For more information about the oc-mirror plugin and this use case, see the "Keeping your mirror registry content updated" section, and specifically the "Pruning images" subsection, of "Mirroring images for a disconnected installation using the oc-mirror plugin".

Prerequisites

  • You have the following on your workstation:

    • The opm CLI.
    • podman version 1.9.3+.
    • A file-based catalog image.
    • A catalog directory structure recently initialized on your workstation related to this catalog.

      If you do not have an initialized catalog directory, create the directory and generate the Dockerfile. For more information, see the "Initialize the catalog" step from the "Creating a file-based catalog image" procedure.

Procedure

  1. Extract the contents of the catalog image in YAML format to an index.yaml file in your catalog directory:

    $ opm render <registry>/<namespace>/<catalog_image_name>:<tag> \
        -o yaml > <catalog_dir>/index.yaml
    Note

    Alternatively, you can use the -o json flag to output in JSON format.

  2. Modify the contents of the resulting index.yaml file to your specifications by updating, adding, or removing one or more Operator package entries.

    Important

    After a bundle has been published in a catalog, assume that one of your users has installed it. Ensure that all previously published bundles in a catalog have an update path to the current or newer channel head to avoid stranding users that have that version installed.

    For example, if you wanted to remove an Operator package, the following example lists a set of olm.package, olm.channel, and olm.bundle blobs which must be deleted to remove the package from the catalog:

    Example 4.2. Example removed entries

    ---
    defaultChannel: release-2.7
    icon:
      base64data: <base64_string>
      mediatype: image/svg+xml
    name: example-operator
    schema: olm.package
    ---
    entries:
    - name: example-operator.v2.7.0
      skipRange: '>=2.6.0 <2.7.0'
    - name: example-operator.v2.7.1
      replaces: example-operator.v2.7.0
      skipRange: '>=2.6.0 <2.7.1'
    - name: example-operator.v2.7.2
      replaces: example-operator.v2.7.1
      skipRange: '>=2.6.0 <2.7.2'
    - name: example-operator.v2.7.3
      replaces: example-operator.v2.7.2
      skipRange: '>=2.6.0 <2.7.3'
    - name: example-operator.v2.7.4
      replaces: example-operator.v2.7.3
      skipRange: '>=2.6.0 <2.7.4'
    name: release-2.7
    package: example-operator
    schema: olm.channel
    ---
    image: example.com/example-inc/example-operator-bundle@sha256:<digest>
    name: example-operator.v2.7.0
    package: example-operator
    properties:
    - type: olm.gvk
      value:
        group: example-group.example.io
        kind: MyObject
        version: v1alpha1
    - type: olm.gvk
      value:
        group: example-group.example.io
        kind: MyOtherObject
        version: v1beta1
    - type: olm.package
      value:
        packageName: example-operator
        version: 2.7.0
    - type: olm.bundle.object
      value:
        data: <base64_string>
    - type: olm.bundle.object
      value:
        data: <base64_string>
    relatedImages:
    - image: example.com/example-inc/example-related-image@sha256:<digest>
      name: example-related-image
    schema: olm.bundle
    ---
  3. Save your changes to the index.yaml file.
  4. Validate the catalog:

    $ opm validate <catalog_dir>
  5. Rebuild the catalog:

    $ podman build . \
        -f <catalog_dir>.Dockerfile \
        -t <registry>/<namespace>/<catalog_image_name>:<tag>
  6. Push the updated catalog image to a registry:

    $ podman push <registry>/<namespace>/<catalog_image_name>:<tag>

Verification

  1. In the web console, navigate to the OperatorHub configuration resource in the AdministrationCluster SettingsConfiguration page.
  2. Add the catalog source or update the existing catalog source to use the pull spec for your updated catalog image.

    For more information, see "Adding a catalog source to a cluster" in the "Additional resources" of this section.

  3. After the catalog source is in a READY state, navigate to the OperatorsOperatorHub page and check that the changes you made are reflected in the list of Operators.

4.9.3. SQLite-based catalogs

Important

The SQLite database format for Operator catalogs is a deprecated feature. Deprecated functionality is still included in OpenShift Container Platform and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.

For the most recent list of major functionality that has been deprecated or removed within OpenShift Container Platform, refer to the Deprecated and removed features section of the OpenShift Container Platform release notes.

4.9.3.1. Creating a SQLite-based index image

You can create an index image based on the SQLite database format by using the opm CLI.

Prerequisites

  • You have installed the opm CLI.
  • You have podman version 1.9.3+.
  • A bundle image is built and pushed to a registry that supports Docker v2-2.

Procedure

  1. Start a new index:

    $ opm index add \
        --bundles <registry>/<namespace>/<bundle_image_name>:<tag> \1
        --tag <registry>/<namespace>/<index_image_name>:<tag> \2
        [--binary-image <registry_base_image>] 3
    1
    Comma-separated list of bundle images to add to the index.
    2
    The image tag that you want the index image to have.
    3
    Optional: An alternative registry base image to use for serving the catalog.
  2. Push the index image to a registry.

    1. If required, authenticate with your target registry:

      $ podman login <registry>
    2. Push the index image:

      $ podman push <registry>/<namespace>/<index_image_name>:<tag>

4.9.3.2. Updating a SQLite-based index image

After configuring OperatorHub to use a catalog source that references a custom index image, cluster administrators can keep the available Operators on their cluster up-to-date by adding bundle images to the index image.

You can update an existing index image using the opm index add command.

Prerequisites

  • You have installed the opm CLI.
  • You have podman version 1.9.3+.
  • An index image is built and pushed to a registry.
  • You have an existing catalog source referencing the index image.

Procedure

  1. Update the existing index by adding bundle images:

    $ opm index add \
        --bundles <registry>/<namespace>/<new_bundle_image>@sha256:<digest> \1
        --from-index <registry>/<namespace>/<existing_index_image>:<existing_tag> \2
        --tag <registry>/<namespace>/<existing_index_image>:<updated_tag> \3
        --pull-tool podman 4
    1
    The --bundles flag specifies a comma-separated list of additional bundle images to add to the index.
    2
    The --from-index flag specifies the previously pushed index.
    3
    The --tag flag specifies the image tag to apply to the updated index image.
    4
    The --pull-tool flag specifies the tool used to pull container images.

    where:

    <registry>
    Specifies the hostname of the registry, such as quay.io or mirror.example.com.
    <namespace>
    Specifies the namespace of the registry, such as ocs-dev or abc.
    <new_bundle_image>
    Specifies the new bundle image to add to the registry, such as ocs-operator.
    <digest>
    Specifies the SHA image ID, or digest, of the bundle image, such as c7f11097a628f092d8bad148406aa0e0951094a03445fd4bc0775431ef683a41.
    <existing_index_image>
    Specifies the previously pushed image, such as abc-redhat-operator-index.
    <existing_tag>
    Specifies a previously pushed image tag, such as 4.13.
    <updated_tag>
    Specifies the image tag to apply to the updated index image, such as 4.13.1.

    Example command

    $ opm index add \
        --bundles quay.io/ocs-dev/ocs-operator@sha256:c7f11097a628f092d8bad148406aa0e0951094a03445fd4bc0775431ef683a41 \
        --from-index mirror.example.com/abc/abc-redhat-operator-index:4.13 \
        --tag mirror.example.com/abc/abc-redhat-operator-index:4.13.1 \
        --pull-tool podman

  2. Push the updated index image:

    $ podman push <registry>/<namespace>/<existing_index_image>:<updated_tag>
  3. After Operator Lifecycle Manager (OLM) automatically polls the index image referenced in the catalog source at its regular interval, verify that the new packages are successfully added:

    $ oc get packagemanifests -n openshift-marketplace

4.9.3.3. Filtering a SQLite-based index image

An index image, based on the Operator bundle format, is a containerized snapshot of an Operator catalog. You can filter, or prune, an index of all but a specified list of packages, which creates a copy of the source index containing only the Operators that you want.

Prerequisites

  • You have podman version 1.9.3+.
  • You have grpcurl (third-party command-line tool).
  • You have installed the opm CLI.
  • You have access to a registry that supports Docker v2-2.

Procedure

  1. Authenticate with your target registry:

    $ podman login <target_registry>
  2. Determine the list of packages you want to include in your pruned index.

    1. Run the source index image that you want to prune in a container. For example:

      $ podman run -p50051:50051 \
          -it registry.redhat.io/redhat/redhat-operator-index:v4.13

      Example output

      Trying to pull registry.redhat.io/redhat/redhat-operator-index:v4.13...
      Getting image source signatures
      Copying blob ae8a0c23f5b1 done
      ...
      INFO[0000] serving registry                              database=/database/index.db port=50051

    2. In a separate terminal session, use the grpcurl command to get a list of the packages provided by the index:

      $ grpcurl -plaintext localhost:50051 api.Registry/ListPackages > packages.out
    3. Inspect the packages.out file and identify which package names from this list you want to keep in your pruned index. For example:

      Example snippets of packages list

      ...
      {
        "name": "advanced-cluster-management"
      }
      ...
      {
        "name": "jaeger-product"
      }
      ...
      {
      {
        "name": "quay-operator"
      }
      ...

    4. In the terminal session where you executed the podman run command, press Ctrl and C to stop the container process.
  3. Run the following command to prune the source index of all but the specified packages:

    $ opm index prune \
        -f registry.redhat.io/redhat/redhat-operator-index:v4.13 \1
        -p advanced-cluster-management,jaeger-product,quay-operator \2
        [-i registry.redhat.io/openshift4/ose-operator-registry:v4.9] \3
        -t <target_registry>:<port>/<namespace>/redhat-operator-index:v4.13 4
    1
    Index to prune.
    2
    Comma-separated list of packages to keep.
    3
    Required only for IBM Power and IBM Z images: Operator Registry base image with the tag that matches the target OpenShift Container Platform cluster major and minor version.
    4
    Custom tag for new index image being built.
  4. Run the following command to push the new index image to your target registry:

    $ podman push <target_registry>:<port>/<namespace>/redhat-operator-index:v4.13

    where <namespace> is any existing namespace on the registry.

4.9.4. Catalog sources and pod security admission

Pod security admission was introduced in OpenShift Container Platform 4.11 to ensure pod security standards. Catalog sources built using the SQLite-based catalog format and a version of the opm CLI tool released before OpenShift Container Platform 4.11 cannot run under restricted pod security enforcement.

In OpenShift Container Platform 4.13, namespaces do not have restricted pod security enforcement by default and the default catalog source security mode is set to legacy.

Default restricted enforcement for all namespaces is planned for inclusion in a future OpenShift Container Platform release. When restricted enforcement occurs, the security context of the pod specification for catalog source pods must match the restricted pod security standard. If your catalog source image requires a different pod security standard, the pod security admissions label for the namespace must be explicitly set.

Note

If you do not want to run your SQLite-based catalog source pods as restricted, you do not need to update your catalog source in OpenShift Container Platform 4.13.

However, it is recommended that you take action now to ensure your catalog sources run under restricted pod security enforcement. If you do not take action to ensure your catalog sources run under restricted pod security enforcement, your catalog sources might not run in future OpenShift Container Platform releases.

As a catalog author, you can enable compatibility with restricted pod security enforcement by completing either of the following actions:

  • Migrate your catalog to the file-based catalog format.
  • Update your catalog image with a version of the opm CLI tool released with OpenShift Container Platform 4.11 or later.
Note

The SQLite database catalog format is deprecated, but still supported by Red Hat. In a future release, the SQLite database format will not be supported, and catalogs will need to migrate to the file-based catalog format. As of OpenShift Container Platform 4.11, the default Red Hat-provided Operator catalog is released in the file-based catalog format. File-based catalogs are compatible with restricted pod security enforcement.

If you do not want to update your SQLite database catalog image or migrate your catalog to the file-based catalog format, you can configure your catalog to run with elevated permissions.

4.9.4.1. Migrating SQLite database catalogs to the file-based catalog format

You can update your deprecated SQLite database format catalogs to the file-based catalog format.

Prerequisites

  • You have a SQLite database catalog source.
  • You have access to the cluster as a user with the cluster-admin role.
  • You have the latest version of the opm CLI tool released with OpenShift Container Platform 4.13 on your workstation.

Procedure

  1. Migrate your SQLite database catalog to a file-based catalog by running the following command:

    $ opm migrate <registry_image> <fbc_directory>
  2. Generate a Dockerfile for your file-based catalog by running the following command:

    $ opm generate dockerfile <fbc_directory> \
      --binary-image \
      registry.redhat.io/openshift4/ose-operator-registry:v4.13

Next steps

  • The generated Dockerfile can be built, tagged, and pushed to your registry.

4.9.4.2. Rebuilding SQLite database catalog images

You can rebuild your SQLite database catalog image with the latest version of the opm CLI tool that is released with your version of OpenShift Container Platform.

Prerequisites

  • You have a SQLite database catalog source.
  • You have access to the cluster as a user with the cluster-admin role.
  • You have the latest version of the opm CLI tool released with OpenShift Container Platform 4.13 on your workstation.

Procedure

  • Run the following command to rebuild your catalog with a more recent version of the opm CLI tool:

    $ opm index add --binary-image \
      registry.redhat.io/openshift4/ose-operator-registry:v4.13 \
      --from-index <your_registry_image> \
      --bundles "" -t \<your_registry_image>

4.9.4.3. Configuring catalogs to run with elevated permissions

If you do not want to update your SQLite database catalog image or migrate your catalog to the file-based catalog format, you can perform the following actions to ensure your catalog source runs when the default pod security enforcement changes to restricted:

  • Manually set the catalog security mode to legacy in your catalog source definition. This action ensures your catalog runs with legacy permissions even if the default catalog security mode changes to restricted.
  • Label the catalog source namespace for baseline or privileged pod security enforcement.
Note

The SQLite database catalog format is deprecated, but still supported by Red Hat. In a future release, the SQLite database format will not be supported, and catalogs will need to migrate to the file-based catalog format. File-based catalogs are compatible with restricted pod security enforcement.

Prerequisites

  • You have a SQLite database catalog source.
  • You have access to the cluster as a user with the cluster-admin role.
  • You have a target namespace that supports running pods with the elevated pod security admission standard of baseline or privileged.

Procedure

  1. Edit the CatalogSource definition by setting the spec.grpcPodConfig.securityContextConfig label to legacy, as shown in the following example:

    Example CatalogSource definition

    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: my-catsrc
      namespace: my-ns
    spec:
      sourceType: grpc
      grpcPodConfig:
        securityContextConfig: legacy
      image: my-image:latest

    Tip

    In OpenShift Container Platform 4.13, the spec.grpcPodConfig.securityContextConfig field is set to legacy by default. In a future release of OpenShift Container Platform, it is planned that the default setting will change to restricted. If your catalog cannot run under restricted enforcement, it is recommended that you manually set this field to legacy.

  2. Edit your <namespace>.yaml file to add elevated pod security admission standards to your catalog source namespace, as shown in the following example:

    Example <namespace>.yaml file

    apiVersion: v1
    kind: Namespace
    metadata:
    ...
      labels:
        security.openshift.io/scc.podSecurityLabelSync: "false" 1
        openshift.io/cluster-monitoring: "true"
        pod-security.kubernetes.io/enforce: baseline 2
      name: "<namespace_name>"

    1
    Turn off pod security label synchronization by adding the security.openshift.io/scc.podSecurityLabelSync=false label to the namespace.
    2
    Apply the pod security admission pod-security.kubernetes.io/enforce label. Set the label to baseline or privileged. Use the baseline pod security profile unless other workloads in the namespace require a privileged profile.

4.9.5. Adding a catalog source to a cluster

Adding a catalog source to an OpenShift Container Platform cluster enables the discovery and installation of Operators for users. Cluster administrators can create a CatalogSource object that references an index image. OperatorHub uses catalog sources to populate the user interface.

Tip

Alternatively, you can use the web console to manage catalog sources. From the AdministrationCluster SettingsConfigurationOperatorHub page, click the Sources tab, where you can create, update, delete, disable, and enable individual sources.

Prerequisites

  • You built and pushed an index image to a registry.
  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. Create a CatalogSource object that references your index image.

    1. Modify the following to your specifications and save it as a catalogSource.yaml file:

      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      metadata:
        name: my-operator-catalog
        namespace: openshift-marketplace 1
        annotations:
          olm.catalogImageTemplate: 2
            "<registry>/<namespace>/<index_image_name>:v{kube_major_version}.{kube_minor_version}.{kube_patch_version}"
      spec:
        sourceType: grpc
        grpcPodConfig:
          securityContextConfig: <security_mode> 3
        image: <registry>/<namespace>/<index_image_name>:<tag> 4
        displayName: My Operator Catalog
        publisher: <publisher_name> 5
        updateStrategy:
          registryPoll: 6
            interval: 30m
      1
      If you want the catalog source to be available globally to users in all namespaces, specify the openshift-marketplace namespace. Otherwise, you can specify a different namespace for the catalog to be scoped and available only for that namespace.
      2
      Optional: Set the olm.catalogImageTemplate annotation to your index image name and use one or more of the Kubernetes cluster version variables as shown when constructing the template for the image tag.
      3
      Specify the value of legacy or restricted. If the field is not set, the default value is legacy. In a future OpenShift Container Platform release, it is planned that the default value will be restricted. If your catalog cannot run with restricted permissions, it is recommended that you manually set this field to legacy.
      4
      Specify your index image. If you specify a tag after the image name, for example :v4.13, the catalog source pod uses an image pull policy of Always, meaning the pod always pulls the image prior to starting the container. If you specify a digest, for example @sha256:<id>, the image pull policy is IfNotPresent, meaning the pod pulls the image only if it does not already exist on the node.
      5
      Specify your name or an organization name publishing the catalog.
      6
      Catalog sources can automatically check for new versions to keep up to date.
    2. Use the file to create the CatalogSource object:

      $ oc apply -f catalogSource.yaml
  2. Verify the following resources are created successfully.

    1. Check the pods:

      $ oc get pods -n openshift-marketplace

      Example output

      NAME                                    READY   STATUS    RESTARTS  AGE
      my-operator-catalog-6njx6               1/1     Running   0         28s
      marketplace-operator-d9f549946-96sgr    1/1     Running   0         26h

    2. Check the catalog source:

      $ oc get catalogsource -n openshift-marketplace

      Example output

      NAME                  DISPLAY               TYPE PUBLISHER  AGE
      my-operator-catalog   My Operator Catalog   grpc            5s

    3. Check the package manifest:

      $ oc get packagemanifest -n openshift-marketplace

      Example output

      NAME                          CATALOG               AGE
      jaeger-product                My Operator Catalog   93s

You can now install the Operators from the OperatorHub page on your OpenShift Container Platform web console.

4.9.6. Accessing images for Operators from private registries

If certain images relevant to Operators managed by Operator Lifecycle Manager (OLM) are hosted in an authenticated container image registry, also known as a private registry, OLM and OperatorHub are unable to pull the images by default. To enable access, you can create a pull secret that contains the authentication credentials for the registry. By referencing one or more pull secrets in a catalog source, OLM can handle placing the secrets in the Operator and catalog namespace to allow installation.

Other images required by an Operator or its Operands might require access to private registries as well. OLM does not handle placing the secrets in target tenant namespaces for this scenario, but authentication credentials can be added to the global cluster pull secret or individual namespace service accounts to enable the required access.

The following types of images should be considered when determining whether Operators managed by OLM have appropriate pull access:

Index images
A CatalogSource object can reference an index image, which use the Operator bundle format and are catalog sources packaged as container images hosted in images registries. If an index image is hosted in a private registry, a secret can be used to enable pull access.
Bundle images
Operator bundle images are metadata and manifests packaged as container images that represent a unique version of an Operator. If any bundle images referenced in a catalog source are hosted in one or more private registries, a secret can be used to enable pull access.
Operator and Operand images

If an Operator installed from a catalog source uses a private image, either for the Operator image itself or one of the Operand images it watches, the Operator will fail to install because the deployment will not have access to the required registry authentication. Referencing secrets in a catalog source does not enable OLM to place the secrets in target tenant namespaces in which Operands are installed.

Instead, the authentication details can be added to the global cluster pull secret in the openshift-config namespace, which provides access to all namespaces on the cluster. Alternatively, if providing access to the entire cluster is not permissible, the pull secret can be added to the default service accounts of the target tenant namespaces.

Prerequisites

  • You have at least one of the following hosted in a private registry:

    • An index image or catalog image.
    • An Operator bundle image.
    • An Operator or Operand image.
  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. Create a secret for each required private registry.

    1. Log in to the private registry to create or update your registry credentials file:

      $ podman login <registry>:<port>
      Note

      The file path of your registry credentials can be different depending on the container tool used to log in to the registry. For the podman CLI, the default location is ${XDG_RUNTIME_DIR}/containers/auth.json. For the docker CLI, the default location is /root/.docker/config.json.

    2. It is recommended to include credentials for only one registry per secret, and manage credentials for multiple registries in separate secrets. Multiple secrets can be included in a CatalogSource object in later steps, and OpenShift Container Platform will merge the secrets into a single virtual credentials file for use during an image pull.

      A registry credentials file can, by default, store details for more than one registry or for multiple repositories in one registry. Verify the current contents of your file. For example:

      File storing credentials for multiple registries

      {
          "auths": {
              "registry.redhat.io": {
                  "auth": "FrNHNydQXdzclNqdg=="
              },
              "quay.io": {
                  "auth": "fegdsRib21iMQ=="
              },
              "https://quay.io/my-namespace/my-user/my-image": {
                  "auth": "eWfjwsDdfsa221=="
              },
              "https://quay.io/my-namespace/my-user": {
                  "auth": "feFweDdscw34rR=="
              },
              "https://quay.io/my-namespace": {
                  "auth": "frwEews4fescyq=="
              }
          }
      }

      Because this file is used to create secrets in later steps, ensure that you are storing details for only one registry per file. This can be accomplished by using either of the following methods:

      • Use the podman logout <registry> command to remove credentials for additional registries until only the one registry you want remains.
      • Edit your registry credentials file and separate the registry details to be stored in multiple files. For example:

        File storing credentials for one registry

        {
                "auths": {
                        "registry.redhat.io": {
                                "auth": "FrNHNydQXdzclNqdg=="
                        }
                }
        }

        File storing credentials for another registry

        {
                "auths": {
                        "quay.io": {
                                "auth": "Xd2lhdsbnRib21iMQ=="
                        }
                }
        }

    3. Create a secret in the openshift-marketplace namespace that contains the authentication credentials for a private registry:

      $ oc create secret generic <secret_name> \
          -n openshift-marketplace \
          --from-file=.dockerconfigjson=<path/to/registry/credentials> \
          --type=kubernetes.io/dockerconfigjson

      Repeat this step to create additional secrets for any other required private registries, updating the --from-file flag to specify another registry credentials file path.

  2. Create or update an existing CatalogSource object to reference one or more secrets:

    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: my-operator-catalog
      namespace: openshift-marketplace
    spec:
      sourceType: grpc
      secrets: 1
      - "<secret_name_1>"
      - "<secret_name_2>"
      grpcPodConfig:
        securityContextConfig: <security_mode> 2
      image: <registry>:<port>/<namespace>/<image>:<tag>
      displayName: My Operator Catalog
      publisher: <publisher_name>
      updateStrategy:
        registryPoll:
          interval: 30m
    1
    Add a spec.secrets section and specify any required secrets.
    2
    Specify the value of legacy or restricted. If the field is not set, the default value is legacy. In a future OpenShift Container Platform release, it is planned that the default value will be restricted. If your catalog cannot run with restricted permissions, it is recommended that you manually set this field to legacy.
  3. If any Operator or Operand images that are referenced by a subscribed Operator require access to a private registry, you can either provide access to all namespaces in the cluster, or individual target tenant namespaces.

    • To provide access to all namespaces in the cluster, add authentication details to the global cluster pull secret in the openshift-config namespace.

      Warning

      Cluster resources must adjust to the new global pull secret, which can temporarily limit the usability of the cluster.

      1. Extract the .dockerconfigjson file from the global pull secret:

        $ oc extract secret/pull-secret -n openshift-config --confirm
      2. Update the .dockerconfigjson file with your authentication credentials for the required private registry or registries and save it as a new file:

        $ cat .dockerconfigjson | \
            jq --compact-output '.auths["<registry>:<port>/<namespace>/"] |= . + {"auth":"<token>"}' \1
            > new_dockerconfigjson
        1
        Replace <registry>:<port>/<namespace> with the private registry details and <token> with your authentication credentials.
      3. Update the global pull secret with the new file:

        $ oc set data secret/pull-secret -n openshift-config \
            --from-file=.dockerconfigjson=new_dockerconfigjson
    • To update an individual namespace, add a pull secret to the service account for the Operator that requires access in the target tenant namespace.

      1. Recreate the secret that you created for the openshift-marketplace in the tenant namespace:

        $ oc create secret generic <secret_name> \
            -n <tenant_namespace> \
            --from-file=.dockerconfigjson=<path/to/registry/credentials> \
            --type=kubernetes.io/dockerconfigjson
      2. Verify the name of the service account for the Operator by searching the tenant namespace:

        $ oc get sa -n <tenant_namespace> 1
        1
        If the Operator was installed in an individual namespace, search that namespace. If the Operator was installed for all namespaces, search the openshift-operators namespace.

        Example output

        NAME            SECRETS   AGE
        builder         2         6m1s
        default         2         6m1s
        deployer        2         6m1s
        etcd-operator   2         5m18s 1

        1
        Service account for an installed etcd Operator.
      3. Link the secret to the service account for the Operator:

        $ oc secrets link <operator_sa> \
            -n <tenant_namespace> \
             <secret_name> \
            --for=pull

Additional resources

4.9.7. Disabling the default OperatorHub catalog sources

Operator catalogs that source content provided by Red Hat and community projects are configured for OperatorHub by default during an OpenShift Container Platform installation. As a cluster administrator, you can disable the set of default catalogs.

Procedure

  • Disable the sources for the default catalogs by adding disableAllDefaultSources: true to the OperatorHub object:

    $ oc patch OperatorHub cluster --type json \
        -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
Tip

Alternatively, you can use the web console to manage catalog sources. From the AdministrationCluster SettingsConfigurationOperatorHub page, click the Sources tab, where you can create, update, delete, disable, and enable individual sources.

4.9.8. Removing custom catalogs

As a cluster administrator, you can remove custom Operator catalogs that have been previously added to your cluster by deleting the related catalog source.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. In the Administrator perspective of the web console, navigate to AdministrationCluster Settings.
  2. Click the Configuration tab, and then click OperatorHub.
  3. Click the Sources tab.
  4. Select the Options menu kebab for the catalog that you want to remove, and then click Delete CatalogSource.

4.10. Using Operator Lifecycle Manager on restricted networks

For OpenShift Container Platform clusters that are installed on restricted networks, also known as disconnected clusters, Operator Lifecycle Manager (OLM) by default cannot access the Red Hat-provided OperatorHub sources hosted on remote registries because those remote sources require full internet connectivity.

However, as a cluster administrator you can still enable your cluster to use OLM in a restricted network if you have a workstation that has full internet access. The workstation, which requires full internet access to pull the remote OperatorHub content, is used to prepare local mirrors of the remote sources, and push the content to a mirror registry.

The mirror registry can be located on a bastion host, which requires connectivity to both your workstation and the disconnected cluster, or a completely disconnected, or airgapped, host, which requires removable media to physically move the mirrored content to the disconnected environment.

This guide describes the following process that is required to enable OLM in restricted networks:

  • Disable the default remote OperatorHub sources for OLM.
  • Use a workstation with full internet access to create and push local mirrors of the OperatorHub content to a mirror registry.
  • Configure OLM to install and manage Operators from local sources on the mirror registry instead of the default remote sources.

After enabling OLM in a restricted network, you can continue to use your unrestricted workstation to keep your local OperatorHub sources updated as newer versions of Operators are released.

Important

While OLM can manage Operators from local sources, the ability for a given Operator to run successfully in a restricted network still depends on the Operator itself meeting the following criteria:

  • List any related images, or other container images that the Operator might require to perform their functions, in the relatedImages parameter of its ClusterServiceVersion (CSV) object.
  • Reference all specified images by a digest (SHA) and not by a tag.

You can search software on the Red Hat Ecosystem Catalog for a list of Red Hat Operators that support running in disconnected mode by filtering with the following selections:

Type

Containerized application

Deployment method

Operator

Infrastructure features

Disconnected

4.10.1. Prerequisites

  • Log in to your OpenShift Container Platform cluster as a user with cluster-admin privileges.
Note

If you are using OLM in a restricted network on IBM Z, you must have at least 12 GB allocated to the directory where you place your registry.

4.10.2. Disabling the default OperatorHub catalog sources

Operator catalogs that source content provided by Red Hat and community projects are configured for OperatorHub by default during an OpenShift Container Platform installation. In a restricted network environment, you must disable the default catalogs as a cluster administrator. You can then configure OperatorHub to use local catalog sources.

Procedure

  • Disable the sources for the default catalogs by adding disableAllDefaultSources: true to the OperatorHub object:

    $ oc patch OperatorHub cluster --type json \
        -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
Tip

Alternatively, you can use the web console to manage catalog sources. From the AdministrationCluster SettingsConfigurationOperatorHub page, click the Sources tab, where you can create, update, delete, disable, and enable individual sources.

4.10.3. Mirroring an Operator catalog

For instructions about mirroring Operator catalogs for use with disconnected clusters, see Installing → Mirroring images for a disconnected installation.

Important

As of OpenShift Container Platform 4.11, the default Red Hat-provided Operator catalog releases in the file-based catalog format. The default Red Hat-provided Operator catalogs for OpenShift Container Platform 4.6 through 4.10 released in the deprecated SQLite database format.

The opm subcommands, flags, and functionality related to the SQLite database format are also deprecated and will be removed in a future release. The features are still supported and must be used for catalogs that use the deprecated SQLite database format.

Many of the opm subcommands and flags for working with the SQLite database format, such as opm index prune, do not work with the file-based catalog format. For more information about working with file-based catalogs, see Operator Framework packaging format, Managing custom catalogs, and Mirroring images for a disconnected installation using the oc-mirror plugin.

4.10.4. Adding a catalog source to a cluster

Adding a catalog source to an OpenShift Container Platform cluster enables the discovery and installation of Operators for users. Cluster administrators can create a CatalogSource object that references an index image. OperatorHub uses catalog sources to populate the user interface.

Tip

Alternatively, you can use the web console to manage catalog sources. From the AdministrationCluster SettingsConfigurationOperatorHub page, click the Sources tab, where you can create, update, delete, disable, and enable individual sources.

Prerequisites

  • You built and pushed an index image to a registry.
  • You have access to the cluster as a user with the cluster-admin role.

Procedure

  1. Create a CatalogSource object that references your index image. If you used the oc adm catalog mirror command to mirror your catalog to a target registry, you can use the generated catalogSource.yaml file in your manifests directory as a starting point.

    1. Modify the following to your specifications and save it as a catalogSource.yaml file:

      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      metadata:
        name: my-operator-catalog 1
        namespace: openshift-marketplace 2
      spec:
        sourceType: grpc
        grpcPodConfig:
          securityContextConfig: <security_mode> 3
        image: <registry>/<namespace>/redhat-operator-index:v4.13 4
        displayName: My Operator Catalog
        publisher: <publisher_name> 5
        updateStrategy:
          registryPoll: 6
            interval: 30m
      1
      If you mirrored content to local files before uploading to a registry, remove any backslash (/) characters from the metadata.name field to avoid an "invalid resource name" error when you create the object.
      2
      If you want the catalog source to be available globally to users in all namespaces, specify the openshift-marketplace namespace. Otherwise, you can specify a different namespace for the catalog to be scoped and available only for that namespace.
      3
      Specify the value of legacy or restricted. If the field is not set, the default value is legacy. In a future OpenShift Container Platform release, it is planned that the default value will be restricted. If your catalog cannot run with restricted permissions, it is recommended that you manually set this field to legacy.
      4
      Specify your index image. If you specify a tag after the image name, for example :v4.13, the catalog source pod uses an image pull policy of Always, meaning the pod always pulls the image prior to starting the container. If you specify a digest, for example @sha256:<id>, the image pull policy is IfNotPresent, meaning the pod pulls the image only if it does not already exist on the node.
      5
      Specify your name or an organization name publishing the catalog.
      6
      Catalog sources can automatically check for new versions to keep up to date.
    2. Use the file to create the CatalogSource object:

      $ oc apply -f catalogSource.yaml
  2. Verify the following resources are created successfully.

    1. Check the pods:

      $ oc get pods -n openshift-marketplace

      Example output

      NAME                                    READY   STATUS    RESTARTS  AGE
      my-operator-catalog-6njx6               1/1     Running   0         28s
      marketplace-operator-d9f549946-96sgr    1/1     Running   0         26h

    2. Check the catalog source:

      $ oc get catalogsource -n openshift-marketplace

      Example output

      NAME                  DISPLAY               TYPE PUBLISHER  AGE
      my-operator-catalog   My Operator Catalog   grpc            5s

    3. Check the package manifest:

      $ oc get packagemanifest -n openshift-marketplace

      Example output

      NAME                          CATALOG               AGE
      jaeger-product                My Operator Catalog   93s

You can now install the Operators from the OperatorHub page on your OpenShift Container Platform web console.

4.11. Catalog source pod scheduling

When an Operator Lifecycle Manager (OLM) catalog source of source type grpc defines a spec.image, the Catalog Operator creates a pod that serves the defined image content. By default, this pod defines the following in its specification:

  • Only the kubernetes.io/os=linux node selector.
  • The default priority class name: system-cluster-critical.
  • No tolerations.

As an administrator, you can override these values by modifying fields in the CatalogSource object’s optional spec.grpcPodConfig section.

Important

The Marketplace Operator, openshift-marketplace, manages the default OperatorHub custom resource’s (CR). This CR manages CatalogSource objects. If you attempt to modify fields in the CatalogSource object’s spec.grpcPodConfig section, the Marketplace Operator automatically reverts these modifications.By default, if you modify fields in the spec.grpcPodConfig section of the CatalogSource object, the Marketplace Operator automatically reverts these changes.

To apply persistent changes to CatalogSource object, you must first disable a default CatalogSource object.

4.11.1. Disabling default CatalogSource objects at a local level

You can apply persistent changes to a CatalogSource object, such as catalog source pods, at a local level, by disabling a default CatalogSource object. Consider the default configuration in situations where the default CatalogSource object’s configuration does not meet your organization’s needs. By default, if you modify fields in the spec.grpcPodConfig section of the CatalogSource object, the Marketplace Operator automatically reverts these changes.

The Marketplace Operator, openshift-marketplace, manages the default custom resources (CRs) of the OperatorHub. The OperatorHub manages CatalogSource objects.

To apply persistent changes to CatalogSource object, you must first disable a default CatalogSource object.

Procedure

  • To disable all the default CatalogSource objects at a local level, enter the following command:

    $ oc patch operatorhub cluster -p '{"spec": {"disableAllDefaultSources": true}}' --type=merge
    Note

    You can also configure the default OperatorHub CR to either disable all CatalogSource objects or disable a specific object.

4.11.2. Overriding the node selector for catalog source pods

Prerequisites

  • A CatalogSource object of source type grpc with spec.image is defined.

Procedure

  • Edit the CatalogSource object and add or modify the spec.grpcPodConfig section to include the following:

      grpcPodConfig:
        nodeSelector:
          custom_label: <label>

    where <label> is the label for the node selector that you want catalog source pods to use for scheduling.

4.11.3. Overriding the priority class name for catalog source pods

Prerequisites

  • A CatalogSource object of source type grpc with spec.image is defined.

Procedure

  • Edit the CatalogSource object and add or modify the spec.grpcPodConfig section to include the following:

      grpcPodConfig:
        priorityClassName: <priority_class>

    where <priority_class> is one of the following:

    • One of the default priority classes provided by Kubernetes: system-cluster-critical or system-node-critical
    • An empty set ("") to assign the default priority
    • A pre-existing and custom defined priority class
Note

Previously, the only pod scheduling parameter that could be overriden was priorityClassName. This was done by adding the operatorframework.io/priorityclass annotation to the CatalogSource object. For example:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: example-catalog
  namespace: openshift-marketplace
  annotations:
    operatorframework.io/priorityclass: system-cluster-critical

If a CatalogSource object defines both the annotation and spec.grpcPodConfig.priorityClassName, the annotation takes precedence over the configuration parameter.

Additional resources

4.11.4. Overriding tolerations for catalog source pods

Prerequisites

  • A CatalogSource object of source type grpc with spec.image is defined.

Procedure

  • Edit the CatalogSource object and add or modify the spec.grpcPodConfig section to include the following:

      grpcPodConfig:
        tolerations:
          - key: "<key_name>"
            operator: "<operator_type>"
            value: "<value>"
            effect: "<effect>"

4.12. Managing platform Operators (Technology Preview)

A platform Operator is an OLM-based Operator that can be installed during or after an OpenShift Container Platform cluster’s Day 0 operations and participates in the cluster’s lifecycle. As a cluster administrator, you can manage platform Operators by using the PlatformOperator API.

Important

The platform Operator type is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

4.12.1. About platform Operators

Operator Lifecycle Manager (OLM) introduces a new type of Operator called platform Operators. A platform Operator is an OLM-based Operator that can be installed during or after an OpenShift Container Platform cluster’s Day 0 operations and participates in the cluster’s lifecycle. As a cluster administrator, you can use platform Operators to further customize your OpenShift Container Platform installation to meet your requirements and use cases.

Using the existing cluster capabilities feature in OpenShift Container Platform, cluster administrators can already disable a subset of Cluster Version Operator-based (CVO) components considered non-essential to the initial payload prior to cluster installation. Platform Operators iterate on this model by providing additional customization options. Through the platform Operator mechanism, which relies on resources from the RukPak component, OLM-based Operators can now be installed at cluster installation time and can block cluster rollout if the Operator fails to install successfully.

In OpenShift Container Platform 4.12, this Technology Preview release focuses on the basic platform Operator mechanism and builds a foundation for expanding the concept in upcoming releases. You can use the cluster-wide PlatformOperator API to configure Operators before or after cluster creation on clusters that have enabled the TechPreviewNoUpgrades feature set.

4.12.1.1. Technology Preview restrictions for platform Operators

During the Technology Preview release of the platform Operators feature in OpenShift Container Platform 4.12, the following restrictions determine whether an Operator can be installed through the platform Operators mechanism:

  • Kubernetes manifests must be packaged using the Operator Lifecycle Manager (OLM) registry+v1 bundle format.
  • The Operator cannot declare package or group/version/kind (GVK) dependencies.
  • The Operator cannot specify cluster service version (CSV) install modes other than AllNamespaces
  • The Operator cannot specify any Webhook or APIService definitions.
  • All package bundles must be in the redhat-operators catalog source.

After considering these restrictions, the following Operators can be successfully installed:

Table 4.2. OLM-based Operators installable as platform Operators

3scale-operator

amq-broker-rhel8

amq-online

amq-streams

ansible-cloud-addons-operator

apicast-operator

container-security-operator

eap

file-integrity-operator

gatekeeper-operator-product

integration-operator

jws-operator

kiali-ossm

node-healthcheck-operator

odf-csi-addons-operator

odr-hub-operator

openshift-custom-metrics-autoscaler-operator

openshift-gitops-operator

openshift-pipelines-operator-rh

quay-operator

red-hat-camel-k

rhpam-kogito-operator

service-registry-operator

servicemeshoperator

skupper-operator

 
Note

The following features are not available during this Technology Preview release:

  • Automatically upgrading platform Operator packages after cluster rollout
  • Extending the platform Operator mechanism to support any optional, CVO-based components

4.12.2. Prerequisites

  • Access to an OpenShift Container Platform cluster using an account with cluster-admin permissions.
  • The TechPreviewNoUpgrades feature set enabled on the cluster.

    Warning

    Enabling the TechPreviewNoUpgrade feature set cannot be undone and prevents minor version updates. These feature sets are not recommended on production clusters.

  • Only the redhat-operators catalog source enabled on the cluster. This is a restriction during the Technology Preview release.
  • The oc command installed on your workstation.

4.12.3. Installing platform Operators during cluster creation

As a cluster administrator, you can install platform Operators by providing FeatureGate and PlatformOperator manifests during cluster creation.

Procedure

  1. Choose a platform Operator from the supported set of OLM-based Operators. For the list of this set and details on current limitations, see "Technology Preview restrictions for platform Operators".
  2. Select a cluster installation method and follow the instructions through creating an install-config.yaml file. For more details on preparing for a cluster installation, see "Selecting a cluster installation method and preparing it for users".
  3. After you have created the install-config.yaml file and completed any modifications to it, change to the directory that contains the installation program and create the manifests:

    $ ./openshift-install create manifests --dir <installation_directory> 1
    1
    For <installation_directory>, specify the name of the directory that contains the install-config.yaml file for your cluster.
  4. Create a FeatureGate object YAML file in the <installation_directory>/manifests/ directory that enables the TechPreviewNoUpgrade feature set, for example a feature-gate.yaml file:

    Example feature-gate.yaml file

    apiVersion: config.openshift.io/v1
    kind: FeatureGate
    metadata:
      annotations:
        include.release.openshift.io/self-managed-high-availability: "true"
        include.release.openshift.io/single-node-developer: "true"
        release.openshift.io/create-only: "true"
      name: cluster
    spec:
      featureSet: TechPreviewNoUpgrade 1

    1
    Enable the TechPreviewNoUpgrade feature set.
  5. Create a PlatformOperator object YAML file for your chosen platform Operator in the <installation_directory>/manifests/ directory, for example a service-mesh-po.yaml file for the Red Hat OpenShift Service Mesh Operator:

    Example service-mesh-po.yaml file

    apiVersion: platform.openshift.io/v1alpha1
    kind: PlatformOperator
    metadata:
      name: service-mesh-po
    spec:
      package:
        name: servicemeshoperator

  6. When you are ready to complete the cluster install, refer to your chosen installation method and continue through running the openshift-install create cluster command.

    During cluster creation, your provided manifests are used to enable the TechPreviewNoUpgrade feature set and install your chosen platform Operator.

    Important

    Failure of the platform Operator to successfully install will block the cluster installation process.

Verification

  1. Check the status of the service-mesh-po platform Operator by running the following command:

    $ oc get platformoperator service-mesh-po -o yaml

    Example output

    ...
    status:
      activeBundleDeployment:
        name: service-mesh-po
      conditions:
      - lastTransitionTime: "2022-10-24T17:24:40Z"
        message: Successfully applied the service-mesh-po BundleDeployment resource
        reason: InstallSuccessful
        status: "True" 1
        type: Installed

    1
    Wait until the Installed status condition reports True.
  2. Verify that the platform-operators-aggregated cluster Operator is reporting an Available=True status:

    $ oc get clusteroperator platform-operators-aggregated -o yaml

    Example output

    ...
    status:
      conditions:
      - lastTransitionTime: "2022-10-24T17:43:26Z"
        message: All platform operators are in a successful state
        reason: AsExpected
        status: "False"
        type: Progressing
      - lastTransitionTime: "2022-10-24T17:43:26Z"
        status: "False"
        type: Degraded
      - lastTransitionTime: "2022-10-24T17:43:26Z"
        message: All platform operators are in a successful state
        reason: AsExpected
        status: "True"
        type: Available

4.12.4. Installing platform Operators after cluster creation

As a cluster administrator, you can install platform Operators after cluster creation on clusters that have enabled the TechPreviewNoUpgrades feature set by using the cluster-wide PlatformOperator API.

Procedure

  1. Choose a platform Operator from the supported set of OLM-based Operators. For the list of this set and details on current limitations, see "Technology Preview restrictions for platform Operators".
  2. Create a PlatformOperator object YAML file for your chosen platform Operator, for example a service-mesh-po.yaml file for the Red Hat OpenShift Service Mesh Operator:

    Example sevice-mesh-po.yaml file

    apiVersion: platform.openshift.io/v1alpha1
    kind: PlatformOperator
    metadata:
      name: service-mesh-po
    spec:
      package:
        name: servicemeshoperator

  3. Create the PlatformOperator object by running the following command:

    $ oc apply -f service-mesh-po.yaml
    Note

    If your cluster does not have the TechPreviewNoUpgrades feature set enabled, the object creation fails with the following message:

    error: resource mapping not found for name: "service-mesh-po" namespace: "" from "service-mesh-po.yaml": no matches for kind "PlatformOperator" in version "platform.openshift.io/v1alpha1"
    ensure CRDs are installed first

Verification

  1. Check the status of the service-mesh-po platform Operator by running the following command:

    $ oc get platformoperator service-mesh-po -o yaml

    Example output

    ...
    status:
      activeBundleDeployment:
        name: service-mesh-po
      conditions:
      - lastTransitionTime: "2022-10-24T17:24:40Z"
        message: Successfully applied the service-mesh-po BundleDeployment resource
        reason: InstallSuccessful
        status: "True" 1
        type: Installed

    1
    Wait until the Installed status condition reports True.
  2. Verify that the platform-operators-aggregated cluster Operator is reporting an Available=True status:

    $ oc get clusteroperator platform-operators-aggregated -o yaml

    Example output

    ...
    status:
      conditions:
      - lastTransitionTime: "2022-10-24T17:43:26Z"
        message: All platform operators are in a successful state
        reason: AsExpected
        status: "False"
        type: Progressing
      - lastTransitionTime: "2022-10-24T17:43:26Z"
        status: "False"
        type: Degraded
      - lastTransitionTime: "2022-10-24T17:43:26Z"
        message: All platform operators are in a successful state
        reason: AsExpected
        status: "True"
        type: Available

4.12.5. Deleting platform Operators

As a cluster administrator, you can delete existing platform Operators. Operator Lifecycle Manager (OLM) performs a cascading deletion. First, OLM removes the bundle deployment for the platform Operator, which then deletes any objects referenced in the registry+v1 type bundle.

Note

The platform Operator manager and bundle deployment provisioner only manage objects that are referenced in the bundle, but not objects subsequently deployed by any bundle workloads themselves. For example, if a bundle workload creates a namespace and the Operator is not configured to clean it up before the Operator is removed, it is outside of the scope of OLM to remove the namespace during platform Operator deletion.

Procedure

  1. Get a list of installed platform Operators and find the name for the Operator you want to delete:

    $ oc get platformoperator
  2. Delete the PlatformOperator resource for the chosen Operator, for example, for the Quay Operator:

    $ oc delete platformoperator quay-operator

    Example output

    platformoperator.platform.openshift.io "quay-operator" deleted

Verification

  1. Verify the namespace for the platform Operator is eventually deleted, for example, for the Quay Operator:

    $ oc get ns quay-operator-system

    Example output

    Error from server (NotFound): namespaces "quay-operator-system" not found

  2. Verify the platform-operators-aggregated cluster Operator continues to report an Available=True status:

    $ oc get co platform-operators-aggregated

    Example output

    NAME                            VERSION     AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
    platform-operators-aggregated   4.13.0-0    True        False         False      70s

4.13. Troubleshooting Operator issues

If you experience Operator issues, verify Operator subscription status. Check Operator pod health across the cluster and gather Operator logs for diagnosis.

4.13.1. Operator subscription condition types

Subscriptions can report the following condition types:

Table 4.3. Subscription condition types

ConditionDescription

CatalogSourcesUnhealthy

Some or all of the catalog sources to be used in resolution are unhealthy.

InstallPlanMissing

An install plan for a subscription is missing.

InstallPlanPending

An install plan for a subscription is pending installation.

InstallPlanFailed

An install plan for a subscription has failed.

ResolutionFailed

The dependency resolution for a subscription has failed.

Note

Default OpenShift Container Platform cluster Operators are managed by the Cluster Version Operator (CVO) and they do not have a Subscription object. Application Operators are managed by Operator Lifecycle Manager (OLM) and they have a Subscription object.

Additional resources

4.13.2. Viewing Operator subscription status by using the CLI

You can view Operator subscription status by using the CLI.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. List Operator subscriptions:

    $ oc get subs -n <operator_namespace>
  2. Use the oc describe command to inspect a Subscription resource:

    $ oc describe sub <subscription_name> -n <operator_namespace>
  3. In the command output, find the Conditions section for the status of Operator subscription condition types. In the following example, the CatalogSourcesUnhealthy condition type has a status of false because all available catalog sources are healthy:

    Example output

    Name:         cluster-logging
    Namespace:    openshift-logging
    Labels:       operators.coreos.com/cluster-logging.openshift-logging=
    Annotations:  <none>
    API Version:  operators.coreos.com/v1alpha1
    Kind:         Subscription
    # ...
    Conditions:
       Last Transition Time:  2019-07-29T13:42:57Z
       Message:               all available catalogsources are healthy
       Reason:                AllCatalogSourcesHealthy
       Status:                False
       Type:                  CatalogSourcesUnhealthy
    # ...

Note

Default OpenShift Container Platform cluster Operators are managed by the Cluster Version Operator (CVO) and they do not have a Subscription object. Application Operators are managed by Operator Lifecycle Manager (OLM) and they have a Subscription object.

4.13.3. Viewing Operator catalog source status by using the CLI

You can view the status of an Operator catalog source by using the CLI.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. List the catalog sources in a namespace. For example, you can check the openshift-marketplace namespace, which is used for cluster-wide catalog sources:

    $ oc get catalogsources -n openshift-marketplace

    Example output

    NAME                  DISPLAY               TYPE   PUBLISHER   AGE
    certified-operators   Certified Operators   grpc   Red Hat     55m
    community-operators   Community Operators   grpc   Red Hat     55m
    example-catalog       Example Catalog       grpc   Example Org 2m25s
    redhat-marketplace    Red Hat Marketplace   grpc   Red Hat     55m
    redhat-operators      Red Hat Operators     grpc   Red Hat     55m

  2. Use the oc describe command to get more details and status about a catalog source:

    $ oc describe catalogsource example-catalog -n openshift-marketplace

    Example output

    Name:         example-catalog
    Namespace:    openshift-marketplace
    Labels:       <none>
    Annotations:  operatorframework.io/managed-by: marketplace-operator
                  target.workload.openshift.io/management: {"effect": "PreferredDuringScheduling"}
    API Version:  operators.coreos.com/v1alpha1
    Kind:         CatalogSource
    # ...
    Status:
      Connection State:
        Address:              example-catalog.openshift-marketplace.svc:50051
        Last Connect:         2021-09-09T17:07:35Z
        Last Observed State:  TRANSIENT_FAILURE
      Registry Service:
        Created At:         2021-09-09T17:05:45Z
        Port:               50051
        Protocol:           grpc
        Service Name:       example-catalog
        Service Namespace:  openshift-marketplace
    # ...

    In the preceding example output, the last observed state is TRANSIENT_FAILURE. This state indicates that there is a problem establishing a connection for the catalog source.

  3. List the pods in the namespace where your catalog source was created:

    $ oc get pods -n openshift-marketplace

    Example output

    NAME                                    READY   STATUS             RESTARTS   AGE
    certified-operators-cv9nn               1/1     Running            0          36m
    community-operators-6v8lp               1/1     Running            0          36m
    marketplace-operator-86bfc75f9b-jkgbc   1/1     Running            0          42m
    example-catalog-bwt8z                   0/1     ImagePullBackOff   0          3m55s
    redhat-marketplace-57p8c                1/1     Running            0          36m
    redhat-operators-smxx8                  1/1     Running            0          36m

    When a catalog source is created in a namespace, a pod for the catalog source is created in that namespace. In the preceding example output, the status for the example-catalog-bwt8z pod is ImagePullBackOff. This status indicates that there is an issue pulling the catalog source’s index image.

  4. Use the oc describe command to inspect a pod for more detailed information:

    $ oc describe pod example-catalog-bwt8z -n openshift-marketplace

    Example output

    Name:         example-catalog-bwt8z
    Namespace:    openshift-marketplace
    Priority:     0
    Node:         ci-ln-jyryyg2-f76d1-ggdbq-worker-b-vsxjd/10.0.128.2
    ...
    Events:
      Type     Reason          Age                From               Message
      ----     ------          ----               ----               -------
      Normal   Scheduled       48s                default-scheduler  Successfully assigned openshift-marketplace/example-catalog-bwt8z to ci-ln-jyryyf2-f76d1-fgdbq-worker-b-vsxjd
      Normal   AddedInterface  47s                multus             Add eth0 [10.131.0.40/23] from openshift-sdn
      Normal   BackOff         20s (x2 over 46s)  kubelet            Back-off pulling image "quay.io/example-org/example-catalog:v1"
      Warning  Failed          20s (x2 over 46s)  kubelet            Error: ImagePullBackOff
      Normal   Pulling         8s (x3 over 47s)   kubelet            Pulling image "quay.io/example-org/example-catalog:v1"
      Warning  Failed          8s (x3 over 47s)   kubelet            Failed to pull image "quay.io/example-org/example-catalog:v1": rpc error: code = Unknown desc = reading manifest v1 in quay.io/example-org/example-catalog: unauthorized: access to the requested resource is not authorized
      Warning  Failed          8s (x3 over 47s)   kubelet            Error: ErrImagePull

    In the preceding example output, the error messages indicate that the catalog source’s index image is failing to pull successfully because of an authorization issue. For example, the index image might be stored in a registry that requires login credentials.

4.13.4. Querying Operator pod status

You can list Operator pods within a cluster and their status. You can also collect a detailed Operator pod summary.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • Your API service is still functional.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. List Operators running in the cluster. The output includes Operator version, availability, and up-time information:

    $ oc get clusteroperators
  2. List Operator pods running in the Operator’s namespace, plus pod status, restarts, and age:

    $ oc get pod -n <operator_namespace>
  3. Output a detailed Operator pod summary:

    $ oc describe pod <operator_pod_name> -n <operator_namespace>
  4. If an Operator issue is node-specific, query Operator container status on that node.

    1. Start a debug pod for the node:

      $ oc debug node/my-node
    2. Set /host as the root directory within the debug shell. The debug pod mounts the host’s root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host’s executable paths:

      # chroot /host
      Note

      OpenShift Container Platform 4.13 cluster nodes running Red Hat Enterprise Linux CoreOS (RHCOS) are immutable and rely on Operators to apply cluster changes. Accessing cluster nodes by using SSH is not recommended. However, if the OpenShift Container Platform API is not available, or the kubelet is not properly functioning on the target node, oc operations will be impacted. In such situations, it is possible to access nodes using ssh core@<node>.<cluster_name>.<base_domain> instead.

    3. List details about the node’s containers, including state and associated pod IDs:

      # crictl ps
    4. List information about a specific Operator container on the node. The following example lists information about the network-operator container:

      # crictl ps --name network-operator
    5. Exit from the debug shell.

4.13.5. Gathering Operator logs

If you experience Operator issues, you can gather detailed diagnostic information from Operator pod logs.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • Your API service is still functional.
  • You have installed the OpenShift CLI (oc).
  • You have the fully qualified domain names of the control plane or control plane machines.

Procedure

  1. List the Operator pods that are running in the Operator’s namespace, plus the pod status, restarts, and age:

    $ oc get pods -n <operator_namespace>
  2. Review logs for an Operator pod:

    $ oc logs pod/<pod_name> -n <operator_namespace>

    If an Operator pod has multiple containers, the preceding command will produce an error that includes the name of each container. Query logs from an individual container:

    $ oc logs pod/<operator_pod_name> -c <container_name> -n <operator_namespace>
  3. If the API is not functional, review Operator pod and container logs on each control plane node by using SSH instead. Replace <master-node>.<cluster_name>.<base_domain> with appropriate values.

    1. List pods on each control plane node:

      $ ssh core@<master-node>.<cluster_name>.<base_domain> sudo crictl pods
    2. For any Operator pods not showing a Ready status, inspect the pod’s status in detail. Replace <operator_pod_id> with the Operator pod’s ID listed in the output of the preceding command:

      $ ssh core@<master-node>.<cluster_name>.<base_domain> sudo crictl inspectp <operator_pod_id>
    3. List containers related to an Operator pod:

      $ ssh core@<master-node>.<cluster_name>.<base_domain> sudo crictl ps --pod=<operator_pod_id>
    4. For any Operator container not showing a Ready status, inspect the container’s status in detail. Replace <container_id> with a container ID listed in the output of the preceding command:

      $ ssh core@<master-node>.<cluster_name>.<base_domain> sudo crictl inspect <container_id>
    5. Review the logs for any Operator containers not showing a Ready status. Replace <container_id> with a container ID listed in the output of the preceding command:

      $ ssh core@<master-node>.<cluster_name>.<base_domain> sudo crictl logs -f <container_id>
      Note

      OpenShift Container Platform 4.13 cluster nodes running Red Hat Enterprise Linux CoreOS (RHCOS) are immutable and rely on Operators to apply cluster changes. Accessing cluster nodes by using SSH is not recommended. Before attempting to collect diagnostic data over SSH, review whether the data collected by running oc adm must gather and other oc commands is sufficient instead. However, if the OpenShift Container Platform API is not available, or the kubelet is not properly functioning on the target node, oc operations will be impacted. In such situations, it is possible to access nodes using ssh core@<node>.<cluster_name>.<base_domain>.

4.13.6. Disabling the Machine Config Operator from automatically rebooting

When configuration changes are made by the Machine Config Operator (MCO), Red Hat Enterprise Linux CoreOS (RHCOS) must reboot for the changes to take effect. Whether the configuration change is automatic or manual, an RHCOS node reboots automatically unless it is paused.

Note

The following modifications do not trigger a node reboot:

  • When the MCO detects any of the following changes, it applies the update without draining or rebooting the node:

    • Changes to the SSH key in the spec.config.passwd.users.sshAuthorizedKeys parameter of a machine config.
    • Changes to the global pull secret or pull secret in the openshift-config namespace.
    • Automatic rotation of the /etc/kubernetes/kubelet-ca.crt certificate authority (CA) by the Kubernetes API Server Operator.
  • When the MCO detects changes to the /etc/containers/registries.conf file, such as adding or editing an ImageDigestMirrorSet, ImageTagMirrorSet, or ImageContentSourcePolicy object, it drains the corresponding nodes, applies the changes, and uncordons the nodes. The node drain does not happen for the following changes:

    • The addition of a registry with the pull-from-mirror = "digest-only" parameter set for each mirror.
    • The addition of a mirror with the pull-from-mirror = "digest-only" parameter set in a registry.
    • The addition of items to the unqualified-search-registries list.

To avoid unwanted disruptions, you can modify the machine config pool (MCP) to prevent automatic rebooting after the Operator makes changes to the machine config.

Note

Pausing an MCP prevents the MCO from applying any configuration changes on the associated nodes. Pausing an MCP also prevents any automatically rotated certificates from being pushed to the associated nodes, including the automatic rotation of the kube-apiserver-to-kubelet-signer CA certificate.

If the MCP is paused when the kube-apiserver-to-kubelet-signer CA certificate expires, and the MCO attempts to renew the certificate automatically, the MCO cannot push the newly rotated certificates to those nodes. This causes the cluster to become degraded and causes failure in multiple oc commands, including oc debug, oc logs, oc exec, and oc attach. You receive alerts in the Alerting UI of the OpenShift Container Platform web console if an MCP is paused when the certificates are rotated.

Pausing an MCP should be done with careful consideration about the kube-apiserver-to-kubelet-signer CA certificate expiration and for short periods of time only.

New CA certificates are generated at 292 days from the installation date and removed at 365 days from that date. To determine the next automatic CA certificate rotation, see the Understand CA cert auto renewal in Red Hat OpenShift 4.

4.13.6.1. Disabling the Machine Config Operator from automatically rebooting by using the console

To avoid unwanted disruptions from changes made by the Machine Config Operator (MCO), you can use the OpenShift Container Platform web console to modify the machine config pool (MCP) to prevent the MCO from making any changes to nodes in that pool. This prevents any reboots that would normally be part of the MCO update process.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

Procedure

To pause or unpause automatic MCO update rebooting:

  • Pause the autoreboot process:

    1. Log in to the OpenShift Container Platform web console as a user with the cluster-admin role.
    2. Click ComputeMachineConfigPools.
    3. On the MachineConfigPools page, click either master or worker, depending upon which nodes you want to pause rebooting for.
    4. On the master or worker page, click YAML.
    5. In the YAML, update the spec.paused field to true.

      Sample MachineConfigPool object

      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfigPool
      # ...
      spec:
      # ...
        paused: true 1
      # ...

      1
      Update the spec.paused field to true to pause rebooting.
    6. To verify that the MCP is paused, return to the MachineConfigPools page.

      On the MachineConfigPools page, the Paused column reports True for the MCP you modified.

      If the MCP has pending changes while paused, the Updated column is False and Updating is False. When Updated is True and Updating is False, there are no pending changes.

      Important

      If there are pending changes (where both the Updated and Updating columns are False), it is recommended to schedule a maintenance window for a reboot as early as possible. Use the following steps for unpausing the autoreboot process to apply the changes that were queued since the last reboot.

  • Unpause the autoreboot process:

    1. Log in to the OpenShift Container Platform web console as a user with the cluster-admin role.
    2. Click ComputeMachineConfigPools.
    3. On the MachineConfigPools page, click either master or worker, depending upon which nodes you want to pause rebooting for.
    4. On the master or worker page, click YAML.
    5. In the YAML, update the spec.paused field to false.

      Sample MachineConfigPool object

      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfigPool
      # ...
      spec:
      # ...
        paused: false 1
      # ...

      1
      Update the spec.paused field to false to allow rebooting.
      Note

      By unpausing an MCP, the MCO applies all paused changes reboots Red Hat Enterprise Linux CoreOS (RHCOS) as needed.

    6. To verify that the MCP is paused, return to the MachineConfigPools page.

      On the MachineConfigPools page, the Paused column reports False for the MCP you modified.

      If the MCP is applying any pending changes, the Updated column is False and the Updating column is True. When Updated is True and Updating is False, there are no further changes being made.

4.13.6.2. Disabling the Machine Config Operator from automatically rebooting by using the CLI

To avoid unwanted disruptions from changes made by the Machine Config Operator (MCO), you can modify the machine config pool (MCP) using the OpenShift CLI (oc) to prevent the MCO from making any changes to nodes in that pool. This prevents any reboots that would normally be part of the MCO update process.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • You have installed the OpenShift CLI (oc).

Procedure

To pause or unpause automatic MCO update rebooting:

  • Pause the autoreboot process:

    1. Update the MachineConfigPool custom resource to set the spec.paused field to true.

      Control plane (master) nodes

      $ oc patch --type=merge --patch='{"spec":{"paused":true}}' machineconfigpool/master

      Worker nodes

      $ oc patch --type=merge --patch='{"spec":{"paused":true}}' machineconfigpool/worker

    2. Verify that the MCP is paused:

      Control plane (master) nodes

      $ oc get machineconfigpool/master --template='{{.spec.paused}}'

      Worker nodes

      $ oc get machineconfigpool/worker --template='{{.spec.paused}}'

      Example output

      true

      The spec.paused field is true and the MCP is paused.

    3. Determine if the MCP has pending changes:

      # oc get machineconfigpool

      Example output

      NAME     CONFIG                                             UPDATED   UPDATING
      master   rendered-master-33cf0a1254318755d7b48002c597bf91   True      False
      worker   rendered-worker-e405a5bdb0db1295acea08bcca33fa60   False     False

      If the UPDATED column is False and UPDATING is False, there are pending changes. When UPDATED is True and UPDATING is False, there are no pending changes. In the previous example, the worker node has pending changes. The control plane node does not have any pending changes.

      Important

      If there are pending changes (where both the Updated and Updating columns are False), it is recommended to schedule a maintenance window for a reboot as early as possible. Use the following steps for unpausing the autoreboot process to apply the changes that were queued since the last reboot.

  • Unpause the autoreboot process:

    1. Update the MachineConfigPool custom resource to set the spec.paused field to false.

      Control plane (master) nodes

      $ oc patch --type=merge --patch='{"spec":{"paused":false}}' machineconfigpool/master

      Worker nodes

      $ oc patch --type=merge --patch='{"spec":{"paused":false}}' machineconfigpool/worker

      Note

      By unpausing an MCP, the MCO applies all paused changes and reboots Red Hat Enterprise Linux CoreOS (RHCOS) as needed.

    2. Verify that the MCP is unpaused:

      Control plane (master) nodes

      $ oc get machineconfigpool/master --template='{{.spec.paused}}'

      Worker nodes

      $ oc get machineconfigpool/worker --template='{{.spec.paused}}'

      Example output

      false

      The spec.paused field is false and the MCP is unpaused.

    3. Determine if the MCP has pending changes:

      $ oc get machineconfigpool

      Example output

      NAME     CONFIG                                   UPDATED  UPDATING
      master   rendered-master-546383f80705bd5aeaba93   True     False
      worker   rendered-worker-b4c51bb33ccaae6fc4a6a5   False    True

      If the MCP is applying any pending changes, the UPDATED column is False and the UPDATING column is True. When UPDATED is True and UPDATING is False, there are no further changes being made. In the previous example, the MCO is updating the worker node.

4.13.7. Refreshing failing subscriptions

In Operator Lifecycle Manager (OLM), if you subscribe to an Operator that references images that are not accessible on your network, you can find jobs in the openshift-marketplace namespace that are failing with the following errors:

Example output

ImagePullBackOff for
Back-off pulling image "example.com/openshift4/ose-elasticsearch-operator-bundle@sha256:6d2587129c846ec28d384540322b40b05833e7e00b25cca584e004af9a1d292e"

Example output

rpc error: code = Unknown desc = error pinging docker registry example.com: Get "https://example.com/v2/": dial tcp: lookup example.com on 10.0.0.1:53: no such host

As a result, the subscription is stuck in this failing state and the Operator is unable to install or upgrade.

You can refresh a failing subscription by deleting the subscription, cluster service version (CSV), and other related objects. After recreating the subscription, OLM then reinstalls the correct version of the Operator.

Prerequisites

  • You have a failing subscription that is unable to pull an inaccessible bundle image.
  • You have confirmed that the correct bundle image is accessible.

Procedure

  1. Get the names of the Subscription and ClusterServiceVersion objects from the namespace where the Operator is installed:

    $ oc get sub,csv -n <namespace>

    Example output

    NAME                                                       PACKAGE                  SOURCE             CHANNEL
    subscription.operators.coreos.com/elasticsearch-operator   elasticsearch-operator   redhat-operators   5.0
    
    NAME                                                                         DISPLAY                            VERSION    REPLACES   PHASE
    clusterserviceversion.operators.coreos.com/elasticsearch-operator.5.0.0-65   OpenShift Elasticsearch Operator   5.0.0-65              Succeeded

  2. Delete the subscription:

    $ oc delete subscription <subscription_name> -n <namespace>
  3. Delete the cluster service version:

    $ oc delete csv <csv_name> -n <namespace>
  4. Get the names of any failing jobs and related config maps in the openshift-marketplace namespace:

    $ oc get job,configmap -n openshift-marketplace

    Example output

    NAME                                                                        COMPLETIONS   DURATION   AGE
    job.batch/1de9443b6324e629ddf31fed0a853a121275806170e34c926d69e53a7fcbccb   1/1           26s        9m30s
    
    NAME                                                                        DATA   AGE
    configmap/1de9443b6324e629ddf31fed0a853a121275806170e34c926d69e53a7fcbccb   3      9m30s

  5. Delete the job:

    $ oc delete job <job_name> -n openshift-marketplace

    This ensures pods that try to pull the inaccessible image are not recreated.

  6. Delete the config map:

    $ oc delete configmap <configmap_name> -n openshift-marketplace
  7. Reinstall the Operator using OperatorHub in the web console.

Verification

  • Check that the Operator has been reinstalled successfully:

    $ oc get sub,csv,installplan -n <namespace>

4.13.8. Reinstalling Operators after failed uninstallation

You must successfully and completely uninstall an Operator prior to attempting to reinstall the same Operator. Failure to fully uninstall the Operator properly can leave resources, such as a project or namespace, stuck in a "Terminating" state and cause "error resolving resource" messages. For example:

Example Project resource description

...
    message: 'Failed to delete all resource types, 1 remaining: Internal error occurred:
      error resolving resource'
...

These types of issues can prevent an Operator from being reinstalled successfully.

Warning

Forced deletion of a namespace is not likely to resolve "Terminating" state issues and can lead to unstable or unpredictable cluster behavior, so it is better to try to find related resources that might be preventing the namespace from being deleted. For more information, see the Red Hat Knowledgebase Solution #4165791, paying careful attention to the cautions and warnings.

The following procedure shows how to troubleshoot when an Operator cannot be reinstalled because an existing custom resource definition (CRD) from a previous installation of the Operator is preventing a related namespace from deleting successfully.

Procedure

  1. Check if there are any namespaces related to the Operator that are stuck in "Terminating" state:

    $ oc get namespaces

    Example output

    operator-ns-1                                       Terminating

  2. Check if there are any CRDs related to the Operator that are still present after the failed uninstallation:

    $ oc get crds
    Note

    CRDs are global cluster definitions; the actual custom resource (CR) instances related to the CRDs could be in other namespaces or be global cluster instances.

  3. If there are any CRDs that you know were provided or managed by the Operator and that should have been deleted after uninstallation, delete the CRD:

    $ oc delete crd <crd_name>
  4. Check if there are any remaining CR instances related to the Operator that are still present after uninstallation, and if so, delete the CRs:

    1. The type of CRs to search for can be difficult to determine after uninstallation and can require knowing what CRDs the Operator manages. For example, if you are troubleshooting an uninstallation of the etcd Operator, which provides the EtcdCluster CRD, you can search for remaining EtcdCluster CRs in a namespace:

      $ oc get EtcdCluster -n <namespace_name>

      Alternatively, you can search across all namespaces:

      $ oc get EtcdCluster --all-namespaces
    2. If there are any remaining CRs that should be removed, delete the instances:

      $ oc delete <cr_name> <cr_instance_name> -n <namespace_name>
  5. Check that the namespace deletion has successfully resolved:

    $ oc get namespace <namespace_name>
    Important

    If the namespace or other Operator resources are still not uninstalled cleanly, contact Red Hat Support.

  6. Reinstall the Operator using OperatorHub in the web console.

Verification

  • Check that the Operator has been reinstalled successfully:

    $ oc get sub,csv,installplan -n <namespace>