Cgroup v2 on Red Hat OpenShift Container Platform: general information

Solution Verified - Updated -

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Cgroup v2

Issue

  • What are the advantages of cgroup v2 over v1?
  • Which cgroup versions are supported for every RHOCP release?
    • What are the deprecation and removal dates of cgroup v1?
    • If the nodes of a cluster are running cgroup v1 and an upgrade to a RHOCP version that only supports cgroup v2 is attempted, what are the consequences? Is the upgrade blocked?
  • Can the cgroup version in an existing cluster be upgraded or downgraded?
  • Is it supported to mix nodes with cgroup v1 and v2 in the same cluster?

Resolution

Refer to "cgroup v2 goes GA in OpenShift 4.13" for the improvements introduced by cgroup v2 and additional information about it.

Note: all the nodes in a cluster must run the same cgroup version, meaning hybrid configurations (cgroup v1 nodes mixed with cgroup v2 nodes) are not supported.

Application workloads perspective

Before changing cgroup to v2 (which is required for upgrading to OpenShift 4.19 and later versions), custom application workloads must be adapted to work with cgroup v2.
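For many workloads, the adaptation comes down to where resource limits are read from. The following is a minimal sketch, assuming the standard kernel file layout (memory.max in the unified cgroup v2 hierarchy, memory/memory.limit_in_bytes under cgroup v1). The function name container_memory_limit is hypothetical and only illustrates the kind of check a custom entrypoint script might need.

```shell
# Hypothetical helper: print the memory limit visible at a cgroup mount point.
# $1 is the mount point to inspect (defaults to /sys/fs/cgroup).
container_memory_limit() {
  root="${1:-/sys/fs/cgroup}"
  if [ -f "${root}/cgroup.controllers" ]; then
    # cgroup v2: single unified hierarchy; the value is "max" when unlimited.
    cat "${root}/memory.max"
  elif [ -f "${root}/memory/memory.limit_in_bytes" ]; then
    # cgroup v1: one directory per controller.
    cat "${root}/memory/memory.limit_in_bytes"
  else
    echo "no memory limit file found under ${root}" >&2
    return 1
  fi
}
```

Called without arguments inside a container, the function inspects /sys/fs/cgroup. Scripts that hardcode only the cgroup v1 path are the ones that typically need to be changed.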

Red Hat Middleware software and OpenJDK images

Refer to "What Red Hat Middleware software is cgroups v2 compatible" for information about all Red Hat Middleware products. Specifically for EAP 7 containers, refer to "EAP 7 images cgroup version".

The newer OpenJDK images provided by Red Hat include cgroup version detection. For details, refer to "cgroup v2 in OpenJDK container in RHOCP 4".

OpenShift Container Platform 4.19 and newer versions

Starting with OpenShift 4.19, cgroup v1 is removed and no longer supported, and a cluster cannot be switched back to cgroup v1.

All the nodes in a cluster must run cgroup v2 before being upgraded to OpenShift 4.19 or a later version. Otherwise the upgrade will be blocked with the following message:

clusteroperator/machine-config is not upgradeable because Cluster is using deprecated cgroup v1, which is removed in 4.19.  Please update the ‘cgroupMode’ in the ‘cluster’ object of nodes.config.openshift.io resource type to ‘v2’. This can be changed back to ‘v1’ while on 4.18, but must be ‘v2’ before you update to 4.19.  Once updated to 4.19, cgroup v1 is no longer an option. Please refer to https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html-single/nodes/index#nodes-clusters-cgroups-2_nodes-cluster-cgroups-2
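Whether this condition is currently reported can be checked before starting the upgrade. The following is a minimal sketch, assuming the message is exposed through the standard Upgradeable condition of the machine-config ClusterOperator. The function name mco_upgradeable is hypothetical.

```shell
# Hypothetical helper: print the status (first line) and the message
# (second line) of the Upgradeable condition of the machine-config operator.
mco_upgradeable() {
  oc get clusteroperator machine-config \
    -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].status}{"\n"}{.status.conditions[?(@.type=="Upgradeable")].message}{"\n"}'
}
```

A first line of False together with a message mentioning cgroup v1 would indicate that the cluster still has to be switched to cgroup v2 before the upgrade.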

OpenShift Container Platform 4.13 to 4.18 versions

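The status of cgroup v2 in 4.13 to 4.18 depends on the version: cgroup v2 is generally available starting with 4.13, and it is the default for new installations of 4.14 and later releases.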

IMPORTANT NOTE: when a cluster is upgraded, the cgroup configuration is not updated automatically.

Changing between cgroup v2 and cgroup v1

To switch to cgroup v2 in a cluster upgraded from RHOCP 4.13 and earlier versions, refer to the documentation for configuring Linux cgroup.

If needed, cgroup can also be changed from v2 back to v1, but only in OpenShift versions older than 4.19.
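As a sketch of what the change looks like from the command line, the cgroupMode field named in the upgrade message can be set with a merge patch on the cluster object of nodes.config.openshift.io. The helper name set_cgroup_mode is hypothetical, and the referenced documentation remains the authoritative procedure.

```shell
# Hypothetical helper: set spec.cgroupMode on the "cluster" object of
# nodes.config.openshift.io. Only "v1" and "v2" are accepted here.
set_cgroup_mode() {
  case "$1" in
    v1|v2) ;;
    *) echo "usage: set_cgroup_mode v1|v2" >&2; return 2 ;;
  esac
  oc patch nodes.config.openshift.io cluster --type merge \
    -p "{\"spec\":{\"cgroupMode\":\"$1\"}}"
}
```

For example, set_cgroup_mode v2 requests cgroup v2. The Machine Config Operator then rolls out new rendered Machine Configs to the nodes, and the progress can be followed with oc get mcp.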

Root Cause

In any new installation of OpenShift 4.14 or newer releases, cgroup v2 is the default configuration. However, when a cluster is upgraded from an older version, the cgroup version is never updated automatically. Therefore, an upgrade to OpenShift 4.19 (which does not support cgroup v1) is blocked while the cluster is still using cgroup v1.

One notable issue can affect the rollout of Machine Config changes (such as a cgroup version change): on an extremely slow node (VM), the MachineConfigDaemon can take far too long and end up in an inconsistent state, leaving the rollout stuck. For more information about this, please refer to this article.

Diagnostic Steps

  • Check if cgroupMode is configured in the nodes.config cluster resource:

    $ oc get nodes.config cluster -o yaml | grep cgroupMode
      cgroupMode: v2
    
  • Check if the Machine Configs in use have the configuration for cgroup v2:

    $ oc get mcp
    NAME     CONFIG                     UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    master   rendered-master-xxxxxxxx   True      False      False      3              3                   3                     0                      3y
    worker   rendered-worker-yyyyyyyy   True      False      False      9              9                   9                     0                      3y
    
    $ oc get mc rendered-master-xxxxxxxx rendered-worker-yyyyyyyy -o yaml | grep -A3 "kernelArguments:"
        kernelArguments:
        - systemd.unified_cgroup_hierarchy=1
        - cgroup_no_v1="all"
        - psi=0
    --
        kernelArguments:
        - systemd.unified_cgroup_hierarchy=1
        - cgroup_no_v1="all"
        - psi=0
    
  • Check the cgroup version in use on the nodes by looking at the filesystem type of /sys/fs/cgroup (cgroup2fs indicates cgroup v2, and tmpfs indicates cgroup v1):

    $ for NODE in $(oc get nodes -o name); do echo "------ ${NODE} ------" ; oc debug ${NODE} -q -- chroot /host bash -c "stat -c %T -f /sys/fs/cgroup" ; done
    ------ node/master-0 ------
    cgroup2fs
    [...]
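
Because all the nodes must run the same cgroup version, it can help to turn the raw filesystem type into a per-node version label. The following is a minimal sketch built on the same stat check. Both function names are hypothetical.

```shell
# Map the filesystem type reported by `stat -c %T -f /sys/fs/cgroup`
# to a cgroup version label.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Hypothetical wrapper: print one "<node> <version>" line per node.
report_cgroup_versions() {
  for node in $(oc get nodes -o name); do
    fstype="$(oc debug "${node}" -q -- chroot /host bash -c "stat -c %T -f /sys/fs/cgroup")"
    echo "${node} $(cgroup_version_from_fstype "${fstype}")"
  done
}
```

If the output of report_cgroup_versions shows more than one distinct version, the cluster is in the unsupported mixed state.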
    
