Cgroup v2 on Red Hat OpenShift Container Platform: general information
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4
- Cgroup v2
Issue
- What are the advantages of cgroup v2 over v1?
- Which cgroup versions are supported in each RHOCP release?
- What are the deprecation and removal dates of cgroup v1?
- If the nodes of a cluster are running cgroup v1 and an upgrade is attempted to an RHOCP version that is only compatible with cgroup v2, what are the consequences? Would the upgrade be blocked?
- Can the cgroup version in an existing cluster be upgraded or downgraded?
- Is it supported to mix nodes with cgroup v1 and v2 in the same cluster?
Resolution
Refer to "cgroup v2 goes GA in OpenShift 4.13" for the improvements in cgroup v2 and additional information about it.
Note: all the nodes in a cluster must run the same cgroup version, meaning hybrid configurations (cgroup v1 nodes mixed with cgroup v2 nodes) are not supported.
Application workloads perspective
Before switching to cgroup v2 (which is required for upgrading to OpenShift 4.19 and later versions), custom application workloads must be adapted to work with cgroup v2.
Red Hat Middleware software and OpenJDK images
Refer to "what Red Hat Middleware software is cgroups v2 compatible" for information about all Red Hat Middleware products. Specifically for EAP 7 containers, refer to "EAP 7 images cgroup version".
The newer OpenJDK images provided by Red Hat include detection of cgroups. Refer to "cgroup v2 in OpenJDK container in RHOCP 4".
OpenShift Container Platform 4.19 and newer versions
Starting with OpenShift 4.19, cgroup v1 is removed and no longer supported, and a cluster cannot be switched back to cgroup v1.
All the nodes in a cluster must run cgroup v2 before being upgraded to OpenShift 4.19 or a later version. Otherwise the upgrade will be blocked with the following message:
clusteroperator/machine-config is not upgradeable because Cluster is using deprecated cgroup v1, which is removed in 4.19. Please update the ‘cgroupMode’ in the ‘cluster’ object of nodes.config.openshift.io resource type to ‘v2’. This can be changed back to ‘v1’ while on 4.18, but must be ‘v2’ before you update to 4.19. Once updated to 4.19, cgroup v1 is no longer an option. Please refer to https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html-single/nodes/index#nodes-clusters-cgroups-2_nodes-cluster-cgroups-2
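Before starting an upgrade, it can be useful to check whether this condition is currently reported. The following is a minimal sketch, assuming a logged-in `oc` session with permission to read cluster operators. The function name `mco_upgradeable` is hypothetical and not part of any Red Hat tooling. It prints the status of the `Upgradeable` condition of the `machine-config` cluster operator on the first line and its message on the second.

```shell
# Sketch only: mco_upgradeable is a hypothetical helper name.
# Prints the status ("True"/"False") and the message of the Upgradeable
# condition reported by the machine-config cluster operator.
mco_upgradeable() {
  oc get clusteroperator machine-config \
    -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].status}{"\n"}{.status.conditions[?(@.type=="Upgradeable")].message}{"\n"}'
}
```

Running `mco_upgradeable` on a cluster that still uses cgroup v1 on 4.18 would be expected to show `False` together with a message like the one above.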
OpenShift Container Platform 4.13 to 4.18 versions
The status of cgroup v2 in 4.13 to 4.18 depends on the version:
- Starting with OpenShift 4.13, Linux Control Group version 2 (cgroup v2) is generally available.
- Starting with OpenShift 4.14, Linux Control Groups version 2 is the default. For additional information, refer to "configuring the Linux cgroup version on your nodes".
- Starting with OpenShift 4.16, Linux Control Groups version 1 is deprecated.
IMPORTANT NOTE: when a cluster is upgraded, the cgroup configuration is not updated automatically.
Changing between cgroup v2 and cgroup v1
To switch to cgroup v2 in a cluster upgraded from RHOCP 4.13 or an earlier version, refer to the documentation for configuring Linux cgroup.
If needed, cgroup can also be changed from version 2 back to version 1, but only in OpenShift versions older than 4.19.
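As an illustration of what the change involves, the sketch below sets `cgroupMode` in the `cluster` object of `nodes.config.openshift.io`, which is the field named in the upgrade message. It assumes a logged-in `oc` session with cluster-admin permissions. The function name `set_cgroup_mode` is hypothetical, and using `oc patch` instead of editing the resource interactively is a choice made here for brevity. The official documentation remains the reference procedure.

```shell
# Sketch only: set_cgroup_mode is a hypothetical helper name.
# Sets spec.cgroupMode in the nodes.config "cluster" object to v1 or v2.
# Changing this value triggers a Machine Config rollout on the nodes.
set_cgroup_mode() {
  mode="$1"
  case "$mode" in
    v1|v2) ;;  # only these two values are accepted by this sketch
    *) echo "usage: set_cgroup_mode v1|v2" >&2; return 2 ;;
  esac
  oc patch nodes.config cluster --type merge \
    -p "{\"spec\":{\"cgroupMode\":\"${mode}\"}}"
}
```

For example, `set_cgroup_mode v2` requests cgroup v2. The rollout can then be followed with `oc get mcp` until all pools report as updated.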
Root Cause
In any new installation of OpenShift 4.14 or a newer release, cgroup v2 is the default configuration. However, when a cluster is upgraded from an older version, the cgroup version is never changed automatically. This is true for all upgrades. Therefore, an upgrade to OpenShift 4.19 (which does not support cgroup v1) is blocked while the cluster is still using cgroup v1.
One notable issue can affect Machine Config rollouts, including a cgroup version change: the rollout can get stuck when an extremely slow node (VM) causes the MachineConfigDaemon to take far too long and end up in an inconsistent state. For more information about this, please refer to this article.
Diagnostic Steps
- Check if cgroupMode is configured in the nodes.config cluster resource:

    $ oc get nodes.config cluster -o yaml | grep cgroupMode
      cgroupMode: v2

- Check if the Machine Configs in use have the configuration for cgroup v2:

    $ oc get mcp
    NAME     CONFIG                     UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    master   rendered-master-xxxxxxxx   True      False      False      3              3                   3                     0                      3y
    worker   rendered-worker-yyyyyyyy   True      False      False      9              9                   9                     0                      3y

    $ oc get mc rendered-master-xxxxxxxx rendered-worker-yyyyyyyy -o yaml | grep -A3 "kernelArguments:"
      kernelArguments:
      - systemd.unified_cgroup_hierarchy=1
      - cgroup_no_v1="all"
      - psi=0
    --
      kernelArguments:
      - systemd.unified_cgroup_hierarchy=1
      - cgroup_no_v1="all"
      - psi=0

- Check the cgroup configured in the nodes, checking for cgroup2fs or tmpfs in /sys/fs/cgroup:

    $ for NODE in $(oc get nodes -o name); do echo "------ ${NODE} ------" ; oc debug ${NODE} -q -- chroot /host bash -c "stat -c %T -f /sys/fs/cgroup" ; done
    ------ node/master-0 ------
    cgroup2fs
    [...]
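The filesystem type reported by the `stat` check maps to the cgroup version: `cgroup2fs` means the node is running cgroup v2, and `tmpfs` means cgroup v1. The small helper below is a sketch of that mapping. The name `cgroup_version_from_fstype` is hypothetical, and any other value is reported as `unknown` rather than guessed.

```shell
# Sketch only: cgroup_version_from_fstype is a hypothetical helper name.
# Translates the output of `stat -c %T -f /sys/fs/cgroup` into a cgroup version.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;       # unified hierarchy: cgroup v2
    tmpfs)     echo "v1" ;;       # legacy hierarchy: cgroup v1
    *)         echo "unknown" ;;  # unexpected value: do not guess
  esac
}
```

Applied to the per-node output of the loop above, it makes a node that still runs cgroup v1 easy to spot. This matters because mixed cgroup v1 and v2 nodes in one cluster are not supported.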