My OpenShift 4.11 workloads fail with "Operation not permitted"
Environment
- Red Hat OpenShift Container Platform 4.11
Issue
- After a pod is created, scheduled to a node, and started, its logs contain the error message "Operation not permitted". This happens when running the workload on OpenShift 4.11; the same workload ran without errors on previous versions.
Resolution
There are two possible options to resolve this issue:
The first option is generic. It confirms that the issue described in this solution is really what is blocking your workload, and it ensures that you grant only the security capabilities the workload needs and no more. In short:
- Inspect the workloads and assess what the required Linux capabilities are for the workload.
- Add these required capabilities to the securityContext of your workload's container definition, or select an SCC for your workload that allows the capabilities you require.
- Coordinate with the cluster admin on having access to an SCC that allows you to set the capabilities you need.
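As a rough illustration of adding a capability to a container's securityContext, the sketch below builds a strategic merge patch for oc patch. The helper name capability_patch, the container name, and the capability are placeholders chosen for this example, not values from this solution. Whether the patched pod is admitted still depends on the SCC available to it.

```shell
# Hypothetical helper: print a strategic merge patch that adds one Linux
# capability to the securityContext of the named container.
# $1 = container name, $2 = capability (for example NET_BIND_SERVICE)
capability_patch() {
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","securityContext":{"capabilities":{"add":["%s"]}}}]}}}}\n' "$1" "$2"
}

# Example usage against a Deployment (all names are placeholders):
#   oc patch deployment <deployment_name> -n <your_namespace> \
#     -p "$(capability_patch <container_name> NET_BIND_SERVICE)"
```

Building the patch in a function keeps the JSON in one place, so the same helper can be reused for each capability the workload turns out to need.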
The second option (below) is the one most customers are likely to take, as it lets you use what OCP provides instead of having to define explicitly what you want.
- Note: This option is functionally equivalent to declaring that the container can run with all capabilities, and it may not meet the security posture your company/cluster wants to provide.
- Follow the steps from KB article 6973044 and assign the ServiceAccount that runs the pods the ability to "use" the "restricted" SCC
- Make sure your workload matches the "restricted" SCC but not "restricted-v2", e.g. by adding allowPrivilegeEscalation to the container securityContext.
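One common way to let a ServiceAccount use an SCC is the oc adm policy add-scc-to-user command. The sketch below wraps that command. It is an assumption about how the grant could be done and is not necessarily the procedure described in KB article 6973044. The helper name and its arguments are placeholders.

```shell
# Hypothetical helper, not taken from KB article 6973044.
# Grants the "restricted" SCC to a ServiceAccount.
# $1 = namespace, $2 = ServiceAccount name
grant_restricted_scc() {
  oc adm policy add-scc-to-user restricted -z "$2" -n "$1"
}

# Example usage (requires sufficient cluster privileges):
#   grant_restricted_scc <your_namespace> <serviceaccount_name>
```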
Root Cause
OpenShift 4.11 introduced the "restricted-v2" SCC in place of the "restricted" SCC known from previous versions. This change was made to adjust security to the current Pod security standards.
The new "restricted-v2" SCC drops "ALL" capabilities from a container (the "restricted" SCC dropped only a subset). As a result, workloads created in OpenShift 4.11 might fail for lack of permission to perform certain operations.
- Note: Clusters upgraded to 4.11 from 4.10 or earlier still have, and may still use, the "restricted" SCC, so they may not be directly affected by this issue. It is therefore important to determine which SCC the pod is using to confirm that this is the issue you're seeing. Customers migrating workloads (either manually or with Red Hat's Migration Toolkit for Containers) from OCP versions prior to 4.11 to 4.11 clusters are likely to experience this, as the "restricted-v2" SCC is the default on newly installed 4.11 clusters.
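To see which SCC admitted a given pod, one option is to read the pod's openshift.io/scc annotation. A minimal sketch, with pod_scc as a hypothetical helper name:

```shell
# Hypothetical helper: print the SCC recorded on a pod at admission time.
# $1 = namespace, $2 = pod name
pod_scc() {
  # The dots inside the annotation key must be escaped in the jsonpath.
  oc get pod "$2" -n "$1" -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
}

# Example usage:
#   pod_scc <your_namespace> <podname>
```

If the value printed is restricted-v2 and the same workload ran under restricted on an older cluster, that points toward the cause described above.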
Diagnostic Steps
- Run your workload (on a 4.11 cluster)
- Wait for a pod to be created and running.
- Test the functionality of the pod/service (to ensure it does all the operations you expect).
- If the pod fails (to perform an operation/capability it offers), check its logs with
oc logs -n <your_namespace> <podname> -c <name_of_a_pod_container>
- If you find logs stating "Operation not permitted" and if your pod was running fine in previous OpenShift versions, there is a good chance you are affected.
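The log check can be scripted. The sketch below counts log lines containing the error string. The helper name count_permission_errors is made up for this example, and a non-zero count is only a hint, not proof, that the SCC change is the cause.

```shell
# Hypothetical helper: count log lines that contain the error string.
# $1 = namespace, $2 = pod name, $3 = container name
count_permission_errors() {
  # -F matches the text literally, -c prints the number of matching lines.
  oc logs -n "$1" "$2" -c "$3" | grep -c -F "Operation not permitted"
}

# Example usage:
#   count_permission_errors <your_namespace> <podname> <name_of_a_pod_container>
```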
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.