OAuth API servers are not ready - PreconditionNotReady
Environment
- Red Hat OpenShift Container Platform
- 4.6 and above
Issue
- The authentication operator is unavailable and/or degraded with the message:
- APIServerDeploymentAvailable: no apiserver.openshift-oauth-apiserver pods available on any node.
Resolution
Fixed versions
The issue is fixed starting with the following versions:
Version | Related BZ/Jira | Related Errata |
---|---|---|
4.8.2 | BZ 1934400 | RHSA-2021:2438 |
4.7.29 | BZ 1967359 | RHSA-2021:3303 |
4.6.46 | BZ 1967361 | RHBA-2021:3643 |
Workaround
For earlier versions, follow this workaround to bring the Authentication Operator back online.
The SCC containing the defaultAllowPrivilegeEscalation: false
key/value pair should already have been identified in the Diagnostic Steps (step 4).
Use oc edit
to remove the key/value pair from the SCC:
$ KUBECONFIG=~/kubeconfig
$ oc --kubeconfig=${KUBECONFIG} edit scc vulnerability-advisor-scc
#Remove the `defaultAllowPrivilegeEscalation: false` line
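As a non-interactive alternative to oc edit, the same change can be sketched with a JSON patch. This is only a sketch under stated assumptions, not part of the original procedure: the helper names backup_scc and remove_default_ape and the backup file name are hypothetical, and KUBECONFIG is assumed to be set as shown above. The remove operation is expected to fail if the field is not present on the SCC.

```shell
# Hypothetical helpers; they assume KUBECONFIG points at the installer kubeconfig.

# Save a copy of the SCC before changing it, so it can be restored later.
backup_scc() {
  scc_name="$1"
  oc --kubeconfig="${KUBECONFIG}" get scc "${scc_name}" -o yaml > "${scc_name}-backup.yaml"
}

# Remove the top-level defaultAllowPrivilegeEscalation field with a JSON patch.
remove_default_ape() {
  scc_name="$1"
  oc --kubeconfig="${KUBECONFIG}" patch scc "${scc_name}" --type=json \
    -p '[{"op": "remove", "path": "/defaultAllowPrivilegeEscalation"}]'
}

# Usage (SCC name taken from the example in this article):
#   backup_scc vulnerability-advisor-scc
#   remove_default_ape vulnerability-advisor-scc
```

Taking the backup first keeps the original SCC definition available in case the field has to be restored after upgrading to a fixed version.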
Once all SCC configurations have been updated, refresh the openshift-oauth-apiserver
replicaset.
$ oc patch replicaset.apps/apiserver-5bdfd49cf8 -n openshift-oauth-apiserver -p '{"spec": {"replicas": 0}}' --kubeconfig=${KUBECONFIG}
This triggers the replicaset to redeploy the pods, because the replica count is automatically restored to 3.
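The replicaset name in the command above (apiserver-5bdfd49cf8) comes from the example cluster and will differ elsewhere. A minimal sketch for looking the name up instead of hard-coding it follows; the helper names are hypothetical, and the loop patches every replicaset in the namespace, which matches the example in this article where only one exists.

```shell
# Hypothetical helpers; they assume KUBECONFIG is set as shown above.

# Print the replicasets in the namespace as "replicaset.apps/<name>".
list_oauth_apiserver_rs() {
  oc --kubeconfig="${KUBECONFIG}" get replicaset -n openshift-oauth-apiserver -o name
}

# Apply the same replicas=0 patch as above to each listed replicaset.
refresh_oauth_apiserver_rs() {
  for rs in $(list_oauth_apiserver_rs); do
    oc --kubeconfig="${KUBECONFIG}" patch "${rs}" -n openshift-oauth-apiserver \
      -p '{"spec": {"replicas": 0}}'
  done
}

# Usage:
#   list_oauth_apiserver_rs      # review the names first
#   refresh_oauth_apiserver_rs
```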
Once the replicaset has started at least one pod, the Authentication
operator should recover and the oc login
command should work.
If the replicaset still fails, check the Diagnostic Steps again for any additional SCCs to edit.
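To check whether the replicaset has started at least one pod, a simple polling loop can be used. This is a sketch under assumptions: the helper name wait_for_oauth_apiserver and its default of 30 polls every 10 seconds are hypothetical, and a pod in the Running phase is treated as "started" even though it may not be fully ready yet.

```shell
# Hypothetical helper; assumes KUBECONFIG is set as shown above.
# Polls until at least one pod in the namespace reports the Running phase.
wait_for_oauth_apiserver() {
  attempts="${1:-30}"   # number of polls
  delay="${2:-10}"      # seconds between polls
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Space-separated list of pod phases, empty when no pod exists.
    phases="$(oc --kubeconfig="${KUBECONFIG}" get pods -n openshift-oauth-apiserver \
      -o jsonpath='{.items[*].status.phase}')"
    case " ${phases} " in
      *" Running "*) echo "recovered"; return 0 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not recovered"
  return 1
}

# Usage:
#   wait_for_oauth_apiserver          # default polling
#   wait_for_oauth_apiserver 6 5      # 6 polls, 5 seconds apart
```

If the function reports "not recovered", the replicaset status from the Diagnostic Steps (step 3) usually shows why pods are still being rejected.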
Root Cause
The issue occurs when a SecurityContextConstraint (SCC) sets defaultAllowPrivilegeEscalation: false
while the openshift-oauth-apiserver
replicaset tries to start its pods with the privileged
option. The two settings conflict, so pod creation is rejected.
A bug fix was submitted when this issue was first reported (current versions at that time: 4.6.19, 4.7.0) and has since been released in the versions listed under Fixed versions.
Diagnostic Steps
Disclaimer: This issue only occurs on RHOCP clusters at version 4.6 and above, as the openshift-oauth-apiserver
namespace was added in RHOCP 4.6.
To access your cluster, you need to use the kubeconfig created during the installation process.
1) The first symptom is visible in the status of the Authentication
operator:
$ KUBECONFIG=~/kubeconfig
$ oc --kubeconfig=${KUBECONFIG} get co authentication -o json | jq -r '.status.conditions[] | select(.type == "Degraded")'
{
"lastTransitionTime": "2021-03-05T00:59:10Z",
"message": "APIServerDeploymentDegraded: 3 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver (no pods found with labels \"apiserver=true,app=openshift-oauth-apiserver,oauth-apiserver-anti-affinity=true,revision=2\")\nOAuthServerDeploymentDegraded: Unable to get \"openshift-browser-client\" bootstrapped OAuth client: the server is currently unable to handle the request (post oauthclients.oauth.openshift.io)",
"reason": "APIServerDeployment_UnavailablePod::OAuthServerDeployment_GetFailed",
"status": "True",
"type": "Degraded"
}
Here the Authentication
operator reports that pods are missing from the openshift-oauth-apiserver
namespace.
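For a quicker overview of every condition of the operator, oc's built-in jsonpath output can be used instead of jq. The following is a minimal sketch; the helper names auth_operator_conditions and auth_operator_is_degraded are hypothetical.

```shell
# Hypothetical helpers; they assume KUBECONFIG is set as shown above.

# Print one "<type>=<status>" line per condition of the authentication operator.
auth_operator_conditions() {
  oc --kubeconfig="${KUBECONFIG}" get clusteroperator authentication \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
}

# Succeed (exit status 0) only when the Degraded condition is True.
auth_operator_is_degraded() {
  auth_operator_conditions | grep -qx 'Degraded=True'
}

# Usage:
#   auth_operator_conditions
#   auth_operator_is_degraded && echo "authentication operator is degraded"
```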
2) Looking at the objects available in the openshift-oauth-apiserver
namespace:
$ oc --kubeconfig=${KUBECONFIG} get all -n openshift-oauth-apiserver
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/api ClusterIP 172.30.201.252 <none> 443/TCP 161m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/apiserver 0/3 0 0 158m
NAME DESIRED CURRENT READY AGE
replicaset.apps/apiserver-5bdfd49cf8 3 0 0 158m
Pods are missing from the replicaset (Desired: 3, Current: 0).
3) Looking deeper into the replicaset shows the actual issue:
$ oc --kubeconfig=${KUBECONFIG} get rs -n openshift-oauth-apiserver -o json | jq -r '.items[].status'
{
"conditions": [
{
"lastTransitionTime": "2021-03-05T00:57:10Z",
"message": "Pod \"apiserver-5bdfd49cf8-h8dtb\" is invalid: [spec.containers[0].securityContext: Invalid value: core.SecurityContext{Capabilities:(*core.Capabilities)(nil), Privileged:(*bool)(0xc02cf74aae), SELinuxOptions:(*core.SELinuxOptions)(nil), WindowsOptions:(*core.WindowsSecurityContextOptions)(nil), RunAsUser:(*int64)(nil), RunAsGroup:(*int64)(nil), RunAsNonRoot:(*bool)(nil), ReadOnlyRootFilesystem:(*bool)(nil), AllowPrivilegeEscalation:(*bool)(0xc02cf7488c), ProcMount:(*core.ProcMountType)(nil), SeccompProfile:(*core.SeccompProfile)(nil)}: cannot set `allowPrivilegeEscalation` to false and `privileged` to true, spec.initContainers[0].securityContext: Invalid value: core.SecurityContext{Capabilities:(*core.Capabilities)(nil), Privileged:(*bool)(0xc02cf74aad), SELinuxOptions:(*core.SELinuxOptions)(nil), WindowsOptions:(*core.WindowsSecurityContextOptions)(nil), RunAsUser:(*int64)(nil), RunAsGroup:(*int64)(nil), RunAsNonRoot:(*bool)(nil), ReadOnlyRootFilesystem:(*bool)(nil), AllowPrivilegeEscalation:(*bool)(0xc02cf7488c), ProcMount:(*core.ProcMountType)(nil), SeccompProfile:(*core.SeccompProfile)(nil)}: cannot set `allowPrivilegeEscalation` to false and `privileged` to true]",
"reason": "FailedCreate",
"status": "True",
"type": "ReplicaFailure"
}
],
"observedGeneration": 11,
"replicas": 0
}
The replicaset reports a conflict: it "cannot set allowPrivilegeEscalation
to false and privileged
to true".
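Because the full status output is long, it can help to print only the ReplicaFailure message for each replicaset and check it for this specific conflict. The following is a sketch using oc's jsonpath output; the helper names replicaset_failures and has_privilege_conflict are hypothetical.

```shell
# Hypothetical helpers; they assume KUBECONFIG is set as shown above.

# Print "<replicaset name>: <ReplicaFailure message>" for each replicaset.
# The message part is empty when a replicaset has no ReplicaFailure condition.
replicaset_failures() {
  oc --kubeconfig="${KUBECONFIG}" get rs -n openshift-oauth-apiserver \
    -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.conditions[?(@.type=="ReplicaFailure")].message}{"\n"}{end}'
}

# Succeed only when a failure message contains the conflict described here.
has_privilege_conflict() {
  replicaset_failures | grep -qF 'cannot set `allowPrivilegeEscalation` to false'
}

# Usage:
#   replicaset_failures
#   has_privilege_conflict && echo "SCC conflict detected, continue with step 4"
```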
4) This is caused by a defaultAllowPrivilegeEscalation: false
key/value pair present in one of the SecurityContextConstraint (SCC) configurations.
Running the following command should identify the SCC with this key/value pair:
$ oc --kubeconfig=${KUBECONFIG} get scc -o json | jq -r '.items[] | select(.defaultAllowPrivilegeEscalation == false) | .metadata.name'
vulnerability-advisor-scc
In this example, the SCC vulnerability-advisor-scc
has the key/value pair defined in its configuration.
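If jq is not available, a similar check can be sketched with oc's custom-columns output and awk. The helper names and the column headers DEFAULT_APE and ALLOW_APE are hypothetical; SCCs that do not define a field typically show a placeholder such as <none> in that column.

```shell
# Hypothetical helpers; they assume KUBECONFIG is set as shown above.

# Print a table of every SCC with its privilege-escalation settings.
list_scc_ape() {
  oc --kubeconfig="${KUBECONFIG}" get scc \
    -o custom-columns='NAME:.metadata.name,DEFAULT_APE:.defaultAllowPrivilegeEscalation,ALLOW_APE:.allowPrivilegeEscalation'
}

# Print only the names of SCCs whose defaultAllowPrivilegeEscalation is false.
# NR > 1 skips the header row of the table.
scc_with_default_ape_false() {
  list_scc_ape | awk 'NR > 1 && $2 == "false" { print $1 }'
}

# Usage:
#   list_scc_ape
#   scc_with_default_ape_false
```

Every name printed by scc_with_default_ape_false is a candidate for the workaround; more than one SCC may need to be edited.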