Recovering from a kube-apiserver pod crashloop caused by bad certificates leading to cluster outage (OCP4)


Environment

  • Red Hat OpenShift Container Platform (OCP) 4.x

Issue

  • During automatic roll-out of certificates, you may find yourself in a situation where you have lost access to the cluster because empty secrets were published and the cluster cannot automatically recover.
  • Applying certificates with null content can lead to the kube-apiserver pods crash-looping on all 3 master nodes, resulting in the loss of oc commands and failure to recover via normal bypass procedures.
  • kube-apiserver pods are crashing with the following error logged when viewed via crictl logs <kube-apiserver-container-id>:
I1130 17:14:52.083740 18 server.go:220] Version: v1.21.11+5cc9227
I1130 17:14:52.084239 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "serving-cert::/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.key"
I1130 17:14:52.084436 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.key"
I1130 17:14:52.084712 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.key"
I1130 17:14:52.084990 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-certs/secrets/external-loadbalancer-serving-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/external-loadbalancer-serving-certkey/tls.key"
I1130 17:14:52.085274 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-certs/secrets/internal-loadbalancer-serving-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/internal-loadbalancer-serving-certkey/tls.key"
I1130 17:14:52.085541 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-resources/secrets/localhost-recovery-serving-certkey/tls.crt::/etc/kubernetes/static-pod-resources/secrets/localhost-recovery-serving-certkey/tls.key"
Error: failed to load SNI cert and key: tls: failed to parse private key
I1130 17:14:52.088051 1 main.go:198] Termination finished with exit code 1
I1130 17:14:52.088067 1 main.go:151] Deleting termination lock file "/var/log/kube-apiserver/.terminating"

Resolution

  • It is advisable to open a support case with Red Hat to guide you through this procedure if you aren't sure about any of the steps below or want to be sure it is being handled correctly. Reference this KCS in your support case submission.

  • First, gain access to a master node via SSH. This node will act as your recovery interface for the duration of this procedure. Do not reboot this host.
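For example (the SSH key path and node address below are placeholders; RHCOS nodes are accessed as the core user):

ssh -i ~/.ssh/<installation-key> core@<master-node-address>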

  • Second, scp a valid KUBECONFIG file to that node (preferably the one generated during installation), as it will grant you access to the cluster without having to go through API cert handling.
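For example, assuming the kubeconfig generated by the installer under <installation-dir>/auth/ and the same placeholder node address:

scp -i ~/.ssh/<installation-key> <installation-dir>/auth/kubeconfig core@<master-node-address>:/home/core/kubeconfig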
  • We will need a valid etcd backup archive, from which we will pull the prior certs and secrets. Alternatively, you may generate a new cert/key combination for your API endpoint as per our documentation.

  • Use crictl ps -a | grep kube-api to list the containers that need to be online:

...output omitted...
27c100bdeb4bf       95c3939b4d8450038300bf19989ea9a89fe6f0bd0fdfc1acebb60c54bbfb36d2                                                2m          exited             kube-apiserver-cert-syncer                    0                   c66f7052984d9
ef71049da05c9       7d69cb835eec5ad586949617c3df4c1e8124fef1f9232f7086656ff2141dda0f                                                3m          exited             kube-apiserver                                0                   c66f7052984d9
  • If we see that these pods are crash-looping, review the logs of the latest crashed container for kube-apiserver:

crictl logs ef71049da05c9

I1130 17:14:52.083740 18 server.go:220] Version: v1.21.11+5cc9227
I1130 17:14:52.084239 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "serving-cert::/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.key"
I1130 17:14:52.084436 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.key"
I1130 17:14:52.084712 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.key"
I1130 17:14:52.084990 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-certs/secrets/external-loadbalancer-serving-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/external-loadbalancer-serving-certkey/tls.key"
I1130 17:14:52.085274 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-certs/secrets/internal-loadbalancer-serving-certkey/tls.crt::/etc/kubernetes/static-pod-certs/secrets/internal-loadbalancer-serving-certkey/tls.key"
I1130 17:14:52.085541 18 dynamic_serving_content.go:111] Loaded a new cert/key pair for "sni-serving-cert::/etc/kubernetes/static-pod-resources/secrets/localhost-recovery-serving-certkey/tls.crt::/etc/kubernetes/static-pod-resources/secrets/localhost-recovery-serving-certkey/tls.key"
Error: failed to load SNI cert and key: tls: failed to parse private key
I1130 17:14:52.088051 1 main.go:198] Termination finished with exit code 1
I1130 17:14:52.088067 1 main.go:151] Deleting termination lock file "/var/log/kube-apiserver/.terminating"
  • It is important to understand that in the above log, the error message is NOT reporting which key failed to parse. The log lines show the cert/key pairs that loaded successfully; the failure occurred on the next key in the list, which is never named. (An RFE is in progress to improve this error message in future builds.)
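One way to narrow down which key is unparseable, directly on the node, is to attempt to parse each key with openssl (a suggested check, not part of the original procedure; the directory is the on-disk location referenced later in this article):

for key in /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/*/tls.key; do
    openssl pkey -in "$key" -noout >/dev/null 2>&1 || echo "FAILED TO PARSE: $key"
done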

  • Looking at the config of a healthy kube-apiserver container, we can see which cert/key combination comes after the last one that loaded successfully, and is therefore the one failing to parse.

  • List the certFile/keyFile combinations defined in the config:

oc exec -n openshift-kube-apiserver -c kube-apiserver $POD -- cat /etc/kubernetes/static-pod-resources/configmaps/config/config.yaml | jq

where $POD is any running kube-apiserver pod in the openshift-kube-apiserver namespace.

"namedCertificates": [
      {
        "certFile": "/etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.crt",
        "keyFile": "/etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.key"
      },
      {
        "certFile": "/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.crt",
        "keyFile": "/etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.key"
      },
      {
        "certFile": "/etc/kubernetes/static-pod-certs/secrets/external-loadbalancer-serving-certkey/tls.crt",
        "keyFile": "/etc/kubernetes/static-pod-certs/secrets/external-loadbalancer-serving-certkey/tls.key"
      },
      {
        "certFile": "/etc/kubernetes/static-pod-certs/secrets/internal-loadbalancer-serving-certkey/tls.crt",
        "keyFile": "/etc/kubernetes/static-pod-certs/secrets/internal-loadbalancer-serving-certkey/tls.key"
      },
      {
        "certFile": "/etc/kubernetes/static-pod-resources/secrets/localhost-recovery-serving-certkey/tls.crt",
        "keyFile": "/etc/kubernetes/static-pod-resources/secrets/localhost-recovery-serving-certkey/tls.key"
      },
      {
        "certFile": "/etc/kubernetes/static-pod-certs/secrets/user-serving-cert-000/tls.crt", ##<----- THIS CERT IS FAILING
        "keyFile": "/etc/kubernetes/static-pod-certs/secrets/user-serving-cert-000/tls.key",  ##<----- THIS KEY IS FAILING 
        "names": [
          "api.mycluster.mydomain.com"
        ]
      }
    ]
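To narrow the output to just the named certificates, the same command can be piped through a jq filter (assuming the namedCertificates list is nested under servingInfo in this config file):

oc exec -n openshift-kube-apiserver -c kube-apiserver $POD -- cat /etc/kubernetes/static-pod-resources/configmaps/config/config.yaml | jq '.servingInfo.namedCertificates'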
  • Therefore, the target folder we're interested in reviewing on the nodes is:
    /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000, which is mounted into the pod at /etc/kubernetes/static-pod-certs/secrets/user-serving-cert-000/
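You can inspect that folder on each master node to check for missing or zero-length files (a quick sanity check, not part of the original procedure):

ls -l /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000/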

  • Open your etcd backup and locate etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000/tls.key (or whichever cert/key file was the last one the pod attempted to load before exiting). Ensure this file exists in the backup, then copy it directly into the target location on the node, with file permissions 0600 and owner:group root:root.
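A minimal sketch of that copy, assuming the backup archive has already been extracted to /tmp/backup-extract (an illustrative path):

cp /tmp/backup-extract/etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000/tls.key /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000/tls.key
chmod 0600 /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000/tls.key
chown root:root /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000/tls.key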

  • If this backup file is unavailable, generate a new key/cert combo as per our documentation. Where you would normally create a secret from the certificate and key file, instead place these files directly into the target folder on the host at /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000/ as tls.key and tls.crt.
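For illustration only (follow the official custom API serving certificate documentation; the CN and file names are placeholders), a new key and signing request could be produced with openssl and signed by your CA:

openssl genrsa -out tls.key 4096
openssl req -new -key tls.key -out api.csr -subj "/CN=api.mycluster.mydomain.com"
# Have api.csr signed by your CA, then place the signed certificate as tls.crt
# alongside tls.key in /etc/kubernetes/static-pod-resources/kube-apiserver-certs/secrets/user-serving-cert-000/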

  • Once this cert/key combo is in place, the kube-apiserver containers should come up, and the API will become available ONLY ON THIS HOST.
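To confirm on this host, check that the containers are running and that the local API endpoint responds (the same curl used in the Diagnostic Steps below):

crictl ps | grep kube-apiserver
curl -kv https://127.0.0.1:6443/version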

  • Once kube-apiserver is up, you can locally export KUBECONFIG=/path/to/kubeconfig, pointing at the kubeconfig you copied over earlier.

  • oc commands should now be available, but you may need to add --insecure-skip-tls-verify to your commands, since your node likely will not have your local CA available:

oc get nodes --insecure-skip-tls-verify

  • Alternatively, you can add this setting to your exported kubeconfig to bypass the flag:

 clusters:
 - cluster:
     insecure-skip-tls-verify: true ##<---- add here
     server: https://api.mycluster.mydomain.com:6443
   name: api.mycluster.mydomain:6443
  • You will need to re-run the same fix outlined above on all 3 master nodes to bring the kube-apiserver containers up on all 3 hosts. After doing so, you will need to re-push a new secret for the apiserver and deploy it as per our docs:
$ oc create secret tls <secret> \
     --cert=</path/to/cert.crt> \
     --key=</path/to/cert.key> \
     -n openshift-config
  • As needed (if the secret name is different from before), run the below patch to ensure that your apiserver is pointing at the new secret name. If the name is unchanged and you've just repaired a failed cert, you can skip this step:
# Update the API server to reference the created secret:

$ oc patch apiserver cluster \
     --type=merge -p \
     '{"spec":{"servingCerts": {"namedCertificates":
     [{"names": ["<FQDN>"], 
     "servingCertificate": {"name": "<secret>"}}]}}}' 
  • Validate that the apiserver cluster object is referencing the secret:
$ oc get apiserver cluster -o yaml
  • Wait for the openshift-kube-apiserver pods to redeploy to the next revision. (This may take some time; be patient.)
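For example, the rollout can be watched with commands such as the following (the jsonpath fields reflect the kubeapiserver operator's nodeStatuses and are an assumption here):

oc get pods -n openshift-kube-apiserver | grep kube-apiserver
oc get kubeapiserver cluster -o jsonpath='{range .status.nodeStatuses[*]}{.nodeName}{" current="}{.currentRevision}{" target="}{.targetRevision}{"\n"}{end}'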

  • Once all 3 pods come up successfully, API access from your bastion should be available once again. It is highly advisable to run a validation check and re-deploy these pods one more time to ensure that they stay up and respect your new certificates:

oc patch kubeapiserver cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge

  • The above patch is a null update that just records the date as a forced change and triggers a new revision without modifying the actual deployment further. This confirms that your pods will survive node reboots and that things are operating as expected after recovery efforts have concluded.

Root Cause

  • Deployment of certificates should be done as a merge/update that replaces existing certificates. If existing certs were deleted and not replaced (for example, deleting the secret without un-patching the apiserver that still requires it), the cluster can end up in a degraded state with no way to recover on its own.
  • Creating new secrets with null or corrupted content can result in similar behavior, as the pods attempt to read cert/key data that is empty or unreadable after the secret is created.
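As a suggested diagnostic (with <secret> as a placeholder), you can verify that a serving-cert secret actually contains a parseable key before the operator rolls it out:

oc get secret <secret> -n openshift-config -o jsonpath='{.data.tls\.key}' | base64 -d | openssl pkey -noout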

Diagnostic Steps

  • Lost access to oc commands
  • SSH to control plane nodes does not allow bypass with exported kubeconfig
  • curl -kv https://127.0.0.1:6443/version is rejected/unavailable when run from master nodes
  • crictl ps -a | grep kube-apiserver displays that the kube-apiserver containers are crash-looping on all master nodes
