How to redeploy/renew an expired default ingress certificate in OCP 4

Solution Verified

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Default Ingress certificate

Issue

  • The OpenShift 4 web console is not accessible because the default ingress certificate has expired.
  • How to renew or regenerate the expired default ingress certificate.
  • The authentication cluster operator is degraded with RouterCerts_InvalidServerCertRouterCerts and a cert expiry message:

    RouterCertsDegraded: secret/v4-0-config-system-router-certs.spec.data[apps.example.com] -n openshift-authentication: certificate could not validate route hostname oauth-openshift.apps.example.com: x509: certificate has expired or is not yet valid: current time YYYY-MM-DDTHH:MM:SSZ is after YYYY-MM-DDTHH:MM:SSZ
    
  • Unable to log in using the oc CLI because the certificate has expired:

    $ oc login -u kubeadmin https://api.cluster.example.com:6443
    error: x509: certificate has expired or is not yet valid: current time 2021-09-23T09:46:28+01:00 is after 2021-08-20T20:16:38Z
    

Resolution

IMPORTANT: As per the Ingress certificates documentation:

The Ingress Operator generates a default certificate for an Ingress Controller to serve as a placeholder until you configure a custom default certificate. Do not use Operator-generated default certificates in production clusters.

The supported configuration is to replace the default ingress certificate with a custom certificate.

Workaround

Both the ingress Certificate Authority (CA) and the default ingress certificate signed by this CA are valid for 2 years. If the default ingress certificate has expired, the ingress CA has also passed its validity period, so the ingress CA must be renewed first, followed by the ingress wildcard certificate, to avoid a CA mismatch.

Before replacing the CA and the certificate, back up the existing secrets. Then delete the secret containing the ingress CA from the openshift-ingress-operator namespace and the secret containing the default ingress certificate from the openshift-ingress namespace. The pods in each namespace must be restarted afterwards, as shown in the steps below.

Note: This workaround applies only to the operator-generated default ingress certificate. If it has been replaced with a custom certificate, follow the procedure outlined in the documentation instead: Replacing the default ingress certificate.

The following steps must be executed in the order shown: first renew the ingress CA, then renew the wildcard certificate using the new ingress CA:

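  1. To renew the ingress CA:

    $ oc project openshift-ingress-operator
    $ oc get secret router-ca -o yaml > router-ca.yaml    # back up the existing CA secret
    $ oc delete secret router-ca
    $ oc delete pod --all                                 # restart the ingress operator pod
    $ oc get secret router-ca                             # confirm the secret has been re-created
    $ oc get po                                           # confirm the pod is running again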
    
  2. To re-create the wildcard ingress certificate using the new ingress CA:

    $ oc project openshift-ingress
    $ oc get secret router-certs-default -o yaml > router-certs-default.yaml
    $ oc delete secret router-certs-default
    $ oc delete pod --all
    $ oc get secret router-certs-default
    $ oc get po
    

The ingress certificate and the CA are also referenced in other namespaces, such as openshift-authentication. These steps also update the ingress certificate referenced in openshift-authentication and the other namespaces. See the workflow section of the Ingress certificates documentation.
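
The secrets are re-created by the operators after the pods restart, so the oc get secret checks in the steps above may need to be repeated until the secret appears. The following is a minimal sketch of a polling helper for that check. The function name wait_for_secret, the 120-second timeout and the 5-second interval are illustrative assumptions and not part of the documented procedure.

```shell
# wait_for_secret NAMESPACE NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Hypothetical helper: polls until the named secret exists or the timeout elapses.
wait_for_secret() {
  local ns="$1" name="$2" timeout="${3:-120}" interval="${4:-5}" waited=0
  until oc -n "$ns" get secret "$name" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "secret $ns/$name not found after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "secret $ns/$name exists"
}

# Example usage, in the same order as the steps above:
#   wait_for_secret openshift-ingress-operator router-ca
#   wait_for_secret openshift-ingress router-certs-default
```

Waiting for router-ca before deleting router-certs-default keeps the ordering requirement intact: the wildcard certificate should only be regenerated once the new CA exists.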

  1. After the default ingress certificate has been renewed, users may see an "x509: certificate signed by unknown authority" error when running oc login from the bastion host. In that case, extract the router-ca certificate and add it to the bastion host's trust store:

    $ oc -n openshift-ingress-operator get secret router-ca -o jsonpath="{ .data.tls\.crt }" | base64 -d -i > ingress-ca.crt
    $ cp ingress-ca.crt /etc/pki/ca-trust/source/anchors/
    $ update-ca-trust

If you are working through lb-int.kubeconfig, oc rsh may not work. In that case, extract the router-ca secret and copy the X.509 certificate content of tls.crt (generated from the secret) into a file on your bastion host, for example ingress-ca.crt. Then place this file in the /etc/pki/ca-trust/source/anchors/ directory and update the trust store as explained above.

Root Cause

As per the Ingress certificates documentation, the Ingress Operator does not rotate its own signing certificate or the default certificates that it generates. Operator-generated default certificates are not intended for production environments; they are placeholders for the custom default certificates that a user configures.

Diagnostic Steps

Check that no custom certificate is configured in the default IngressController:

$ oc get ingresscontroller.operator default -n openshift-ingress-operator -o yaml | grep defaultCertificate
### no output
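
The same check can be scripted by reading the spec.defaultCertificate.name field directly instead of grepping the YAML. The following is a sketch only; the helper names custom_default_cert_name and which_procedure are hypothetical.

```shell
# Print the name of the custom default certificate secret, or nothing if none is set.
custom_default_cert_name() {
  oc -n openshift-ingress-operator get ingresscontroller.operator default \
    -o jsonpath='{.spec.defaultCertificate.name}'
}

# Report which procedure applies to this cluster (illustrative helper).
which_procedure() {
  local name
  name="$(custom_default_cert_name)" || return 1
  if [ -n "$name" ]; then
    echo "custom certificate in use (secret: $name): follow the documented replacement procedure"
  else
    echo "operator-generated default certificate in use: this workaround applies"
  fi
}

# Example usage:
#   which_procedure
```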

Check the validity of the ingress certificate:

$ oc project openshift-ingress
$ oc get secret router-certs-default -o yaml | grep crt | awk '{print $2}' | base64 -d | openssl x509 -noout -dates -issuer -subject

Check the validity of the ingress CA:

$ oc project openshift-ingress-operator
$ oc get secret router-ca -oyaml |  grep crt | awk '{print $2}' | base64 -d | openssl x509 -noout -dates -issuer -subject
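
To turn the notAfter date into a number of days, the two checks above can be combined with some date arithmetic. The following is a minimal sketch that assumes GNU date (for the -d option). The helper names days_between and secret_not_after are hypothetical.

```shell
# days_between START END
# Whole days from START to END, truncated toward zero (negative if END is earlier).
days_between() {
  local start end
  start="$(date -u -d "$1" +%s)" || return 1
  end="$(date -u -d "$2" +%s)" || return 1
  echo $(( (end - start) / 86400 ))
}

# secret_not_after NAMESPACE SECRET
# Print the notAfter date of the first certificate in the secret's tls.crt.
secret_not_after() {
  oc -n "$1" get secret "$2" -o jsonpath='{.data.tls\.crt}' \
    | base64 -d \
    | openssl x509 -noout -enddate \
    | cut -d= -f2
}

# Example usage against a live cluster (result is negative once the certificate has expired):
#   not_after="$(secret_not_after openshift-ingress router-certs-default)"
#   days_between "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$not_after"
#   not_after="$(secret_not_after openshift-ingress-operator router-ca)"
#   days_between "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$not_after"
```

With the timestamps from the oc login error shown in the Issue section, days_between "2021-08-20T20:16:38Z" "2021-09-23T09:46:28+01:00" gives 33, meaning the certificate had expired 33 days earlier.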

The describe output of the authentication cluster operator shows the certificate expiry:

$ oc describe co/authentication

[...]
Status:
  Conditions:
    Message:               RouterCertsDegraded: secret/v4-0-config-system-router-certs.spec.data[apps.example.com] -n openshift-authentication: certificate could not validate route hostname oauth-openshift.apps.example.com: x509: certificate has expired or is not yet valid: current time YYYY-MM-DDTHH:MM:SSZ is after YYYY-MM-DDTHH:MM:SSZ
    Reason:                RouterCerts_InvalidServerCertRouterCerts
    Status:                True
    Type:                  Degraded
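
The Degraded condition can also be read with a jsonpath filter rather than scanning the describe output, which is convenient for re-checking the operator after the renewal. This is a sketch only; the helper names co_condition_field and auth_degraded_check are hypothetical.

```shell
# co_condition_field OPERATOR TYPE FIELD
# Print one field (status, reason or message) of a cluster operator condition.
co_condition_field() {
  oc get clusteroperator "$1" \
    -o jsonpath="{.status.conditions[?(@.type==\"$2\")].$3}"
}

# Return 0 if the authentication operator is not Degraded, 1 if it is,
# and 2 if the condition could not be read.
auth_degraded_check() {
  local status reason
  status="$(co_condition_field authentication Degraded status)" || return 2
  reason="$(co_condition_field authentication Degraded reason)" || return 2
  if [ "$status" = "True" ]; then
    echo "authentication is Degraded: $reason" >&2
    return 1
  fi
  echo "authentication is not Degraded"
}

# Example usage:
#   auth_degraded_check
#   co_condition_field authentication Degraded message
```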


4 Comments

Although router pods restart quite fast, wouldn't it make more sense to delete the router pods one by one to prevent a total service outage?

Yes, that makes sense if the certificate has not yet expired, or to keep connections to insecure and passthrough routes working. But the specific case described here concerns router pods that are no longer able to work properly.

Can we extend the ingress certificate validity beyond 2 years?

No, the validity of the default ingress CA-signed certificate is 2 years and it cannot be modified. Using a certificate signed by a public or custom CA is the only option for a longer validity, but that depends on the certificate issuer, as the maximum validity of publicly trusted certificates was recently reduced to 398 days.