How to redeploy/renew an expired default ingress certificate in RHOCP4

Solution Verified - Updated -

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Default Ingress certificate

Issue

  • There is no alert when the default ingress certificate is about to expire.
  • How to renew or regenerate the expired default ingress certificate.
  • The OpenShift 4 web console is not accessible due to the default ingress certificate being expired and shows the error NET::ERR_CERT_DATE_INVALID.
  • The authentication cluster operator is not available and degraded with the following reasons and messages:

    OAuthServerRouteEndpointAccessibleController_EndpointUnavailable
    
    OAuthServerRouteEndpointAccessibleController_SyncError::RouterCertsDomainValidationController_SyncError::RouterCerts_InvalidServerCertRouterCerts
    
    OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.example.com/healthz": x509: certificate has expired or is not yet valid: current time YYYY-MM-DDTHH:MM:SSZ is after YYYY-MM-DDTHH:MM:SSZ
    
    RouterCertsDegraded: secret/v4-0-config-system-router-certs.spec.data[apps.example.com] -n openshift-authentication: certificate could not validate route hostname oauth-openshift.apps.example.com: x509: certificate has expired or is not yet valid: current time YYYY-MM-DDTHH:MM:SSZ is after YYYY-MM-DDTHH:MM:SSZ
    
  • Unable to log in using the oc CLI because the certificate has expired:

    $ oc login -u kubeadmin https://api.cluster.example.com:6443
    error: x509: certificate has expired or is not yet valid: current time YYYY-MM-DDTHH:MM:SSZ is after YYYY-MM-DDTHH:MM:SSZ
    

Resolution

To replace the operator-generated certificate, or a custom certificate that is about to expire, refer to replace the default ingress certificate. If the custom ingress certificate has already expired, refer to recovering from expired Openshift-ingress certificates.

IMPORTANT: As per the Ingress certificates documentation:

The Ingress Operator generates a default certificate for an Ingress Controller to serve as a placeholder until you configure a custom default certificate. Do not use Operator-generated default certificates in production clusters.

Alert when the ingress certificate is about to expire

There is currently no alert when the default ingress certificate is close to expiring. An RFE was opened for it: RFE-4269.

Red Hat Advanced Cluster Management for Kubernetes can be used to get information about expiring certificates. For additional information, refer to How to use the Certificate Policy Controller to Identify Risks in Red Hat Advanced Cluster Management for Kubernetes and to Certificate policy controller.
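
Until such an alert exists, the expiry date can also be checked on a schedule from a host with oc access. The following is a minimal sketch and not part of the documented procedure: `cert_expires_within` is a hypothetical helper name and the 30-day threshold in the usage comment is an arbitrary assumption. It relies on `openssl x509 -checkend`, which exits non-zero when the certificate expires within the given number of seconds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the PEM certificate read from
# stdin expires within the given number of days, or has already expired.
# Unreadable input is also reported as "expiring", because openssl fails on it.
cert_expires_within() {
    local days="$1"
    # -checkend N exits 0 if the certificate is still valid N seconds from now,
    # so the result is inverted to report "expiring" as success.
    ! openssl x509 -noout -checkend "$(( days * 86400 ))" >/dev/null 2>&1
}

# Example usage (assumes a logged-in oc session; 30 days is an assumption):
#   oc -n openshift-ingress get secret router-certs-default \
#       -o jsonpath='{.data.tls\.crt}' | base64 -d | cert_expires_within 30 \
#       && echo "default ingress certificate expires within 30 days"
```

The same check can be pointed at the router-ca secret in the openshift-ingress-operator namespace to watch the ingress CA.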

Workaround for the default ingress certificate

Both the ingress Certificate Authority (CA) and the default ingress certificate signed by that CA are valid for 2 years. If the default ingress certificate has expired, the ingress CA has also passed its validity period, so the ingress CA has to be renewed first, followed by the ingress wildcard certificate, to avoid a CA mismatch.

Before replacing the CA and the certificate, back up the existing secrets. Then delete the secret containing the ingress CA from the openshift-ingress-operator namespace and the secret containing the default ingress certificate from the openshift-ingress namespace. The pods in each namespace have to be restarted afterwards, as mentioned below.

Note: This workaround applies only to the operator-generated default ingress certificate. If it was replaced with a custom certificate, follow the procedure outlined in the documentation instead: Replacing the default ingress certificate.

The following steps have to be executed in the order shown: first renew the ingress CA, then renew the wildcard certificate using the new ingress CA:

  1. To renew the ingress CA:

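    $ oc project openshift-ingress-operator
    # Back up the current CA secret before deleting it
    $ oc get secret router-ca -o yaml > router-ca.yaml
    $ oc delete secret router-ca
    # Restart the pods in this namespace (the ingress operator)
    $ oc delete pod --all
    # Confirm that the secret was re-created and the pod is running again
    $ oc get secret router-ca
    $ oc get po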
    
  2. To re-create the wildcard ingress certificate using the new ingress CA:

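    $ oc project openshift-ingress
    # Back up the current default certificate secret before deleting it
    $ oc get secret router-certs-default -o yaml > router-certs-default.yaml
    $ oc delete secret router-certs-default
    # Restart the router pods
    $ oc delete pod --all
    # Confirm that the secret was re-created and the pods are running again
    $ oc get secret router-certs-default
    $ oc get po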
    

    Note: If the certificate has not expired yet, run the following command instead of oc delete pod --all in step 2 to avoid a service outage:

    $ oc rollout restart deployment/router-default
    

    The ingress certificate and the CA are also referenced in other namespaces, such as openshift-authentication. These steps also update the ingress certificate referenced in openshift-authentication and the other namespaces. See the workflow section of the ingress certificate documentation.

  3. After the default ingress certificate is renewed, oc login from the bastion host may fail with the error x509: Certificate signed by unknown authority. In that case, copy the router-ca certificate and add it to the bastion host's trust store:

    $ oc -n openshift-ingress-operator get secret router-ca -o jsonpath="{ .data.tls\.crt }" | base64 -d -i > ingress-ca.crt
    $ cp ingress-ca.crt /etc/pki/ca-trust/source/anchors/
    $ update-ca-trust

If you are working through lb-int.kubeconfig, oc rsh may not work. In that case, extract the router-ca secret, copy the x509 certificate content of tls.crt (generated from the secret) into a file on your bastion host, for example ingress-ca.crt, then move this file to the /etc/pki/ca-trust/source/anchors/ directory as explained above and update the trust store.
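
The verification commands in steps 1 and 2 (oc get secret ...) only succeed once the secret has been re-created, which may take a moment after the pods restart. The following is a minimal sketch of a retry helper for that wait. `wait_for` is a hypothetical name, and the 5-second interval and the 120-second timeout in the usage comments are assumptions, not values from the documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command every 5 seconds until it succeeds or
# the timeout (in seconds) elapses. Returns 0 on success, 1 on timeout.
wait_for() {
    local timeout="$1"; shift
    local deadline=$(( SECONDS + timeout ))
    until "$@" >/dev/null 2>&1; do
        if (( SECONDS >= deadline )); then
            return 1
        fi
        sleep 5
    done
}

# Example usage after the delete commands of each step
# (assumes a logged-in oc session; 120 seconds is an assumption):
#   wait_for 120 oc -n openshift-ingress-operator get secret router-ca
#   wait_for 120 oc -n openshift-ingress get secret router-certs-default
```

Running the first wait to completion before starting step 2 keeps the required order: the new wildcard certificate should only be generated once the new CA exists.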

Root Cause

As per the Ingress certificates - Renewal documentation, the Ingress Operator does not rotate its own signing certificate or the default certificates that it generates. These operator-generated default certificates are not intended for production environments; they are placeholders for the custom default certificates that a user configures.

Diagnostic Steps

Check that there is no custom certificate configured in the default ingresscontroller resource:

$ oc get ingresscontroller.operator default -n openshift-ingress-operator -o yaml | grep defaultCertificate
### no output

Check the validity of the ingress certificate:

$ oc project openshift-ingress
$ oc get secret router-certs-default -o yaml | grep crt | awk '{print $2}' | base64 -d | openssl x509 -noout -dates -issuer -subject

Check the validity of the ingress CA:

$ oc project openshift-ingress-operator
$ oc get secret router-ca -oyaml |  grep crt | awk '{print $2}' | base64 -d | openssl x509 -noout -dates -issuer -subject
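
After a renewal, it can also be useful to confirm that the new default certificate was actually signed by the new ingress CA, which is the CA mismatch the workaround is meant to avoid. The following is a minimal sketch using `openssl verify`. `signed_by` is a hypothetical helper name and the local file names in the usage comments are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the first certificate in cert_file
# verifies against the CA in ca_file. openssl verify also fails for an
# expired certificate, so this check is meant for use after the renewal.
signed_by() {
    local ca_file="$1" cert_file="$2"
    openssl verify -CAfile "$ca_file" "$cert_file" >/dev/null 2>&1
}

# Example usage (assumes a logged-in oc session; file names are arbitrary):
#   oc -n openshift-ingress-operator get secret router-ca \
#       -o jsonpath='{.data.tls\.crt}' | base64 -d > router-ca.crt
#   oc -n openshift-ingress get secret router-certs-default \
#       -o jsonpath='{.data.tls\.crt}' | base64 -d > router-certs-default.crt
#   signed_by router-ca.crt router-certs-default.crt \
#       && echo "default certificate is signed by the current ingress CA"
```

A failure here after both steps have completed suggests that the wildcard certificate was generated before the new CA was in place, in which case step 2 can be repeated.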

The describe output of the authentication cluster operator shows the certificate expiry:

$ oc describe co/authentication

[...]
Status:
  Conditions:
    Message:               RouterCertsDegraded: secret/v4-0-config-system-router-certs.spec.data[apps.example.com] -n openshift-authentication: certificate could not validate route hostname oauth-openshift.apps.example.com: x509: certificate has expired or is not yet valid: current time YYYY-MM-DDTHH:MM:SSZ is after YYYY-MM-DDTHH:MM:SSZ 

    Reason:                RouterCerts_InvalidServerCertRouterCerts
    Status:                True
    Type:                  Degraded

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
