Checking Certificate Expiration for OpenShift Container Platform

Environment

  • Red Hat OpenShift Enterprise (OSE) 3.0, 3.1, 3.2
  • Red Hat OpenShift Container Platform (OCP) 3.3 or later

Issue

  • How to replace CA and regenerate other cert files in OpenShift Enterprise 3?
  • When are my OpenShift Cluster's certificates going to expire?
  • Are my certificates expired/expiring?
  • Is there a way to check on the health of my OpenShift certificates?
  • It looks like our OpenShift etcd peer certificates are expired.
  • OpenShift cluster is down due to expired etcd certificates. We tried to renew the certs by running both etcd CA certs and etcd certs. Both tasks seem to have updated the certs but etcd restart is failing with bad certificates.

Resolution

Starting with OpenShift 3.4 (RHSA-2017:0448 Security Update), openshift-ansible ships an Ansible role and playbook that check the status of the X.509 certificates used for internal communication within the OpenShift cluster.

Note: Backports to openshift-ansible-3.2.53-1 and openshift-ansible-3.3.67 were also shipped with the above errata.

Checking the health / validity of your certificates on a regular basis (or having a planned point in time for when to update them) is highly recommended!

The information contained in this document aims to provide tools / information that can help you spot certificates that are about to expire, to avoid cluster-wide outages.

Note: This will not check application certificates or certificates provided for applications, such as the router default certificate, or certificates provided to routes.

  1. Run the playbook, replacing <inventory_file> with the path to your Ansible inventory file:
    For OCP < 3.9:
    $ ansible-playbook -v -i <inventory_file> /usr/share/ansible/openshift-ansible/playbooks/certificate_expiry/easy-mode.yaml
    For OCP >= 3.9:
    $ ansible-playbook -v -i <inventory_file> /usr/share/ansible/openshift-ansible/playbooks/openshift-checks/certificate_expiry/easy-mode.yaml

    The playbook accepts seven variables. If you do not set a variable, its default value is used:

    Variable                                           Default Value                 Description
    openshift_certificate_expiry_config_base           /etc/origin                   Base directory to check for certificates
    openshift_certificate_expiry_warning_days          30                            Flag certificates that expire within this many days from now
    openshift_certificate_expiry_show_all              no                            Include healthy (non-expired and non-warning) certificates in results
    openshift_certificate_expiry_generate_html_report  no                            Generate an HTML report of the expiry check results
    openshift_certificate_expiry_html_report_path      /tmp/cert-expiry-report.html  The full path to save the HTML report as
    openshift_certificate_expiry_save_json_results     no                            Generate a JSON report of the expiry check results
    openshift_certificate_expiry_json_results_path     /tmp/cert-expiry-report.json  The full path to save the JSON report as
  2. There are two ways to receive output from this role:

    • Add the -v option when running the ansible-playbook command:
      For OCP < 3.9:
      $ ansible-playbook -v -i <inventory_file> /usr/share/ansible/openshift-ansible/playbooks/certificate_expiry/easy-mode.yaml
      For OCP >= 3.9:
      $ ansible-playbook -v -i <inventory_file> /usr/share/ansible/openshift-ansible/playbooks/openshift-checks/certificate_expiry/easy-mode.yaml

    • Specify an output using the openshift_certificate_expiry_generate_html_report and/or openshift_certificate_expiry_save_json_results options.
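
As a rough illustration of the second option, the sketch below prints an ansible-playbook command that overrides the report variables from the table with -e (Ansible's extra-vars flag). The inventory path /etc/ansible/hosts (Ansible's default location), the 60-day warning window, and the helper name expiry_check_cmd are assumptions made for illustration; the playbook path is the OCP >= 3.9 one from step 1.

```shell
#!/bin/sh
# Assumption: the inventory lives at Ansible's default path; override with INVENTORY=...
INVENTORY="${INVENTORY:-/etc/ansible/hosts}"
# Playbook path for OCP >= 3.9, as given in step 1.
PLAYBOOK=/usr/share/ansible/openshift-ansible/playbooks/openshift-checks/certificate_expiry/easy-mode.yaml

# Hypothetical helper: print the full command so it can be reviewed before running.
expiry_check_cmd() {
  printf 'ansible-playbook -v -i %s %s' "$INVENTORY" "$PLAYBOOK"
  # printf reuses the format for each remaining argument, giving one -e per variable.
  printf ' -e %s' \
    openshift_certificate_expiry_warning_days=60 \
    openshift_certificate_expiry_show_all=yes \
    openshift_certificate_expiry_generate_html_report=yes \
    openshift_certificate_expiry_save_json_results=yes
  printf '\n'
}

expiry_check_cmd
```

After reviewing the printed command, it can be run with, for example, sh -c "$(expiry_check_cmd)". Because the report path variables are left at their defaults, the reports would be written to /tmp/cert-expiry-report.html and /tmp/cert-expiry-report.json.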


Updating Expired or Expiring Certificates

Should you find certificates that are about to expire or are expired, the certificates will need to be updated. The process for this is described in the official OpenShift documentation.

Depending on the version of OpenShift, the process may be different:

  • For OpenShift 3.0 and 3.1, please open a case with Red Hat Support for assistance with the manual procedure.
  • For OpenShift 3.10, 3.11, consult Knowledge Solution 3782361 for the manual procedure.

Root Cause

The role goes through each system in the Ansible hosts file and checks the validity of every certificate it finds there. A more manual method is to find all of the certificates on each system and check the validity of each one by hand.

Diagnostic Steps

Note: If using a version of OpenShift before 3.2, use the manual steps provided below.

Here is an example of commands to manually check these certificates:

# Print the validity period of every .crt file referenced by a YAML config under /etc/origin
for cert in $(for config in $(find /etc/origin/ -name "*yaml"); do
    file=$(basename "$config")
    awk '/\.crt/ { print FILENAME $2 }' "$config" | sed "s/$file//"
  done); do
  echo "$cert"
  openssl x509 -in "$cert" -text -noout | grep Validity -A2
done

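# Print the validity period of every certificate embedded in a kubeconfig under /etc/origin
for config in $(find /etc/origin/ -name "*kubeconfig"); do
  echo "Config: $config"
  echo "  File: $(basename "$config")"
  # Decode each embedded certificate separately so that every one is checked
  awk '/certificate.*-data:/ { print $2 }' "$config" | while read -r data; do
    echo "$data" | base64 -d | openssl x509 -text -noout | grep Validity -A2
  done
done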

This example locates all of the certificates on a single system (much like using the openshift_certificate_expiry_show_all option) and then prints the name or location of each certificate and its validity range. It provides effectively the same output as the role would if the role were run against a single node.
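
To flag certificates that are close to expiring rather than only printing dates, the manual check can be extended along the following lines. This is a minimal sketch, not part of the shipped role: the function name check_cert_expiry and the WARNING_DAYS variable are hypothetical, and the 30-day default simply mirrors openshift_certificate_expiry_warning_days. It relies on the -enddate and -checkend options of openssl x509.

```shell
#!/bin/sh
# Hypothetical warning window in days; mirrors the role's default of 30.
WARNING_DAYS="${WARNING_DAYS:-30}"

# Print "expired", "warning", or "ok" for one PEM certificate file.
check_cert_expiry() {
  cert=$1
  days=${2:-$WARNING_DAYS}
  end=$(openssl x509 -in "$cert" -noout -enddate 2>/dev/null | cut -d= -f2)
  if [ -z "$end" ]; then
    echo "unreadable $cert"
    return 1
  fi
  # -checkend N exits 0 only if the certificate is still valid N seconds from now.
  if ! openssl x509 -in "$cert" -noout -checkend 0 >/dev/null; then
    echo "expired $cert (notAfter: $end)"
  elif ! openssl x509 -in "$cert" -noout -checkend $((days * 86400)) >/dev/null; then
    echo "warning $cert (notAfter: $end)"
  else
    echo "ok $cert (notAfter: $end)"
  fi
}

# Check every .crt file under /etc/origin, if that directory exists on this host.
if [ -d /etc/origin ]; then
  find /etc/origin -name '*.crt' | while read -r crt; do
    check_cert_expiry "$crt"
  done
fi
```

Like the commands above, this covers only the system it runs on, and it looks only at .crt files, not at certificates embedded in kubeconfig files.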
