Manually create the master and API certificates when API is down and redeploy-certificates playbook fails
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 3.11
Issue
- The redeploy-certificates playbook fails because the API certificates have already expired.
- Master node certificates are expired.
Resolution
Info
The playbooks/redeploy-certificates.yml and playbooks/openshift-master/redeploy-certificates.yml playbooks fail because they first check whether the API is up and accessible. If the API is down on all masters, the playbooks fail, because they check the API through the load balancer API URL instead of the API running on each master.
Almost all of the certificates present inside /etc/origin/master/ can be regenerated manually with openssl to bring the API up.
NOTE: The oc adm command to generate the certificate needs to be run on the first master node where the /etc/origin/master/ca.serial.txt file is present.
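Before running any of the signing commands, it can help to confirm that the current master actually holds the CA material this note refers to. The following is a minimal sketch, assuming the default /etc/origin/master directory; the function name check_ca_files is hypothetical and not part of any OpenShift tooling.

```shell
# Hypothetical helper: verify that the CA files needed for signing exist
# and are non-empty in the given directory (default: /etc/origin/master).
check_ca_files() {
    ca_dir="${1:-/etc/origin/master}"
    ca_missing=0
    for ca_file in ca.crt ca.key ca.serial.txt; do
        if [ ! -s "${ca_dir}/${ca_file}" ]; then
            echo "missing or empty: ${ca_dir}/${ca_file}" >&2
            ca_missing=1
        fi
    done
    # Returns 0 only when all three files are present.
    return "${ca_missing}"
}

# Example (run as root on the first master):
#   check_ca_files && echo "CA files present, safe to continue"
```

If the function reports a missing file, the node is probably not the first master and the signing commands should be run elsewhere.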
- The /etc/origin/master/master.server.crt file is the API server certificate, which is required to keep the API running. The common name (CN) and Subject Alternative Name need to be obtained from the expired certificate, because the hostnames and IP addresses are required when the new certificate is created.
# openssl x509 -in /etc/origin/master/master.server.crt -text -noout
Certificate:
    Data:
        .....
        .....
        .....
        Subject: CN=10.74.249.116
        .....
        .....
            X509v3 Subject Alternative Name:
                DNS:external.example.com, DNS:internal.example.com, DNS:master-1.example.com, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:openshift, DNS:10.74.249.116, DNS:172.30.0.1, IP Address:10.74.249.116, IP Address:172.30.0.1
- Generate the new API server certificate with the oc adm command, specifying the hostnames and IP addresses retrieved in the previous step. This step and the previous step need to be performed for each master node individually.
# oc adm ca create-server-cert --signer-cert=/etc/origin/master/ca.crt --signer-key=/etc/origin/master/ca.key --signer-serial=/etc/origin/master/ca.serial.txt --hostnames='external.example.com,internal.example.com,master-1.example.com,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster.local,openshift,10.74.249.116,172.30.0.1' --cert=/etc/origin/master/master.server.crt --key=/etc/origin/master/master.server.key
- The /etc/origin/master/openshift-master.crt certificate has to be generated for each master node individually. After the certificate and key are generated, both need to be added to /etc/origin/master/openshift-master.kubeconfig in base64-encoded format on the respective master node.
# openssl genrsa -out /etc/origin/master/openshift-master.key 2048
# cat extension.ext
keyUsage = critical,digitalSignature,keyEncipherment
extendedKeyUsage = clientAuth
basicConstraints = critical,CA:false
# openssl req -new -key /etc/origin/master/openshift-master.key -subj "/O=system:masters/O=system:openshift-master/CN=system:openshift-master" -out /etc/origin/master/openshift-master.csr
# openssl x509 -req -in /etc/origin/master/openshift-master.csr -CA /etc/origin/master/ca.crt -CAkey /etc/origin/master/ca.key -CAcreateserial -out /etc/origin/master/openshift-master.crt -days 730 -sha256 -extfile extension.ext
- Encode the /etc/origin/master/openshift-master.crt and /etc/origin/master/openshift-master.key data in base64 format and replace the existing data inside /etc/origin/master/openshift-master.kubeconfig with the new values.
# cat /etc/origin/master/openshift-master.crt | base64 -w 0
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JURKakNDQWc2Z0F3SUJBZ0lKQU5aTXFTOFl4RnNHTUEwR0NTcUdTSWIzRFFFQkN3VUFNQkV4RHpBTkJnTlYKQkFNVEJuSnZiM1JEUVRBZUZ3MHlNREExTXpFd09EQTNNVGRhRncweU1qQTFNekV3T0RBM01UZGFNRjB4RnpBVgpCZ05WQkFvTURuTjVjM1JsYlRwdFlYTjBaWEp6TVNBd0hnWURWUVFLREJkemVYTjBaVzA2YjNCbGJuTm9hV1owCkxXMWhjM1JsY2pFZ01CNEdBMVVFQXd3WGMzbHpkR1Z0T205d1pXNXphR2xtZEMxdFl.....
# cat /etc/origin/master/openshift-master.key | base64 -w 0
LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb3mR0MnN1TU5EZ0hmY3hxOGpZcUh3cmk5SXNIeEtDNnBVCmJXTjRxR25iZkNRRVZUeHNMRFp2RFdoeE5zZjFXN29nTlRpM20xb2VXQmpPQklQVE9RTTZyczJGQWtKSFNPdGQKRlQyK21YaGJYaFhodkZ.....
# cat /etc/origin/master/openshift-master.kubeconfig
apiVersion: v1
clusters:
...
...
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JURKakNDQWc2Z0F3SUJBZ0lKQU5aT.....
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb3mR0MnN1TU5EZ0hmY3hxOGpZ.....
- The /etc/origin/master/master.kubelet-client.crt key pair is the same on all master nodes, so it can be generated on any one master node and copied to the others. Use the same extension file (extension.ext) as in the previous step.
# openssl genrsa -out /etc/origin/master/master.kubelet-client.key 2048
# openssl req -new -key /etc/origin/master/master.kubelet-client.key -subj "/O=system:node-admins/CN=system:openshift-node-admin" -out /etc/origin/master/master.kubelet-client.csr
# openssl x509 -req -in /etc/origin/master/master.kubelet-client.csr -CA /etc/origin/master/ca.crt -CAkey /etc/origin/master/ca.key -CAcreateserial -out /etc/origin/master/master.kubelet-client.crt -days 730 -sha256 -extfile extension.ext
- The /etc/origin/master/master.proxy-client.crt key pair is also the same on all master nodes, so it can be generated on any one master node and copied to the others. Use the same extension file (extension.ext) again.
# openssl genrsa -out /etc/origin/master/master.proxy-client.key 2048
# openssl req -new -key /etc/origin/master/master.proxy-client.key -subj "/CN=system:master-proxy" -out /etc/origin/master/master.proxy-client.csr
# openssl x509 -req -in /etc/origin/master/master.proxy-client.csr -CA /etc/origin/master/ca.crt -CAkey /etc/origin/master/ca.key -CAcreateserial -out /etc/origin/master/master.proxy-client.crt -days 730 -sha256 -extfile extension.ext
- Wait a few minutes until the API comes up. The certificates created above are sufficient to bring the API up. If any other certificates on the master nodes are also expired, create them manually or run the playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-master/redeploy-certificates.yml.
- The playbook will now run properly because the API is recovered.
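For the API server certificate steps, the --hostnames value can be derived from the expired certificate rather than copied by hand. The sketch below is one way to do that, assuming the Subject Alternative Name entries are printed on the single line that follows the header, as in the openssl output shown earlier. The helper name san_to_hostnames is hypothetical. It must be run before master.server.crt is overwritten.

```shell
# Hypothetical helper: read "openssl x509 -text -noout" output on stdin and
# print the SAN entries as a comma-separated list without duplicates.
san_to_hostnames() {
    # Print the line that follows the SAN header.
    sed -n '/X509v3 Subject Alternative Name/{n;p;}' \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' \
              -e 's/^DNS://' \
              -e 's/^IP Address://' \
              -e 's/[[:space:]]*$//' \
        | awk 'NF && !seen[$0]++' \
        | paste -sd, -
}

# Example:
#   openssl x509 -in /etc/origin/master/master.server.crt -text -noout \
#       | san_to_hostnames
```

The de-duplication matters because, in the example certificate, each IP address appears both as a DNS entry and as an IP Address entry, while --hostnames lists it once.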
NOTE: Make sure to run the playbook afterwards, because this guide only recovers the master certificates. The playbook not only redeploys new certificates but also restarts the necessary services, such as the web console and others.
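The kubeconfig step earlier in this procedure replaces the client-certificate-data and client-key-data values by hand. It could be scripted roughly as follows. This is a sketch under assumptions: the kubeconfig contains a single user entry (so every matching line may be replaced), base64 is the GNU coreutils version that supports -w 0, and the helper name embed_client_cert is hypothetical.

```shell
# Hypothetical helper: replace the embedded client certificate and key in a
# kubeconfig with freshly base64-encoded copies. A .bak backup is kept.
# Usage: embed_client_cert KUBECONFIG CERT KEY
embed_client_cert() {
    kc_file="$1"
    kc_crt="$2"
    kc_key="$3"
    crt_b64=$(base64 -w 0 < "$kc_crt") || return 1
    key_b64=$(base64 -w 0 < "$kc_key") || return 1
    # Base64 output never contains '|', '&' or '\', so it is safe to use
    # directly in the sed replacement text.
    sed -i.bak \
        -e "s|^\([[:space:]]*client-certificate-data:\).*|\1 ${crt_b64}|" \
        -e "s|^\([[:space:]]*client-key-data:\).*|\1 ${key_b64}|" \
        "$kc_file"
}

# Example (per master, after generating openshift-master.crt and .key):
#   embed_client_cert /etc/origin/master/openshift-master.kubeconfig \
#       /etc/origin/master/openshift-master.crt \
#       /etc/origin/master/openshift-master.key
```

Keeping the .bak copy makes it easy to compare the old and new files or to roll back if the API still refuses the client certificate.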
Start the control plane manually
Because the API is down, hyperkube does not start the master services automatically. In that case, start the services manually with the docker command.
- check the previously running containers
# docker ps -a | grep master-api
a71000045a3a 51f70394a454 "/bin/bash -c '#!/..." 4 days ago Exited (2) 2 days ago k8s_api_master-api-my-cluster_kube-system_1ab1ce8dbbe107a24e4a04ff31f706fb_0
31156342d174 registry.redhat.io/openshift3/ose-pod:v3.11.420 "/usr/bin/pod" 4 days ago Exited (0) 2 days ago k8s_POD_master-api-my-cluster_kube-system_1ab1ce8dbbe107a24e4a04ff31f706fb_0
- start the containers in order: first the POD container, then the actual API container (in the output above, from bottom to top).
# docker start 31156342d174
# docker start a71000045a3a
- check if the pods are running (2 containers should be running)
# docker ps | grep master-api
On the other masters, restart hyperkube so that it starts the control plane services, because at least one API server is now up.
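Whether the restarted API is actually answering can be checked by polling its health endpoint. The sketch below assumes the API listens on port 8443 on the local master and serves /healthz without authentication; both are assumptions, so adjust the URL to the environment. The helper name wait_for_api is hypothetical.

```shell
# Hypothetical helper: poll an API health endpoint until it answers "ok".
# Usage: wait_for_api [URL] [TRIES] [DELAY_SECONDS]
wait_for_api() {
    api_url="${1:-https://localhost:8443/healthz}"   # assumed default URL
    api_tries="${2:-30}"
    api_delay="${3:-10}"
    api_i=0
    while [ "$api_i" -lt "$api_tries" ]; do
        # -k skips TLS verification; acceptable for a local probe only.
        if [ "$(curl -ks --max-time 5 "$api_url" 2>/dev/null)" = "ok" ]; then
            echo "API is healthy: $api_url"
            return 0
        fi
        api_i=$((api_i + 1))
        if [ "$api_i" -lt "$api_tries" ]; then
            sleep "$api_delay"
        fi
    done
    echo "API did not become healthy after $api_tries attempts" >&2
    return 1
}

# Example: wait up to 5 minutes for the local API
#   wait_for_api https://localhost:8443/healthz 30 10
```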
Root Cause
- The API server certificate and other master node certificates had already expired or were close to their expiry date, which caused the redeploy-certificates.yml playbook to fail.
- The playbook checks whether the API is running against the load balancer URL. It expects at least one master to be up and running correctly.
Diagnostic Steps
- Check the expiry date of the certificates by running the playbooks/openshift-checks/certificate_expiry/easy-mode.yaml playbook, which generates a JSON and an HTML report.
- Manually check the expiry date of all the certificates by following the article "How to list all OpenShift TLS certificate expire date".
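As a quick manual check when the playbook cannot be run, the expiry dates of the certificate files on a master can be listed with openssl. This is a minimal sketch, not the procedure from the linked article. The helper name report_cert_expiry and the 30-day warning window are assumptions, and only the first certificate in each file is inspected, so bundle files are not fully covered.

```shell
# Hypothetical helper: print the notAfter date of every *.crt file in a
# directory and flag files that are expired or expire within N days.
# Usage: report_cert_expiry [DIR] [DAYS]
report_cert_expiry() {
    exp_dir="${1:-/etc/origin/master}"
    exp_days="${2:-30}"
    for exp_crt in "$exp_dir"/*.crt; do
        [ -f "$exp_crt" ] || continue
        exp_end=$(openssl x509 -in "$exp_crt" -noout -enddate 2>/dev/null | cut -d= -f2)
        if [ -z "$exp_end" ]; then
            exp_state="UNREADABLE"
        elif ! openssl x509 -in "$exp_crt" -noout -checkend 0 >/dev/null 2>&1; then
            exp_state="EXPIRED"
        elif ! openssl x509 -in "$exp_crt" -noout \
                -checkend $((exp_days * 86400)) >/dev/null 2>&1; then
            exp_state="EXPIRING"
        else
            exp_state="OK"
        fi
        printf '%-10s %s  %s\n' "$exp_state" "$exp_end" "$exp_crt"
    done
}

# Example:
#   report_cert_expiry /etc/origin/master 30
```

The -checkend option makes openssl exit non-zero when the certificate expires within the given number of seconds, which is what separates the EXPIRED, EXPIRING and OK states here.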