Re-running the deploy_cluster.yaml playbook results in a broken state in RHOCP 3

Solution Verified - Updated -

Environment

  • Red Hat OpenShift Container Platform 3.11

Issue

  • After following section 7.4, "Making configuration changes using Ansible", to update the identity provider configuration in master-config.yaml, the inventory was updated with a new key. Re-running the deploy_cluster.yaml playbook then hangs at the following task and eventually fails:
FAILED - RETRYING: Wait for all control plane pods to come up and become ready (1 retries left).
failed: [xxxx.node.name] (item=etcd) => {"ansible_loop_var": "item", "attempts": 72, "changed": false, "item": "etcd", "msg": {"cmd": "/usr/bin/oc get pod master-etcd-xxxx.node.name -o json -n kube-system", "results": [{}], "returncode": 1, "stderr": "error: the server doesn't have a resource type \"pod\"\n", "stdout": ""}}

Resolution

  • Run the following commands to resolve the SELinux issue affecting the /etc/pki/ directory.
$ restorecon -RvF /etc/pki/
$ ausearch -c 'openshift' --raw | audit2allow -M my-openshift
$ semodule -i my-openshift.pp
  • Verify that the SELinux contexts under /etc/pki have been restored to their original state.
$ ls -lZ /etc/pki
drwxr-xr-x. root root system_u:object_r:cert_t:s0      CA
drwxr-xr-x. root root system_u:object_r:cert_t:s0      ca-trust
drwxr-xr-x. root root system_u:object_r:cert_t:s0      consumer
drwxr-xr-x. root root system_u:object_r:cert_t:s0      entitlement
drwxr-xr-x. root root system_u:object_r:cert_t:s0      java
drwxr-xr-x. root root system_u:object_r:cert_t:s0      nssdb
drwxr-xr-x. root root system_u:object_r:cert_t:s0      nss-legacy
drwxr-xr-x. root root system_u:object_r:cert_t:s0      product
drwxr-xr-x. root root system_u:object_r:cert_t:s0      product-default
drwxr-xr-x. root root system_u:object_r:cert_t:s0      rpm-gpg
drwx------. root root system_u:object_r:cert_t:s0      rsyslog
drwxr-xr-x. root root system_u:object_r:cert_t:s0      tls
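
The verification step above can be automated with a small filter. This is a minimal sketch, not part of the original solution: the function name unexpected_contexts is hypothetical, and it assumes that every entry under /etc/pki should carry the cert_t type, as in the listing above. It reads `ls -lZ` output on stdin and prints the name and type of any entry whose SELinux type differs from the expected one.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): reads `ls -lZ` output
# on stdin and prints "<name> <type>" for every entry whose SELinux type is
# not the expected one. The default of cert_t is an assumption based on the
# listing shown in this article.
unexpected_contexts() {
    expected="${1:-cert_t}"
    awk -v expected="$expected" '
        {
            for (i = 1; i <= NF; i++) {
                # An SELinux context has the form user:role:type:level,
                # so the first field with at least four parts is the context.
                if (split($i, ctx, ":") >= 4) {
                    if (ctx[3] != expected) print $NF " " ctx[3]
                    break
                }
            }
        }'
}

# Usage on a node (not executed here):
#   ls -lZ /etc/pki | unexpected_contexts cert_t
```

No output means every listed entry has the expected type. Any line that is printed identifies an entry that may need another restorecon pass.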

Root Cause

  • Re-running deploy_cluster.yaml as described in section 7.4, "Making configuration changes using Ansible", changed the SELinux configuration of /etc/pki, and the cluster became degraded.

  • restorecon will restore the default context of a file or directory by reading the default rules set in the SELinux policy.

  • ausearch is a tool that queries the audit daemon logs for events matching different search criteria. It can also take input from stdin, as long as the input is raw log data.

  • semodule is the tool used to manage SELinux policy modules, including installing, upgrading, listing and removing modules.

Diagnostic Steps

  • Check the audit logs on the node for denials such as the ones shown below.
$ grep -i "denied" /var/log/audit/audit.log

type=AVC msg=audit(1636396676.044:9611093): avc:  denied  { read } for  pid=5193 comm="openshift" name="ca-bundle.crt" dev="dm-0" ino=19849 scontext=system_u:system_r:container_t:s0:c570,c898 tcontext=system_u:object_r:cert_t:s0 tclass=lnk_file permissive=0
type=AVC msg=audit(1636396676.044:9611094): avc:  denied  { read } for  pid=5193 comm="openshift" name="tls-ca-bundle.pem" dev="dm-0" ino=34137237 scontext=system_u:system_r:container_t:s0:c570,c898 tcontext=system_u:object_r:cert_t:s0 tclass=file permissive=0
type=AVC msg=audit(1636396676.045:9611095): avc:  denied  { read } for  pid=5193 comm="openshift" name="certs" dev="dm-0" ino=19848 scontext=system_u:system_r:container_t:s0:c570,c898 tcontext=system_u:object_r:cert_t:s0 tclass=dir permissive=0
  • Check the SELinux contexts on /etc/pki.
$ ls -lZ /etc/pki
drwxr-xr-x. root root system_u:object_r:cert_t:s0      CA
drwxr-xr-x. root root system_u:object_r:cert_t:s0      ca-trust
drwxr-xr-x. root root system_u:object_r:cert_t:s0      consumer
drwxr-xr-x. root root system_u:object_r:cert_t:s0      entitlement
drwxr-xr-x. root root system_u:object_r:cert_t:s0      java
drwxr-xr-x. root root system_u:object_r:cert_t:s0      nssdb
drwxr-xr-x. root root system_u:object_r:cert_t:s0      nss-legacy
drwxr-xr-x. root root system_u:object_r:cert_t:s0      product
drwxr-xr-x. root root system_u:object_r:cert_t:s0      product-default
drwxr-xr-x. root root system_u:object_r:cert_t:s0      rpm-gpg
drwx------. root root system_u:object_r:cert_t:s0      rsyslog
drwxr-xr-x. root root system_u:object_r:cert_t:s0      tls
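
When the audit log contains many denials, it can help to reduce the AVC records to the fields that matter for this issue. The following is a rough sketch under stated assumptions: the function name summarize_denials is hypothetical, and it only handles records that carry comm=, name= and tclass= fields in that order, as the records shown above do. Records without a name= field are silently skipped.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): reads raw audit
# records on stdin and prints one unique "comm name tclass" line per AVC
# denial. Assumes the comm=, name= and tclass= fields appear in that order.
summarize_denials() {
    grep 'avc:  *denied' |
    sed -n 's/.* comm="\([^"]*\)".* name="\([^"]*\)".* tclass=\([^ ]*\).*/\1 \2 \3/p' |
    sort -u
}

# Usage on a node (not executed here):
#   summarize_denials < /var/log/audit/audit.log
```

For the three records shown above, this yields one line each for ca-bundle.crt (lnk_file), tls-ca-bundle.pem (file) and certs (dir), all with comm openshift.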
