In RHOCP 3.11 pods are in "ContainerCreating" state

Solution Verified

Environment

  • Red Hat OpenShift Container Platform
    • 3.11

Issue

  • Following an RHOCP upgrade, most of the application pods in the samplenamespace project were stuck in the ContainerCreating state.

Resolution

  • Most fields of a StatefulSet cannot be edited in place. Hence, create a new StatefulSet that references the desired secret and deploy it. Once it is successfully deployed and the new pods are created, remove the older pods. Use the steps below for reference:
$ oc get sts mypod -o yaml > mypodbackup.yaml
$ vim mypodbackup.yaml
-------------8<-------------
spec:
  template:
    spec:
      volumes:
        - name: mysql
          secret:
            secretName: mysql
------------->8-------------
  • Using the above YAML file, create the new StatefulSet:
$ oc create -f mypodbackup.yaml
  • Check the status of the new pods created by the new StatefulSet using the following command:
$ oc get pods 
NAME      READY     STATUS              RESTARTS   AGE
mypod     0/1       ContainerCreating   0          28m
mypod-1   1/1       Running             0          28m
  • Delete the old application pod using the following command:
$ oc delete pods mypod
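For reference, the recreated StatefulSet carries the corrected volume definition under spec.template.spec. A minimal sketch is shown below; the names mypod and mysql follow the snippet above and are placeholders for the actual application and secret names, and the image and mount path are taken from the pod description in the Diagnostic Steps:

```yaml
# Hypothetical sketch of the recreated StatefulSet; only the fields
# relevant to the secret volume are shown.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mypod
spec:
  serviceName: mypod
  replicas: 1
  selector:
    matchLabels:
      application: mypod
  template:
    metadata:
      labels:
        application: mypod
    spec:
      containers:
        - name: mypod
          image: registry.redhat.io/jboss-datagrid-7/datagrid73-openshift
          volumeMounts:
            - name: mysql
              mountPath: /var/run/secrets/openshift.io/serviceaccount
      volumes:
        - name: mysql
          secret:
            secretName: mysql   # must match an existing secret in the namespace
```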

Root Cause

  • This is a configuration issue in the StatefulSet used by the application pod: it referenced a nonexistent secret named service-certs, while the correct secret was the Opaque secret my-pod.

Diagnostic Steps

  • Check the pod status using the following command:
$ oc get pods 
NAME    READY     STATUS              RESTARTS   AGE
mypod   0/1       ContainerCreating   0          28m
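When a namespace runs many pods, the stuck ones can be filtered out of the `oc get pods` output with a small awk filter. A sketch is shown below; the sample output is inlined for illustration, and in practice the real command output would be piped through the same filter:

```shell
# List only pods whose STATUS column is ContainerCreating.
# In practice: oc get pods | awk '$3 == "ContainerCreating" {print $1}'
printf '%s\n' \
  'NAME      READY     STATUS              RESTARTS   AGE' \
  'mypod     0/1       ContainerCreating   0          28m' \
  'mypod-1   1/1       Running             0          28m' \
  | awk '$3 == "ContainerCreating" {print $1}'
# prints: mypod
```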
  • Check the pod description using the following command:
$ oc describe pod mypod
Name:               mypod
Namespace:          samplenamespace
Priority:           0
PriorityClassName:  <none>
Node:               workernode1/10.129.x.x
Start Time:         Tue, 15 Sep 2020 05:58:35 -0400
Labels:             application=mypod
                    controller-revision-hash=mypod-67f58bf7f
                    deploymentConfig=mypod
                    statefulset.kubernetes.io/pod-name=mypod
Annotations:        kubernetes.io/limit-ranger=LimitRanger plugin set: cpu limit for container mypod
                    openshift.io/scc=restricted
Status:             Pending
IP:
Controlled By:      StatefulSet/mypod
Containers:
  mypod:
    Container ID:
    Image:          registry.redhat.io/jboss-datagrid-7/datagrid73-openshift
    Image ID:
    Ports:          8443/TCP, 8888/TCP, 11222/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  512Mi
    Requests:
      cpu:      500m
      memory:   512Mi
    Liveness:   exec [/opt/datagrid/bin/livenessProbe.sh] delay=15s timeout=10s period=20s #success=1 #failure=5
    Readiness:  exec [/opt/datagrid/bin/readinessProbe.sh] delay=17s timeout=10s period=10s #success=1 #failure=5
    Environment:
      SERVICE_NAME:                     mypod
      SERVICE_PROFILE:                  datagrid-service
      JGROUPS_PING_PROTOCOL:            openshift.DNS_PING
      OPENSHIFT_DNS_PING_SERVICE_NAME: mypod-ping
      USERNAME:                         <set to the key 'application-user' in secret 'mypod'>      Optional: false
      PASSWORD:                         <set to the key 'application-password' in secret 'mypod'>  Optional: false
    Mounts:
      /opt/datagrid/standalone/data from srv-data (rw)
      /var/run/secrets/java.io/keystores from keystore-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-gn7f8 (ro)
      /var/run/secrets/openshift.io/serviceaccount from service-certs (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  srv-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  myclaimname
    ReadOnly:   false
  keystore-volume:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  service-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  service-certs
    Optional:    false
  default-token-gn7f8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-gn7f8
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  environment=test
                 nodepurpose=app
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
Events:
  Type     Reason       Age                From                   Message
  ----     ------       ----               ----                   -------
  Normal   Scheduled    28m                default-scheduler      Successfully assigned samplenamespace/mypod to workernode1
  Warning  FailedMount  5m (x10 over 26m)  kubelet, workernode1  Unable to mount volumes for pod "mypod_samplenamespace(04e949e0-f73a-11ea-ae69-0050568c0403)": timeout expired waiting for volumes to attach or mount for pod "samplenamespace"/"mypod". list of unmounted volumes=[service-certs]. list of unattached volumes=[srv-data keystore-volume service-certs default-token-gn7f8]
  Warning  FailedMount  1m (x21 over 28m)  kubelet, workernode1  MountVolume.SetUp failed for volume "service-certs" : secrets "service-certs" not found
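The FailedMount events already name the secret that cannot be found. The missing secret name can be pulled out of the event text with a simple filter; a sketch follows, with the event line inlined for illustration (in practice, pipe `oc describe pod mypod` through the same sed expression):

```shell
# Extract the missing secret name from a FailedMount event message.
event='MountVolume.SetUp failed for volume "service-certs" : secrets "service-certs" not found'
echo "$event" | sed -n 's/.*secrets "\([^"]*\)" not found.*/\1/p'
# prints: service-certs
```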
  • Check if the secret named service-certs is available in the application namespace using the following command:
$ oc get secrets -n samplenamespace 
4c553ca-cc37-11ea-9355-0a58ac16051d        Opaque                                7         54d
builder-dockercfg-9gz6x                    kubernetes.io/dockercfg               1         192d
builder-token-4lcqb                        kubernetes.io/service-account-token   4         192d
cache-ausa-cams                            Opaque                                2         53d
datagrid-service-bn8gs-credentials-oye5w   Opaque                                0         54d
default-dockercfg-pfjw4                    kubernetes.io/dockercfg               1         192d
default-token-ng4tx                        kubernetes.io/service-account-token   4         192d
deployer-dockercfg-85jdp                   kubernetes.io/dockercfg               1         192d
deployer-token-bhjpr                       kubernetes.io/service-account-token   4         192d
jenkins-dockercfg-mwqbk                    kubernetes.io/dockercfg               1         25d
jenkins-ssh-keys                           Opaque                                1         25d
jenkins-token-4k2zh                        kubernetes.io/service-account-token   4         25d
jenkins-user-passwords                     Opaque                                1         25d
kibana-dockercfg-xjbg8                     kubernetes.io/dockercfg               1         79d
kibana-token-qs4g6                         kubernetes.io/service-account-token   4         79d
my-pod                                     Opaque                                2         4m
  • The secret named service-certs is not present.
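The same conclusion can be reached programmatically by checking whether the mounted secret name appears in the namespace's secret list. A sketch follows, with the names from the output above inlined; in practice, the two values would come from the oc commands shown in the comments (the jsonpath expression assumes the secret is mounted as a pod volume, as in the pod description above):

```shell
# Check whether the secret mounted by the pod exists in the namespace.
# In practice:
#   mounted=$(oc get sts mypod -o jsonpath='{.spec.template.spec.volumes[*].secret.secretName}')
#   present=$(oc get secrets -o custom-columns=:metadata.name --no-headers)
mounted='service-certs'
present='my-pod
jenkins-ssh-keys
default-token-ng4tx'
if printf '%s\n' "$present" | grep -qx "$mounted"; then
  echo "secret $mounted exists"
else
  echo "secret $mounted not found"
fi
# prints: secret service-certs not found
```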

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
