In RHOCP 3.11 pods are in "ContainerCreating" state
Environment
- Red Hat OpenShift Container Platform 3.11
Issue
- Following an RHOCP upgrade, most of the application pods in the project samplenamespace were in ContainerCreating state.
Resolution
- The StatefulSet in this state can't simply be edited, so create a new StatefulSet that references the desired secret and deploy it. Once it is successfully deployed and the new pods are created, remove the older pods. Use the steps below for reference:
$ oc get sts mypod -o yaml > mypodbackup.yaml
$ vim mypodbackup.yaml
-------------8<-------------
spec:
  template:
    spec:
      volumes:
      - name: mysql
        secret:
          secretName: mysql
------------->8-------------
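If only the secret reference needs to change, the edit above can also be scripted with sed instead of opening the file in vim. A minimal sketch, where the here-document stands in for the manifest saved by oc get sts, and the secret names (service-certs as the wrong reference, mysql as the desired one) are illustrative:

```shell
# Sketch: swap the incorrect secretName for the correct one in the saved
# StatefulSet manifest before re-creating it. The here-document below
# stands in for the file produced by 'oc get sts mypod -o yaml'.
cat > mypodbackup.yaml <<'EOF'
spec:
  template:
    spec:
      volumes:
      - name: mysql
        secret:
          secretName: service-certs
EOF

# Replace the incorrect secret reference in place (original kept as .bak).
sed -i.bak 's/secretName: service-certs/secretName: mysql/' mypodbackup.yaml

# Confirm the manifest now points at the desired secret.
grep 'secretName:' mypodbackup.yaml
```

The final grep should show only the corrected reference; the untouched copy remains in mypodbackup.yaml.bak in case the change needs to be reverted.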
- Using the above yaml file, create the new statefulset as follows:
$ oc create -f mypodbackup.yaml
- Check the status of the new pods created by the new statefulset using the below command:
$ oc get pods
NAME      READY   STATUS              RESTARTS   AGE
mypod     0/1     ContainerCreating   0          28m
mypod-1   1/1     Running             0          28m
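When many pods are affected, the stuck ones can be filtered out of the same listing. A small sketch, with the sample output hard-coded in a file in place of a live oc get pods call:

```shell
# Sketch: filter pods stuck in ContainerCreating out of 'oc get pods'
# output. The listing is hard-coded here in place of a live query.
cat > pods.txt <<'EOF'
NAME      READY   STATUS              RESTARTS   AGE
mypod     0/1     ContainerCreating   0          28m
mypod-1   1/1     Running             0          28m
EOF

# Skip the header row and print the name of every stuck pod.
awk 'NR > 1 && $3 == "ContainerCreating" { print $1 }' pods.txt
```

Against a live cluster the same awk filter can be piped directly from oc get pods.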
- Delete the old application pods using the below command:
$ oc delete pods mypod
Root Cause
- This is a configuration issue in the statefulset used by the application pod. It was referring to an incorrect secret called service-certs. The correct secret was the my-pod Opaque secret.
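To spot such a mismatch, the secret references in the StatefulSet manifest can be listed and compared against the secrets that actually exist in the namespace. A sketch, with the manifest content and the file name sts.yaml hard-coded as illustrations in place of oc get sts mypod -o yaml:

```shell
# Sketch: list every secretName a StatefulSet manifest mounts, so the
# names can be compared with 'oc get secrets'. The manifest content
# below is illustrative.
cat > sts.yaml <<'EOF'
spec:
  template:
    spec:
      volumes:
      - name: service-certs
        secret:
          secretName: service-certs
EOF

# Print only the referenced secret names.
awk '/secretName:/ { print $2 }' sts.yaml
```

Any name printed here that does not appear in the oc get secrets output for the namespace is a mount that will fail.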
Diagnostic Steps
- Check the pod status using the below command:
$ oc get pods
NAME    READY   STATUS              RESTARTS   AGE
mypod   0/1     ContainerCreating   0          28m
- Check the pod description using the below command:
$ oc describe pod mypod
Name:               mypod
Namespace:          samplenamespace
Priority:           0
PriorityClassName:  <none>
Node:               workernode1/10.129.x.x
Start Time:         Tue, 15 Sep 2020 05:58:35 -0400
Labels:             application=mypod
                    controller-revision-hash=mypod-67f58bf7f
                    deploymentConfig=mypod
                    statefulset.kubernetes.io/pod-name=mypod
Annotations:        kubernetes.io/limit-ranger=LimitRanger plugin set: cpu limit for container mypod
                    openshift.io/scc=restricted
Status:             Pending
IP:
Controlled By:      StatefulSet/mypod
Containers:
  mypod:
    Container ID:
    Image:          registry.redhat.io/jboss-datagrid-7/datagrid73-openshift
    Image ID:
    Ports:          8443/TCP, 8888/TCP, 11222/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  512Mi
    Requests:
      cpu:     500m
      memory:  512Mi
    Liveness:   exec [/opt/datagrid/bin/livenessProbe.sh] delay=15s timeout=10s period=20s #success=1 #failure=5
    Readiness:  exec [/opt/datagrid/bin/readinessProbe.sh] delay=17s timeout=10s period=10s #success=1 #failure=5
    Environment:
      SERVICE_NAME:                     mypod
      SERVICE_PROFILE:                  datagrid-service
      JGROUPS_PING_PROTOCOL:            openshift.DNS_PING
      OPENSHIFT_DNS_PING_SERVICE_NAME:  mypod-ping
      USERNAME:                         <set to the key 'application-user' in secret 'mypod'>      Optional: false
      PASSWORD:                         <set to the key 'application-password' in secret 'mypod'>  Optional: false
    Mounts:
      /opt/datagrid/standalone/data from srv-data (rw)
      /var/run/secrets/java.io/keystores from keystore-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-gn7f8 (ro)
      /var/run/secrets/openshift.io/serviceaccount from service-certs (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  srv-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  myclaimname
    ReadOnly:   false
  keystore-volume:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  service-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  service-certs
    Optional:    false
  default-token-gn7f8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-gn7f8
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  environment=test
                 nodepurpose=app
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
Events:
  Type     Reason       Age                From                  Message
  ----     ------       ----               ----                  -------
  Normal   Scheduled    28m                default-scheduler     Successfully assigned samplenamespace/mypod to workernode1
  Warning  FailedMount  5m (x10 over 26m)  kubelet, workernode1  Unable to mount volumes for pod "mypod_samplenamespace(04e949e0-f73a-11ea-ae69-0050568c0403)": timeout expired waiting for volumes to attach or mount for pod "samplenamespace"/"mypod". list of unmounted volumes=[service-certs]. list of unattached volumes=[srv-data keystore-volume service-certs default-token-gn7f8]
  Warning  FailedMount  1m (x21 over 28m)  kubelet, workernode1  MountVolume.SetUp failed for volume "service-certs" : secrets "service-certs" not found
- Check if the secret named service-certs is available in the application namespace using the below command:
$ oc get secrets -n samplenamespace
NAME                                       TYPE                                  DATA   AGE
4c553ca-cc37-11ea-9355-0a58ac16051d        Opaque                                7      54d
builder-dockercfg-9gz6x                    kubernetes.io/dockercfg               1      192d
builder-token-4lcqb                        kubernetes.io/service-account-token   4      192d
cache-ausa-cams                            Opaque                                2      53d
datagrid-service-bn8gs-credentials-oye5w   Opaque                                0      54d
default-dockercfg-pfjw4                    kubernetes.io/dockercfg               1      192d
default-token-ng4tx                        kubernetes.io/service-account-token   4      192d
deployer-dockercfg-85jdp                   kubernetes.io/dockercfg               1      192d
deployer-token-bhjpr                       kubernetes.io/service-account-token   4      192d
jenkins-dockercfg-mwqbk                    kubernetes.io/dockercfg               1      25d
jenkins-ssh-keys                           Opaque                                1      25d
jenkins-token-4k2zh                        kubernetes.io/service-account-token   4      25d
jenkins-user-passwords                     Opaque                                1      25d
kibana-dockercfg-xjbg8                     kubernetes.io/dockercfg               1      79d
kibana-token-qs4g6                         kubernetes.io/service-account-token   4      79d
my-pod                                     Opaque                                2      4m
- The secret named service-certs is not present.
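This presence check can also be scripted against a saved listing. A minimal sketch, with an abbreviated secrets listing hard-coded in a file in place of a live oc get secrets -n samplenamespace call:

```shell
# Sketch: report whether a given secret name appears in an
# 'oc get secrets' listing. The listing is hard-coded for illustration.
cat > secrets.txt <<'EOF'
my-pod             Opaque   2   4m
jenkins-ssh-keys   Opaque   1   25d
EOF

# Compare the first column (secret names) against the wanted name.
if awk '{ print $1 }' secrets.txt | grep -qx 'service-certs'; then
  echo 'service-certs present'
else
  echo 'service-certs missing'
fi
```

A "missing" result here matches the FailedMount event in the pod description and confirms the root cause.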