Istio sidecar termination issue when running Jobs on OpenShift 4
Environment
- Red Hat OpenShift Container Platform (OCP) 4
- Red Hat OpenShift Service on AWS (ROSA)
- Red Hat OpenShift Service Mesh (OSSM) 2
Issue
The istio-proxy sidecar does not terminate when the main container of a Job finishes.
The sidecar keeps running after the main container exits, so the pod never exits and the Job never completes.
Resolution
The container that does the work in the Job must send a termination request to the istio-proxy sidecar container so that the Job can complete.
This can be achieved by running the command curl -sf -XPOST 127.0.0.1:15000/quitquitquit as the last step in that container.
The following Job finishes correctly. Note that the container image that does the work must include the curl utility:
apiVersion: batch/v1
kind: Job
metadata:
  name: does-not-hang
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true"
    spec:
      containers:
      - name: hello
        image: registry.access.redhat.com/rhel7/rhel-tools:7.9-22
        imagePullPolicy: IfNotPresent
        command:
        - /bin/sh
        - -c
        - date; echo Hello from the Kubernetes cluster; sleep 15; curl -sf -XPOST 127.0.0.1:15000/quitquitquit
      restartPolicy: Never
Because the example above is very simple and its commands complete very quickly, the sleep 15 command has been added to give the containers enough time to start, so that the curl command is not sent before the istio-proxy container is ready.
The Job is created, and its pod is reported as Completed once both containers have stopped, instead of remaining NotReady:
# oc create -f does-not-hang.yaml
job.batch/does-not-hang created
# oc get pods
NAME                  READY   STATUS      RESTARTS   AGE
does-not-hang-vbsfz   0/2     Completed   0          36s
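The fixed sleep 15 in the Job above could be replaced by polling the sidecar until it reports ready. The following is only a minimal sketch: wait_for_sidecar is a hypothetical helper name, and the readiness URL http://127.0.0.1:15021/healthz/ready is an assumption based on the health port used by upstream Istio sidecars, so it should be verified against the Service Mesh release in use.

```shell
#!/bin/sh
# Hypothetical helper: poll the sidecar readiness endpoint instead of
# sleeping for a fixed time.
# ASSUMPTION: the default URL below matches your Service Mesh version.
# Arguments: $1 = readiness URL (optional), $2 = maximum attempts (optional).
wait_for_sidecar() {
    url="${1:-http://127.0.0.1:15021/healthz/ready}"
    max_tries="${2:-30}"
    tries=0
    # curl -f returns non-zero on HTTP errors and on connection failures.
    until curl -sf -o /dev/null "$url"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            echo "sidecar not ready after $max_tries attempts" >&2
            return 1
        fi
        sleep 1
    done
    return 0
}
```

With such a helper in the container's script, the workload would start only after wait_for_sidecar returns successfully, and the curl -sf -XPOST 127.0.0.1:15000/quitquitquit call would still be the last step.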
Root Cause
When the work of the Job is completed, the istio-proxy container keeps running, so the pod never terminates and the Job is never marked as Completed.
Once the curl -sf -XPOST 127.0.0.1:15000/quitquitquit command is sent to the istio-proxy container, the sidecar exits, all containers within the pod are stopped, and the Job is considered Completed.
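When commands are chained with ; as in the example Job, the container exits with the status of the last command, which is the curl call, so a failed workload may still be reported as successful. One possible way to keep the workload's own exit status while still stopping the sidecar is sketched below. run_then_quit and the QUIT_URL variable are hypothetical names introduced for illustration only; the default URL is the one used in this article.

```shell
#!/bin/sh
# Hypothetical wrapper: run the workload, then always ask the sidecar to
# exit, and return the workload's exit status rather than curl's.
# QUIT_URL is an illustrative override; the default is the endpoint used
# in this article.
run_then_quit() {
    "$@"
    status=$?
    # Best effort: warn, but do not mask the workload status, if the
    # request to the sidecar fails.
    curl -sf -XPOST "${QUIT_URL:-http://127.0.0.1:15000/quitquitquit}" >/dev/null \
        || echo "warning: could not reach the istio-proxy admin endpoint" >&2
    return "$status"
}
```

For example, run_then_quit /bin/sh -c 'date; echo Hello from the Kubernetes cluster' would run the same commands as the Job above and then stop the sidecar, whether or not the commands succeed.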
Diagnostic Steps
- Create a Job like the following:
# cat > hangs.yaml << EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: hangs
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true"
    spec:
      containers:
      - name: hello
        image: registry.access.redhat.com/rhel7/rhel-tools:7.9-22
        imagePullPolicy: IfNotPresent
        command:
        - /bin/sh
        - -c
        - date; echo Hello from the Kubernetes cluster
      restartPolicy: Never
EOF
# oc create -f hangs.yaml
job.batch/hangs created
- Once the work of the container is completed, the istio-proxy container is still running:
# oc get pods
NAME          READY   STATUS     RESTARTS   AGE
hangs-krwwb   1/2     NotReady   0          21s
# oc rsh -c hello hangs-krwwb
error: unable to upgrade connection: container not found ("hello")
# oc rsh -c istio-proxy hangs-krwwb
sh-4.4$
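As a possible alternative to opening a remote shell in each container, the per-container state can be read from the pod status. This is a sketch only: show_container_states is a hypothetical helper name, and it relies on the jsonpath output format of oc get.

```shell
#!/bin/sh
# Hypothetical helper: print each container's name and ready flag for a pod.
# Argument: $1 = pod name, for example hangs-krwwb from the output above.
show_container_states() {
    oc get pod "$1" \
        -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.ready}{"\n"}{end}'
}
```

Running show_container_states hangs-krwwb against the hanging pod would be expected to list the hello container as not ready and the istio-proxy container as ready.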
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.