OSE 3.4: docker-registry deployment fails - unable to mount volumes for pod


I've got a brand new OSE 3.4.1.10 install in AWS - the default docker registry won't deploy, the pod is never created, and my OSE is unusable. I have a single-node OSE 3.4 that runs fine, but this 3-node, single-master cluster does not.

Anyone have any ideas or solutions?
Also - I wanted to install OSE 3.3, and I even have that as the only OpenShift repo enabled, yet 3.4.1.10 was pulled down anyway. As an aside, if you know how to force 3.3, I'm all ears while we fight with 3.4.
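For reference, here is what I would check and how I would try to pin the channel. This is only a sketch: the repo IDs below are what I believe the OSE 3.3/3.4 channels are called, so adjust them for your subscription.

# confirm which OSE channels yum can actually see
subscription-manager repos --list-enabled | grep -i ose

# swap the enabled channel to 3.3 before re-running the installer
subscription-manager repos --disable=rhel-7-server-ose-3.4-rpms \
                           --enable=rhel-7-server-ose-3.3-rpms

# verify that only 3.3.x packages are now available
yum --showduplicates list atomic-openshift | tail

I believe the Ansible installer also honors an openshift_release variable in the inventory (e.g. openshift_release=v3.3), but I haven't confirmed that on this cluster.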
Thanks!

Brand new OpenShift 3.4 install on RHEL 7 in AWS: 3 hosts, 1 master and 2 non-master nodes. The installation completed with no problems or errors, HTPasswd auth is set up, and my users have been granted cluster-admin access. After logging in to the OSE web interface, I can see that the "Docker Registry" deployment has failed (a few times). 'Monitoring >> View Details' shows:

Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "docker-registry-2-7snl7"/"default". list of unattached/unmounted volumes=[registry-storage]

Then one of these:

Unable to mount volumes for pod "docker-registry-2-7snl7_default(19fcf083-164c-11e7-b28c-02eccfcd045f)": timeout expired waiting for volumes to attach/mount for pod "docker-registry-2-7snl7"/"default". list of unattached/unmounted volumes=[registry-storage]

Then two of these:

MountVolume.SetUp failed for volume "kubernetes.io/nfs/19fcd831-164c-11e7-b28c-02eccfcd045f-registry-volume" (spec.Name: "registry-volume") pod "19fcd831-164c-11e7-b28c-02eccfcd045f" (UID: "19fcd831-164c-11e7-b28c-02eccfcd045f") with: exit status 32
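Exit status 32 is the generic failure code from mount(8), so the node itself appears to be failing the NFS mount. Here is a sketch of how to reproduce it by hand; the server and export path below are placeholders - the real values come from the PV named in the event above:

# find where the registry PV actually points
oc get pv registry-volume -o yaml | grep -A 3 nfs

# then, on the node the pod was scheduled to:
rpm -q nfs-utils                              # mount.nfs lives in this package
showmount -e <nfs-server>                     # is the export visible at all?
mkdir -p /tmp/registry-test
mount -t nfs <nfs-server>:<export-path> /tmp/registry-test
echo $?                                       # 32 here reproduces the pod error
umount /tmp/registry-test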

What's really annoying is that I stood up a single-node OSE 3.4 in AWS and it's working fine.

If someone wants to help debug this, I will give you access to the OpenShift web console.

If you need logs, please tell me where to find them; I'm not an OpenShift, Docker, or Kubernetes expert.
Version

[root@ip-172-31-13-117 master]# oc version
oc v3.4.1.10
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-13-117.us-east-2.compute.internal:8443
openshift v3.4.1.10
kubernetes v1.4.0+776c994
Steps To Reproduce

1. Stand up a brand new OpenShift 3.4 install on RHEL 7 in AWS: 3 hosts, 1 master and 2 non-master nodes.
2. Go through the installation; it reports no problems or errors.
3. Set up HTPasswd auth.
4. Finish the install and grant cluster-admin access to my users.
5. Log in to the OSE web interface.
6. See that the "Docker Registry" deployment has failed (a few times).
7. Check 'Monitoring >> View Details' and observe the mount-timeout events quoted in the description above (the same events can be pulled from the CLI, as shown below).
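To pull the same events from the CLI (pod name taken from the events above; I believe these are standard oc get/describe options):

# all recent events in the default project, oldest first
oc get events -n default --sort-by=.metadata.creationTimestamp

# full volume and event detail for the stuck registry pod
oc describe pod docker-registry-2-7snl7 -n default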
Current Result

The docker-registry deployment fails and the docker-registry pod never spins up.
OpenShift can't be used.
Expected Result

Everything in the default project deploys, the pod starts up with no problems or errors, and OpenShift can be used.
Additional Information


[root@ip-172-31-13-117 master]# oc adm diagnostics
[Note] Determining if client configuration exists for client/cluster diagnostics
Info: Successfully read a client config file at '/root/.kube/config'
Info: Using context for cluster-admin access: 'default/ip-172-31-13-117-us-east-2-compute-internal:8443/shepp'
[Note] Performing systemd discovery

[Note] Running diagnostic: ConfigContexts[default/ec2-52-15-109-239-us-east-2-compute-amazonaws-com:8443/system:admin]
Description: Validate client config context is complete and has connectivity

Info: For client config context 'default/ec2-52-15-109-239-us-east-2-compute-amazonaws-com:8443/system:admin':
The server URL is 'https://ec2-52-15-109-239.us-east-2.compute.amazonaws.com:8443'
The user authentication is 'system:admin/ip-172-31-13-117-us-east-2-compute-internal:8443'
The current project is 'default'
Successfully requested project list; has access to project(s):
[default kube-system logging management-infra openshift openshift-infra]

[Note] Running diagnostic: ConfigContexts[default/ip-172-31-13-117-us-east-2-compute-internal:8443/shepp]
Description: Validate client config context is complete and has connectivity

Info: The current client config context is 'default/ip-172-31-13-117-us-east-2-compute-internal:8443/shepp':
The server URL is 'https://ip-172-31-13-117.us-east-2.compute.internal:8443'
The user authentication is 'shepp/ip-172-31-13-117-us-east-2-compute-internal:8443'
The current project is 'default'
Successfully requested project list; has access to project(s):
[management-infra openshift openshift-infra default kube-system logging]

[Note] Running diagnostic: ConfigContexts[default/ip-172-31-13-117-us-east-2-compute-internal:8443/system:admin]
Description: Validate client config context is complete and has connectivity

Info: For client config context 'default/ip-172-31-13-117-us-east-2-compute-internal:8443/system:admin':
The server URL is 'https://ip-172-31-13-117.us-east-2.compute.internal:8443'
The user authentication is 'system:admin/ip-172-31-13-117-us-east-2-compute-internal:8443'
The current project is 'default'
Successfully requested project list; has access to project(s):
[default kube-system logging management-infra openshift openshift-infra]

[Note] Running diagnostic: DiagnosticPod
Description: Create a pod to run diagnostics from the application standpoint
(the diagnostic pod is failing to spin up)...
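Since the mount happens on the node, the node-side logs are presumably the next place to look. A sketch, assuming the standard RPM-install unit name for the OSE 3.4 node service:

# on the infra node the pod landed on (ip-172-31-8-164 here):
journalctl -u atomic-openshift-node --since "1 hour ago" | grep -i mount

# the raw mount.nfs errors usually also land in syslog
grep mount.nfs /var/log/messages | tail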

[root@ip-172-31-13-117 master]# oc get -o json pod docker-registry-2-deploy
{
    "kind": "Pod",
    "apiVersion": "v1",
    "metadata": {
        "name": "docker-registry-2-deploy",
        "namespace": "default",
        "selfLink": "/api/v1/namespaces/default/pods/docker-registry-2-deploy",
        "uid": "18b29f09-164c-11e7-b28c-02eccfcd045f",
        "resourceVersion": "1557",
        "creationTimestamp": "2017-03-31T19:56:14Z",
        "labels": {
            "openshift.io/deployer-pod-for.name": "docker-registry-2"
        },
        "annotations": {
            "openshift.io/deployment.name": "docker-registry-2",
            "openshift.io/scc": "restricted"
        }
    },
    "spec": {
        "volumes": [
            {
                "name": "deployer-token-j4z7u",
                "secret": {
                    "secretName": "deployer-token-j4z7u",
                    "defaultMode": 420
                }
            }
        ],
        "containers": [
            {
                "name": "deployment",
                "image": "openshift3/ose-deployer:v3.4.1.10",
                "env": [
                    {
                        "name": "KUBERNETES_MASTER",
                        "value": "https://ip-172-31-13-117.us-east-2.compute.internal:8443"
                    },
                    {
                        "name": "OPENSHIFT_MASTER",
                        "value": "https://ip-172-31-13-117.us-east-2.compute.internal:8443"
                    },
                    {
                        "name": "BEARER_TOKEN_FILE",
                        "value": "/var/run/secrets/kubernetes.io/serviceaccount/token"
                    },
                    {
                        "name": "OPENSHIFT_CA_DATA",
                        "value": "-----BEGIN CERTIFICATE-----\nMIIC6jCCAdKgAwIBAgIBATANBgkqhkiG9w0BAQsFADAmMSQwIgYDVQQDDBtvcGVu\nc2hpZnQtc2lnbmVyQDE0OTA5ODk3NDUwHhcNMTcwMzMxMTk0OTA0WhcNMjIwMzMw\nMTk0OTA1WjAmMSQwIgYDVQQDDBtvcGVuc2hpZnQtc2lnbmVyQDE0OTA5ODk3NDUw\nggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDlsdifxxb4I0o73CgJn/nR\nV+80Ukn7UIi039fsn7bo8yaHjwGPTrQjZ6UUb2w2EB/4/yAXmOkYbTutU89yQT02\nrNwvJdVci1H6lIro1xukw/UUNCvseZfkVA6RyMqjqBlQXh72R/m1gw7MBHXUlman\nr+23qyzKZmK0Du+N0UFwtoRxDlKb+UynqWa/aRa5aGjybiOPoXem5bBq6vi74ruc\nnjAysjHYNEE+qJU4c15mpM7iJg5tXLbYVR6714nbt9W8Jp4Z03CEsMnn8duLGIVa\ndrCbc1yTQV4GYM/7gtJEKO6tmFXjF4CMg7ZxDg/kTSxbAQuFWX3YEndJY3aZlOa/\nAgMBAAGjIzAhMA4GA1UdDwEB/wQEAwICpDAPBgNVHRMBAf8EBTADAQH/MA0GCSqG\nSIb3DQEBCwUAA4IBAQBI45VbH5jtHL6ORnKYxfdJ7BospgV9981L7d5XMamRIyok\nU645GT2giB7Yn7qNw8cIMBelOdggIx8atd8c512GBmv9KT5dgBg+w9wfsHdzFA62\n5/548KQ8iI6LsPweLCoMWypUyU/T9IxOLIfE9/6UH0Gl0tVvHLA6SjzdlYxLz8sA\nh6YRzWHFMsT3wgAI8tdoa6RZi4QGClnSEWR/8Vm4i+WQFRxVR2Uasfl3F7kfvSYG\nQfkAVGsTf8yrAoBovlDfjlVrLGaEYgjtziCHAAQzisvypBVXHh9NBd5Huyn6Q+Mr\nstSYDj2lPVg4+xAJUaohzdiyTIZCCx4Hyg59ZXjs\n-----END CERTIFICATE-----\n"
                    },
                    {
                        "name": "OPENSHIFT_DEPLOYMENT_NAME",
                        "value": "docker-registry-2"
                    },
                    {
                        "name": "OPENSHIFT_DEPLOYMENT_NAMESPACE",
                        "value": "default"
                    }
                ],
                "resources": {},
                "volumeMounts": [
                    {
                        "name": "deployer-token-j4z7u",
                        "readOnly": true,
                        "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
                    }
                ],
                "terminationMessagePath": "/dev/termination-log",
                "imagePullPolicy": "IfNotPresent",
                "securityContext": {
                    "capabilities": {
                        "drop": [
                            "KILL",
                            "MKNOD",
                            "SETGID",
                            "SETUID",
                            "SYS_CHROOT"
                        ]
                    },
                    "privileged": false,
                    "seLinuxOptions": {
                        "level": "s0:c1,c0"
                    },
                    "runAsUser": 1000000000
                }
            }
        ],
        "restartPolicy": "Never",
        "terminationGracePeriodSeconds": 10,
        "activeDeadlineSeconds": 21600,
        "dnsPolicy": "ClusterFirst",
        "nodeSelector": {
            "region": "infra"
        },
        "serviceAccountName": "deployer",
        "serviceAccount": "deployer",
        "nodeName": "ip-172-31-8-164.us-east-2.compute.internal",
        "securityContext": {
            "seLinuxOptions": {
                "level": "s0:c1,c0"
            },
            "fsGroup": 1000000000
        },
        "imagePullSecrets": [
            {
                "name": "deployer-dockercfg-2km6x"
            }
        ]
    },
    "status": {
        "phase": "Failed",
        "conditions": [
            {
                "type": "Initialized",
                "status": "True",
                "lastProbeTime": null,
                "lastTransitionTime": "2017-03-31T19:56:14Z"
            },
            {
                "type": "Ready",
                "status": "False",
                "lastProbeTime": null,
                "lastTransitionTime": "2017-03-31T20:06:17Z",
                "reason": "ContainersNotReady",
                "message": "containers with unready status: [deployment]"
            },
            {
                "type": "PodScheduled",
                "status": "True",
                "lastProbeTime": null,
                "lastTransitionTime": "2017-03-31T19:56:14Z"
            }
        ],
        "hostIP": "172.31.8.164",
        "podIP": "10.1.2.4",
        "startTime": "2017-03-31T19:56:14Z",
        "containerStatuses": [
            {
                "name": "deployment",
                "state": {
                    "terminated": {
                        "exitCode": 1,
                        "reason": "Error",
                        "startedAt": "2017-03-31T19:56:16Z",
                        "finishedAt": "2017-03-31T20:06:16Z",
                        "containerID": "docker://c62c81d058d0a9051f4754731123246d5587b18a0b2e511c6f2f6eb99fe9609d"
                    }
                },
                "lastState": {},
                "ready": false,
                "restartCount": 0,
                "image": "openshift3/ose-deployer:v3.4.1.10",
                "imageID": "docker-pullable://registry.access.redhat.com/openshift3/ose-deployer@sha256:5488cb52b4fa8cc8620c74c0b3e62ef6e5f07ce335e2cea3952d0837e21fd70f",
                "containerID": "docker://c62c81d058d0a9051f4754731123246d5587b18a0b2e511c6f2f6eb99fe9609d"
            }
        ]
    }
}
[root@ip-172-31-13-117 master]#
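Since the deployer container above exited 1 after the 10-minute volume timeout, these commands should show what the registry is asking for and what the deployer logged (a sketch; I believe oc volume is the 3.4-era form of the command):

# which volume/claim is the registry DC actually requesting?
oc volume dc/docker-registry -n default
oc get pv,pvc -n default

# the deployer pod's own log
oc logs pod/docker-registry-2-deploy -n default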

Responses