Upgrading OCP to 3.7 breaks service catalog etcd entries
Issue
The control plane was upgraded first, then the nodes, and all servers were rebooted. The service catalog can no longer be upgraded.
The service catalog playbook fails while waiting for the API server:
TASK [openshift_service_catalog : wait for api server to be ready] ***************************************************************************************************************************
FAILED - RETRYING: wait for api server to be ready (120 retries left).
FAILED - RETRYING: wait for api server to be ready (119 retries left).
FAILED - RETRYING: wait for api server to be ready (118 retries left).
FAILED - RETRYING: wait for api server to be ready (117 retries left).
FAILED - RETRYING: wait for api server to be ready (116 retries left).
...
FAILED - RETRYING: wait for api server to be ready (5 retries left).
FAILED - RETRYING: wait for api server to be ready (4 retries left).
FAILED - RETRYING: wait for api server to be ready (3 retries left).
FAILED - RETRYING: wait for api server to be ready (2 retries left).
FAILED - RETRYING: wait for api server to be ready (1 retries left).
fatal: [ip-10-53-3-151.ec2.internal]: FAILED! => {"attempts": 120, "changed": false, "cmd": ["curl", "-k", "https://apiserver.kube-service-catalog.svc/healthz"], "delta": "0:00:00.063268", "end": "2018-01-12 18:42:31.270849", "rc": 0, "start": "2018-01-12 18:42:31.207581", "stderr": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 180 100 180 0 0 3136 0 --:--:-- --:--:-- --:--:-- 3157", "stderr_lines": [" % Total % Received % Xferd Average Speed Time Time Time Current", " Dload Upload Total Spent Left Speed", "", " 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0", "100 180 100 180 0 0 3136 0 --:--:-- --:--:-- --:--:-- 3157"], "stdout": "[+]ping ok\n[+]poststarthook/generic-apiserver-start-informers ok\n[+]poststarthook/start-service-catalog-apiserver-informers ok\n[-]etcd failed: reason withheld\nhealthz check failed", "stdout_lines": ["[+]ping ok", "[+]poststarthook/generic-apiserver-start-informers ok", "[+]poststarthook/start-service-catalog-apiserver-informers ok", "[-]etcd failed: reason withheld", "healthz check failed"]}
to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/service-catalog.retry
PLAY RECAP ***********************************************************************************************************************************************************************************
ip-10-53-0-226.ec2.internal : ok=28 changed=2 unreachable=0 failed=0
ip-10-53-1-133.ec2.internal : ok=43 changed=2 unreachable=0 failed=0
ip-10-53-1-16.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-1-99.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-3-151.ec2.internal : ok=79 changed=18 unreachable=0 failed=1
ip-10-53-3-178.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-3-240.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-4-127.ec2.internal : ok=43 changed=2 unreachable=0 failed=0
ip-10-53-4-221.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-4-84.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
localhost : ok=12 changed=0 unreachable=0 failed=0
INSTALLER STATUS *****************************************************************************************************************************************************************************
Initialization : Complete
Service Catalog Install : In Progress
This phase can be restarted by running: playbooks/byo/openshift-cluster/service-catalog.yml
- Manually curling the URL (https://apiserver.kube-service-catalog.svc/healthz) gives:
  - [+]ping ok
  - [+]poststarthook/generic-apiserver-start-informers ok
  - [+]poststarthook/start-service-catalog-apiserver-informers ok
  - [-]etcd failed: reason withheld
  - healthz check failed
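The healthz response captured in the playbook output marks each check with `[+]` (passing) or `[-]` (failing). As a minimal sketch for picking out the failing checks from such a response, the Python below parses the body text. The function name `failed_checks` is hypothetical and not part of any OpenShift tooling, and the sample string is copied from the `stdout` field of the failed task above.

```python
from typing import List


def failed_checks(healthz_body: str) -> List[str]:
    """Return the names of checks marked [-] in a healthz response body."""
    failed = []
    for line in healthz_body.splitlines():
        line = line.strip()
        if line.startswith("[-]"):
            # "[-]etcd failed: reason withheld" -> "etcd"
            name = line[3:].split(" failed", 1)[0].strip()
            failed.append(name)
    return failed


# Sample body taken from the playbook's stdout for the failed task.
SAMPLE = (
    "[+]ping ok\n"
    "[+]poststarthook/generic-apiserver-start-informers ok\n"
    "[+]poststarthook/start-service-catalog-apiserver-informers ok\n"
    "[-]etcd failed: reason withheld\n"
    "healthz check failed"
)

if __name__ == "__main__":
    print(failed_checks(SAMPLE))  # prints ['etcd']
```

Here only the etcd check fails, which matches the TLS error seen in the etcd container logs rather than a problem with the service catalog API server process itself.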
The OAB etcd container was verified to be up, but its logs contain the following error:
2018-01-12 18:13:38.544464 I | etcdserver/api/v3rpc: Failed to dial 0.0.0.0:2379: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
Environment
- OpenShift Container Platform
  - 3.7.9
  - 3.7.14