Upgrade OCP to 3.7 breaks service catalog etcd entries
Issue
I upgraded the control plane first, then the nodes, and rebooted all servers. Now I am unable to upgrade the service catalog.
Playbook fails:
TASK [openshift_service_catalog : wait for api server to be ready] ***************************************************************************************************************************
FAILED - RETRYING: wait for api server to be ready (120 retries left).
FAILED - RETRYING: wait for api server to be ready (119 retries left).
FAILED - RETRYING: wait for api server to be ready (118 retries left).
FAILED - RETRYING: wait for api server to be ready (117 retries left).
FAILED - RETRYING: wait for api server to be ready (116 retries left).
...
FAILED - RETRYING: wait for api server to be ready (5 retries left).
FAILED - RETRYING: wait for api server to be ready (4 retries left).
FAILED - RETRYING: wait for api server to be ready (3 retries left).
FAILED - RETRYING: wait for api server to be ready (2 retries left).
FAILED - RETRYING: wait for api server to be ready (1 retries left).
fatal: [ip-10-53-3-151.ec2.internal]: FAILED! => {"attempts": 120, "changed": false, "cmd": ["curl", "-k", "https://apiserver.kube-service-catalog.svc/healthz"], "delta": "0:00:00.063268", "end": "2018-01-12 18:42:31.270849", "rc": 0, "start": "2018-01-12 18:42:31.207581", "stderr": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 180 100 180 0 0 3136 0 --:--:-- --:--:-- --:--:-- 3157", "stderr_lines": [" % Total % Received % Xferd Average Speed Time Time Time Current", " Dload Upload Total Spent Left Speed", "", " 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0", "100 180 100 180 0 0 3136 0 --:--:-- --:--:-- --:--:-- 3157"], "stdout": "[+]ping ok\n[+]poststarthook/generic-apiserver-start-informers ok\n[+]poststarthook/start-service-catalog-apiserver-informers ok\n[-]etcd failed: reason withheld\nhealthz check failed", "stdout_lines": ["[+]ping ok", "[+]poststarthook/generic-apiserver-start-informers ok", "[+]poststarthook/start-service-catalog-apiserver-informers ok", "[-]etcd failed: reason withheld", "healthz check failed"]}
to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/service-catalog.retry
PLAY RECAP ***********************************************************************************************************************************************************************************
ip-10-53-0-226.ec2.internal : ok=28 changed=2 unreachable=0 failed=0
ip-10-53-1-133.ec2.internal : ok=43 changed=2 unreachable=0 failed=0
ip-10-53-1-16.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-1-99.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-3-151.ec2.internal : ok=79 changed=18 unreachable=0 failed=1
ip-10-53-3-178.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-3-240.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-4-127.ec2.internal : ok=43 changed=2 unreachable=0 failed=0
ip-10-53-4-221.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
ip-10-53-4-84.ec2.internal : ok=42 changed=2 unreachable=0 failed=0
localhost : ok=12 changed=0 unreachable=0 failed=0
INSTALLER STATUS *****************************************************************************************************************************************************************************
Initialization : Complete
Service Catalog Install : In Progress
This phase can be restarted by running: playbooks/byo/openshift-cluster/service-catalog.yml
Manually curling the health check URL (https://apiserver.kube-service-catalog.svc/healthz) gives:
[+]ping ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/start-service-catalog-apiserver-informers ok
[-]etcd failed: reason withheld
healthz check failed
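In the playbook's stdout, the healthz response marks passing checks with `[+]` and failing checks with `[-]`. To pick out the failing component quickly, the response body can be split on those markers. The following is only a minimal sketch: the function name `parse_healthz` and the `SAMPLE` constant are hypothetical, and the sample text is copied from the playbook output above rather than fetched live.

```python
def parse_healthz(body):
    """Split a /healthz response body into passing and failing check names.

    Assumes the line format seen in the playbook output:
    "[+]<name> ok" for passing checks and "[-]<name> failed: <reason>" for failing ones.
    """
    passed, failed = [], []
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("[+]"):
            # "[+]ping ok" -> "ping"
            passed.append(line[3:].rsplit(" ok", 1)[0])
        elif line.startswith("[-]"):
            # "[-]etcd failed: reason withheld" -> "etcd"
            failed.append(line[3:].split(" failed", 1)[0])
        # Any other line (for example the "healthz check failed" summary) is ignored.
    return passed, failed


# Sample body taken from the "stdout" field of the failed Ansible task.
SAMPLE = (
    "[+]ping ok\n"
    "[+]poststarthook/generic-apiserver-start-informers ok\n"
    "[+]poststarthook/start-service-catalog-apiserver-informers ok\n"
    "[-]etcd failed: reason withheld\n"
    "healthz check failed"
)

passed, failed = parse_healthz(SAMPLE)
print("passing checks:", passed)
print("failing checks:", failed)
```

For this output the only failing check is `etcd`, which points the investigation at the etcd connection rather than at the service catalog API server itself.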
I verified that the OAB etcd container is up, but found the following error in its logs:
2018-01-12 18:13:38.544464 I | etcdserver/api/v3rpc: Failed to dial 0.0.0.0:2379: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
Environment
- OpenShift Container Platform
  - 3.7.9
  - 3.7.14
