v3.11 Configuring Clusters 3.2.22. Exposing Router Metrics
I opened a case https://access.redhat.com/support/cases/#/case/02311827 to find out why haproxy router stats were not available in html format.
I believe the v3.11 ose-haproxy-router container has a bug and the documentation is inaccurate.
The documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/3.11/html-single/configuring_clusters/#exposing-the-router-metrics
and
https://docs.openshift.com/container-platform/3.11/install_config/router/default_haproxy_router.html#exposing-the-router-metrics
says to remove the environment variables from the router deploymentconfig:
    - name: ROUTER_LISTEN_ADDR
      value: 0.0.0.0:1936
    - name: ROUTER_METRICS_TYPE
      value: haproxy
A snippet of the default router deploymentconfig is:
spec:
  containers:
  - env:
    - name: ROUTER_LISTEN_ADDR
      value: 0.0.0.0:1936
    - name: ROUTER_METRICS_TYPE
      value: haproxy
    image: registry.redhat.io/openshift3/ose-haproxy-router:v3.11
    livenessProbe:
      failureThreshold: 3
      httpGet:
        host: localhost
        path: /healthz
        port: 1936
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: router
    ports:
    - containerPort: 80
      hostPort: 80
      protocol: TCP
    - containerPort: 443
      hostPort: 443
      protocol: TCP
    - containerPort: 1936
      hostPort: 1936
      name: stats
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: localhost
        path: healthz/ready
        port: 1936
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
I discovered that when you remove the environment variable from the router deploymentconfig:
    - name: ROUTER_METRICS_TYPE
      value: haproxy
This changes the router container to reply with HTML, and the readinessProbe at path: healthz/ready no longer replies with ok. Instead it replies with the full stats HTML page (and access to it requires a username/password). Because the probe never gets an ok reply, the router deployment fails after 10 minutes and rolls back to the previous router dc.
I changed the readinessProbe path to the same as the livenessProbe and used path: /healthz for BOTH livenessProbe & readinessProbe.
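For anyone who would rather use the CLI than edit the dc by hand, the two edits could be sketched roughly like this. It is only a sketch under assumptions: that the router is dc/router in the default project, and that both changes are wanted exactly as described above. The function name apply_router_stats_workaround and the ROUTER_DC / ROUTER_NS variables are made up for illustration.

```shell
#!/bin/sh
# Assumption: the router is dc/router in the "default" project.
ROUTER_DC="${ROUTER_DC:-dc/router}"
ROUTER_NS="${ROUTER_NS:-default}"

# Hypothetical helper that applies the two edits described above.
apply_router_stats_workaround() {
    # Edit 1: remove ROUTER_METRICS_TYPE (a trailing "-" unsets a variable).
    oc -n "$ROUTER_NS" set env "$ROUTER_DC" ROUTER_METRICS_TYPE- || return 1

    # Edit 2: point the readinessProbe at the same path as the livenessProbe.
    oc -n "$ROUTER_NS" set probe "$ROUTER_DC" --readiness \
        --get-url=http://localhost:1936/healthz || return 1
}
```

Calling apply_router_stats_workaround runs both commands in order. Each command changes the dc separately, so if the dc has a config change trigger this may start two rollouts; making both edits in a single oc edit avoids that.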
By making these two edits to the router deploymentconfig I was able to successfully redeploy a new set of routers.
With my 2 changes to the default config, a router node now replies with:
[root@vmlxopencd01 ~]# curl vmlxopencd06.osb.spectrum-health.org:1936/healthz/ready
401 Unauthorized
You need a valid user and password to access this content.
That is not the expected reply from the default readinessProbe path.
[root@vmlxopencd01 ~]# curl vmlxopencd06.osb.spectrum-health.org:1936/healthz
200 OK
Service ready.
That is the expected reply from the livenessProbe path.
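The difference between the two replies matters because an httpGet probe passes only when the HTTP status is at least 200 and below 400, so a 401 counts as a failure. A minimal sketch of that check, assuming the probe sends no credentials (probe_status and probe_status_passes are illustrative helper names, not part of any OpenShift tooling):

```shell
#!/bin/sh
# An httpGet probe succeeds when the HTTP status is >= 200 and < 400.
probe_status_passes() {
    [ "$1" -ge 200 ] && [ "$1" -lt 400 ]
}

# Hypothetical helper: print only the HTTP status code returned by a URL.
probe_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Example use (router-node.example.com is a placeholder host):
#   code="$(probe_status http://router-node.example.com:1936/healthz/ready)"
#   if probe_status_passes "$code"; then echo "ready"; else echo "not ready"; fi
```

By this rule the 200 from /healthz passes and the 401 from /healthz/ready fails, which matches the failed rollout described above.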
If I provide the admin:password to the default readinessProbe path it replies with the full haproxy html stats page:
[root@vmlxopencd01 ~]# curl admin:NmtUmwV7gq@vmlxopencd06.osb.spectrum-health.org:1936/healthz/ready
Statistics Report for HAProxy
I think the container replies incorrectly on the default readinessProbe path when the environment variable ROUTER_METRICS_TYPE is not set to value: haproxy.
-Paul VanAllsburg