Prometheus serviceaccount missing permissions to monitor services in custom namespaces

Environment

  • Red Hat OpenShift Container Platform 4.x

Issue

  • Prometheus is unable to monitor services in the custom namespace dummyapp.
  • Logs from the pod prometheus-k8s-0 show a permissions error when monitoring services in the namespace dummyapp:

    <...> 
    User "system:serviceaccount:openshift-monitoring:prometheus-k8s" cannot list resource "services" in API group "" in the namespace "dummyapp"
    <...>
    

    Note: This can happen in multiple namespaces at once.

Resolution

  • Make sure a role granting get, list, and watch on services, endpoints, and pods exists in the namespace dummyapp:

    $ cat role.yaml
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: prometheus-k8s
      namespace: dummyapp
    rules:
    - apiGroups:
      - ""
      resources:
      - services
      - endpoints
      - pods
      verbs:
      - get
      - list
      - watch
    
  • Make sure a rolebinding exists in the namespace dummyapp that binds the role prometheus-k8s to the serviceaccount system:serviceaccount:openshift-monitoring:prometheus-k8s:

    $ cat rolebinding.yaml
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: prometheus-k8s
      namespace: dummyapp
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: prometheus-k8s
    subjects:
    - kind: ServiceAccount
      name: prometheus-k8s
      namespace: openshift-monitoring
    
  • Run the following commands and check the output to confirm that the role and rolebinding exist and that the serviceaccount can list services:

    $ oc get role,rolebinding -n dummyapp | grep -E "NAME|prometheus"
    NAME                                            CREATED AT
    role.rbac.authorization.k8s.io/prometheus-k8s   2023-07-31T20:44:45Z
    NAME                                                          ROLE                               AGE
    rolebinding.rbac.authorization.k8s.io/prometheus-k8s          Role/prometheus-k8s                15m
    
    $ oc get svc -n dummyapp --as=system:serviceaccount:openshift-monitoring:prometheus-k8s
    NAME    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
    httpd   ClusterIP   172.30.184.120   <none>        8080/TCP,8443/TCP   20m
    nginx   ClusterIP   None             <none>        80/TCP              18h
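
Because the same fix may be needed in several namespaces, the two manifests above can be generated per namespace instead of edited by hand. The following is a minimal sketch under that assumption; render_prometheus_rbac is a hypothetical helper name introduced here, not an oc subcommand, and it only prints the Role and RoleBinding shown in this article with the namespace substituted.

```shell
# Hypothetical helper: print the Role and RoleBinding from this article
# for the namespace given as the first argument.
render_prometheus_rbac() {
  ns="${1:?usage: render_prometheus_rbac NAMESPACE}"
  cat <<EOF
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: ${ns}
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: ${ns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: openshift-monitoring
EOF
}

# Example (needs a logged-in oc session allowed to create RBAC objects):
#   render_prometheus_rbac dummyapp | oc apply -f -
```

Printing to standard output keeps the helper free of side effects, so the manifests can be reviewed before they are piped to oc apply.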
    

Root Cause

  • The role granting the Prometheus serviceaccount permission to monitor the namespace dummyapp is missing.
  • The rolebinding that binds that role to the serviceaccount system:serviceaccount:openshift-monitoring:prometheus-k8s is missing.

Diagnostic Steps

  • Logs from the pod prometheus-k8s-0 in the namespace openshift-monitoring show the following errors:

    $ oc logs prometheus-k8s-0 -n openshift-monitoring
    <...>
    ts=2023-03-05T13:23:02.382Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"prometheus-external-monitoring\""
    ts=2023-03-05T13:23:14.475Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:448: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"kuberhealthy\""
    ts=2023-03-05T13:23:18.367Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"dummyapp\""
    ts=2023-03-05T13:23:24.338Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:448: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"prometheus-external-monitoring\""
    ts=2023-03-05T13:23:37.565Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:449: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"dummyapp\""
    <...>
    
  • These errors show that the Prometheus serviceaccount system:serviceaccount:openshift-monitoring:prometheus-k8s lacks the privileges required to query resources in the namespace dummyapp.
  • List the service resources in the namespace dummyapp while impersonating the Prometheus serviceaccount:

    $ oc get svc -n dummyapp --as=system:serviceaccount:openshift-monitoring:prometheus-k8s
    Error from server (Forbidden): services is forbidden: User "system:serviceaccount:openshift-monitoring:prometheus-k8s" cannot list resource "services" in API group "" in the namespace "dummyapp": RBAC: role.rbac.authorization.k8s.io "prometheus-k8s" not found
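
Since the log excerpt above names more than one namespace, it can help to reduce the errors to a list of affected namespaces. The following is a minimal sketch, assuming the "is forbidden ... in the namespace" message format shown in this article; forbidden_namespaces is a hypothetical helper name introduced here, and it reads log text on standard input.

```shell
# Hypothetical helper: read log lines on stdin and print each namespace
# named in an "is forbidden ... in the namespace" error, once, sorted.
# The optional backslash handles the escaped quotes (\") used inside the
# Prometheus msg="..." field as well as the plain quotes printed by oc.
forbidden_namespaces() {
  sed -n 's/.*is forbidden.*in the namespace \\\{0,1\}"\([^"\\]*\).*/\1/p' | sort -u
}

# Example (needs a logged-in oc session):
#   oc logs prometheus-k8s-0 -n openshift-monitoring | forbidden_namespaces
```

Each namespace printed is a candidate for the role and rolebinding described in the Resolution section.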
    
