Prometheus serviceaccount missing permissions to monitor services in user-defined namespaces
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4
Issue
- Unable to monitor a service in dummyapp, where dummyapp is a custom (user-defined) namespace.
- Logs from the pod prometheus-k8s-0 show a permissions error when monitoring services in the namespace dummyapp:
  $ oc logs -c prometheus prometheus-k8s-0 -n openshift-monitoring | grep "cannot list resource"
  <...>
  User "system:serviceaccount:openshift-monitoring:prometheus-k8s" cannot list resource "services" in API group "" in the namespace "dummyapp"
  <...>
Note: This could happen on multiple namespaces.
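Because the error can appear for several namespaces at once, it may help to reduce the log to a list of the affected namespaces. The following is a minimal sketch, not part of the original solution: the function name forbidden_namespaces is hypothetical, and it assumes each error is on its own log line and ends with the phrase in the namespace "<name>" as in the sample above.

```shell
# Hypothetical helper: print each namespace named in a
# "cannot list resource" error once, sorted alphabetically.
# Reads Prometheus log lines on standard input.
forbidden_namespaces() {
  grep 'cannot list resource' \
    | tr -d '\\' \
    | sed -n 's/.*in the namespace "\([^"]*\)".*/\1/p' \
    | sort -u
}
```

Possible usage: oc logs -c prometheus prometheus-k8s-0 -n openshift-monitoring | forbidden_namespaces. The tr step strips the backslashes that precede quotes in the raw log format so that both escaped and unescaped lines are handled the same way.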
Resolution
There are 3 situations associated with this problem.
Situation 1:
The label openshift.io/cluster-monitoring: "true" is applied to a user-defined namespace, which causes the prometheus-k8s pods in openshift-monitoring to try to scrape metrics from that namespace.
Per the support considerations, the label openshift.io/cluster-monitoring: "true" should not be applied to user-defined namespaces.
To fix this, remove the label from the user-defined namespace and enable user-workload monitoring as described in the OpenShift monitoring documentation. To remove the label, run the following command:
$ oc label namespace <name-of-namespace> openshift.io/cluster-monitoring-
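To find candidates for Situation 1, one option is to list every namespace that carries the label and drop the platform namespaces from the result. The sketch below is an illustration only: the function name user_defined_namespaces is hypothetical, and it assumes that "user-defined" simply means "does not start with openshift- or kube-", which mirrors the patterns used in this article and may not cover every platform namespace (for example default or openshift).

```shell
# Hypothetical helper: read namespace names on standard input
# (with or without the "namespace/" prefix printed by "-o name")
# and print only those that do NOT start with openshift- or kube-.
user_defined_namespaces() {
  sed 's|^namespace/||' | grep -Ev '^(openshift-|kube-)'
}
```

Possible usage: oc get namespace -l openshift.io/cluster-monitoring=true -o name | user_defined_namespaces. Review the resulting list before removing the label from any namespace.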
Situation 2:
Additional user-defined ServiceMonitors have been created in the openshift-* and kube-* projects.
Per the support considerations, additional user-defined ServiceMonitors should not be created in the openshift-* and kube-* projects.
To fix this, remove the additional user-defined ServiceMonitors from the openshift-* and kube-* projects. For example, for a ServiceMonitor in openshift-monitoring:
$ oc -n openshift-monitoring delete servicemonitor <name-of-servicemonitor>
Situation 3:
The namespace named in the log messages hosts core OpenShift Container Platform components or a Red Hat certified component.
If such a namespace is missing the expected roles or rolebindings, please open a support case with Red Hat. These resources are expected to be present by default, and their absence could be due to a bug in the relevant component.
Workaround
- Make sure a role exists granting the correct permissions in the namespace:
  $ cat role.yaml
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    name: prometheus-k8s
    namespace: <name-of-namespace>
  rules:
  - apiGroups:
    - ""
    resources:
    - services
    - endpoints
    - pods
    verbs:
    - get
    - list
    - watch
- Make sure a rolebinding exists that binds the previously created role to the serviceaccount system:serviceaccount:openshift-monitoring:prometheus-k8s in the namespace:
  $ cat rolebinding.yaml
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    name: prometheus-k8s
    namespace: <name-of-namespace>
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: Role
    name: prometheus-k8s
  subjects:
  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: openshift-monitoring
- Run the following commands and check the output to validate the role and rolebinding:
  $ oc get role,rolebinding -n <namespace> | egrep "NAME|prometheus"
  NAME                                            CREATED AT
  role.rbac.authorization.k8s.io/prometheus-k8s   2023-07-31T20:44:45Z
  NAME                                                   ROLE                  AGE
  rolebinding.rbac.authorization.k8s.io/prometheus-k8s   Role/prometheus-k8s   15m
  $ oc get svc -n dummyapp --as=system:serviceaccount:openshift-monitoring:prometheus-k8s
  NAME    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
  httpd   ClusterIP   172.30.184.120   <none>        8080/TCP,8443/TCP   20m
  nginx   ClusterIP   None             <none>        80/TCP              18h
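When the same role and rolebinding are needed in more than one namespace, the two manifests from the workaround could be generated from a single function instead of being edited by hand. This is a sketch under assumptions: the function name render_prometheus_rbac is hypothetical, and the manifests are the ones shown above with only the namespace substituted.

```shell
# Hypothetical helper: print the Role and RoleBinding from the
# workaround for the namespace given as the first argument.
render_prometheus_rbac() {
  ns="${1:-}"
  if [ -z "$ns" ]; then
    echo "usage: render_prometheus_rbac <namespace>" >&2
    return 1
  fi
  cat <<EOF
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: ${ns}
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: ${ns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: openshift-monitoring
EOF
}
```

Possible usage: render_prometheus_rbac dummyapp | oc apply -f -. Inspect the rendered output first, and apply it only to namespaces where this workaround is appropriate (see the situations under Resolution).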
Root Cause
- The label openshift.io/cluster-monitoring: "true" is applied to a user-defined namespace.
- Additional user-defined ServiceMonitors are created in the openshift-* and kube-* projects.
- The role granting the privileges to monitor the namespace is missing, or the rolebinding that binds this role to the serviceaccount system:serviceaccount:openshift-monitoring:prometheus-k8s is missing.
Diagnostic Steps
- Logs from the pod prometheus-k8s-0 in the namespace openshift-monitoring show the following errors:
  $ oc logs -c prometheus prometheus-k8s-0 -n openshift-monitoring | grep "cannot list resource"
  <...>
  ts=2023-03-05T13:23:02.382Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"prometheus-external-monitoring\""
  ts=2023-03-05T13:23:14.475Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:448: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"kuberhealthy\""
  ts=2023-03-05T13:23:18.367Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"dummyapp\""
  ts=2023-03-05T13:23:24.338Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:448: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"prometheus-external-monitoring\""
  ts=2023-03-05T13:23:37.565Z caller=log.go:168 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:449: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"dummyapp\""
  <...>
  - The Prometheus serviceaccount system:serviceaccount:openshift-monitoring:prometheus-k8s reports that it lacks the privileges needed to query the resources in the namespace dummyapp.
- List the service resources in the namespace dummyapp while impersonating the Prometheus serviceaccount:
  $ oc get svc -n dummyapp --as=system:serviceaccount:openshift-monitoring:prometheus-k8s
  Error from server (Forbidden): services is forbidden: User "system:serviceaccount:openshift-monitoring:prometheus-k8s" cannot list resource "services" in API group "" in the namespace "dummyapp": RBAC: role.rbac.authorization.k8s.io "prometheus-k8s" not found
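The impersonation check above covers only listing services. A broader check could ask the API server about every resource and verb that appears in the role from the workaround. The following is a sketch, not the documented procedure: the function name check_prometheus_access and the variable PROMETHEUS_SA are hypothetical, and it assumes the logged-in user is allowed to impersonate the serviceaccount.

```shell
# Serviceaccount used by the platform Prometheus pods (from this article).
PROMETHEUS_SA="system:serviceaccount:openshift-monitoring:prometheus-k8s"

# Hypothetical helper: for the namespace given as the first argument,
# print whether the serviceaccount may get, list, and watch
# services, endpoints, and pods.
check_prometheus_access() {
  ns="${1:-}"
  if [ -z "$ns" ]; then
    echo "usage: check_prometheus_access <namespace>" >&2
    return 1
  fi
  for resource in services endpoints pods; do
    for verb in get list watch; do
      # "oc auth can-i" prints yes or no; it exits non-zero on "no",
      # so the exit status is deliberately ignored here.
      answer=$(oc auth can-i "$verb" "$resource" -n "$ns" --as="$PROMETHEUS_SA" 2>/dev/null || true)
      printf '%s %s: %s\n' "$verb" "$resource" "${answer:-unknown}"
    done
  done
}
```

Possible usage: check_prometheus_access dummyapp. Any line that does not end in yes points to a permission that the role or rolebinding in that namespace does not grant.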