The user-defined Prometheus pods are unable to create logs in a mounted hostPath directory with permissions set to 755


Environment

  • Red Hat OpenShift Container Platform
    • 4.10+
  • Prometheus Operator
    • 2.32.1

Issue

After installing Prometheus by enabling monitoring and using a hostPath directory with permissions set to 755 as a persistent volume, the prometheus-user-workload pods go into CrashLoopBackOff and report the following permission denied message:

$ oc get po -o wide
NAME                                  READY   STATUS             RESTARTS      AGE   IP             NODE                       NOMINATED NODE   READINESS GATES
prometheus-operator-b55fdf657-ljdnq   2/2     Running            0             74s   10.128.0.129   master01.ocp4.danliu.com   <none>           <none>
prometheus-user-workload-0            4/5     CrashLoopBackOff   2 (13s ago)   46s   10.131.0.58    worker02.ocp4.danliu.com   <none>           <none>
prometheus-user-workload-1            4/5     CrashLoopBackOff   2 (17s ago)   46s   10.128.2.161   worker03.ocp4.danliu.com   <none>           <none>
thanos-ruler-user-workload-0          3/3     Running            0             66s   10.128.2.160   worker03.ocp4.danliu.com   <none>           <none>
thanos-ruler-user-workload-1          3/3     Running            0             66s   10.129.3.28    worker01.ocp4.danliu.com   <none>           <none>

$ oc get po prometheus-user-workload-0 -o yaml
        message: "ts=2023-06-08T05:38:49.062Z caller=main.go:532 level=info msg=\"Starting
          opening query log file\" file=/prometheus/queries.active err=\"open /prometheus/queries.active:
          permission denied\"\npanic: Unable to create mmap-ed active query log\n\ngoroutine"
        reason: Error

Resolution

    1. SSH into each node that hosts the hostPath volume and set the owner UID and GID of the hostPath directory prometheus-db to 65534.
    2. Delete any pods that are still in CrashLoopBackOff after the ownership change.

$ ls -lZd /mnt/prometheus-data/prometheus-db/
drwxr-xr-x. 2 root root system_u:object_r:container_file_t:s0 6 Jun  8 06:12 /mnt/prometheus-data/prometheus-db/
$ chown -R 65534:65534 /mnt/prometheus-data/prometheus-db/
$ ls -lZd /mnt/prometheus-data/prometheus-db/
drwxr-xr-x. 2 nfsnobody nfsnobody system_u:object_r:container_file_t:s0 6 Jun  8 06:12 /mnt/prometheus-data/prometheus-db/
$ oc get po -o wide
NAME                                  READY   STATUS             RESTARTS      AGE     IP             NODE                       NOMINATED NODE   READINESS GATES
prometheus-operator-b55fdf657-ljdnq   2/2     Running            0             2m58s   10.128.0.129   master01.ocp4.danliu.com   <none>           <none>
prometheus-user-workload-0            5/5     Running            4 (89s ago)   2m30s   10.131.0.58    worker02.ocp4.danliu.com   <none>           <none>
prometheus-user-workload-1            4/5     CrashLoopBackOff   4 (47s ago)   2m30s   10.128.2.161   worker03.ocp4.danliu.com   <none>           <none>
thanos-ruler-user-workload-0          3/3     Running            0             2m50s   10.128.2.160   worker03.ocp4.danliu.com   <none>           <none>
thanos-ruler-user-workload-1          3/3     Running            0             2m50s   10.129.3.28    worker01.ocp4.danliu.com   <none>           <none>

$ oc delete po prometheus-user-workload-1
$ oc get po
NAME                                  READY   STATUS    RESTARTS       AGE
prometheus-operator-b55fdf657-ljdnq   2/2     Running   0              3m26s
prometheus-user-workload-0            5/5     Running   4 (117s ago)   2m58s
prometheus-user-workload-1            5/5     Running   0              16s
thanos-ruler-user-workload-0          3/3     Running   0              3m18s
thanos-ruler-user-workload-1          3/3     Running   0              3m18s
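
When more than one pod is affected, the pods that are still failing can be picked out of the `oc get po` listing instead of being read off by eye. The following is a minimal sketch, not part of the original solution: the function name `crashlooping_pods` is hypothetical, and it assumes the default column layout shown above, where STATUS is the third column.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get po` output on stdin and print the
# names of pods whose STATUS column is CrashLoopBackOff.
# Assumes the default layout: NAME READY STATUS RESTARTS AGE ...
crashlooping_pods() {
    # NR > 1 skips the header line; $3 is the STATUS column.
    awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}
```

A possible use is `oc get po | crashlooping_pods`, after which each printed name can be passed to `oc delete po` as in the step above.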

Root Cause

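  • The prometheus-db directory that is created in the hostPath for the prometheus-user-workload pod is owned by root:root with mode 755, which grants write access to the owner only. The directory must be owned by UID and GID 65534 so that Prometheus can create files such as /prometheus/queries.active in it.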
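
The ownership can be checked on the node before and after running chown. The sketch below is an illustration only: `check_owner` is a hypothetical helper name, it defaults to the 65534:65534 ownership used in this solution, and it assumes GNU coreutils `stat` (the `-c` format option).

```shell
#!/bin/sh
# Hypothetical helper: report whether DIR is owned by the expected UID:GID.
# Usage: check_owner DIR [UID:GID]   (default expectation: 65534:65534)
# Assumes GNU coreutils stat, which supports -c with %u (UID) and %g (GID).
check_owner() {
    dir=$1
    want=${2:-65534:65534}
    # Read the numeric owner and group of the directory; fail if stat fails.
    have=$(stat -c '%u:%g' "$dir") || return 2
    if [ "$have" = "$want" ]; then
        echo "ok: $dir is owned by $have"
    else
        echo "mismatch: $dir is owned by $have, expected $want"
        return 1
    fi
}
```

For example, `check_owner /mnt/prometheus-data/prometheus-db/` would be expected to report a mismatch while the directory is still owned by root, and to report ok once the chown from the Resolution section has been applied.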

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
