ACM Thanos Rule Pod crashes with "remove /var/thanos/rule/lock: device or resource busy" error
Issue
- After recreating the MultiClusterObservability object, the StatefulSet observability-thanos-rule is no longer starting and one Pod is stuck in CrashLoopBackOff:

    oc get pods observability-thanos-rule-0 -n open-cluster-management-observability
    NAME                          READY   STATUS             RESTARTS        AGE
    observability-thanos-rule-0   1/2     CrashLoopBackOff   6 (2m12s ago)   8m11s

- The Pod fails due to a "remove /var/thanos/rule/lock: device or resource busy" error:

    oc logs observability-thanos-rule-0 -p
    Defaulted container "thanos-rule" out of: thanos-rule, configmap-reloader
    level=warn ts=2023-08-04T08:07:31.397789422Z caller=dir_locker.go:77 component=tsdb msg="A lockfile from a previous execution already existed. It was replaced" file=/var/thanos/rule/lock
    [..]
    level=error ts=2023-08-04T08:07:33.048425493Z caller=main.go:135 err="remove /var/thanos/rule/lock: device or resource busy\nremove storage lock files\nmain.runRule\n\t/remote-source/thanos/app/cmd/thanos/rule.go:405\nmain.registerRule.func1\n\t/remote-source/thanos/app/cmd/thanos/rule.go:217\nmain.main\n\t/remote-source/thanos/app/cmd/thanos/main.go:133\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1594\npreparing rule command failed\nmain.main\n\t/remote-source/thanos/app/cmd/thanos/main.go:135\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1594"
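To confirm that a crashing Pod is hitting this specific failure rather than another startup error, the previous container log can be searched for the error string. The following is a minimal sketch, not part of the original article: `has_lock_error` is a hypothetical helper name, and the signature string is copied from the error message quoted in the Issue section.

```shell
#!/bin/sh
# Error signature copied from the log excerpt in the Issue section.
LOCK_ERR='remove /var/thanos/rule/lock: device or resource busy'

# has_lock_error (hypothetical helper): reads log text on stdin and
# exits 0 if the lock-removal error is present, non-zero otherwise.
has_lock_error() {
  grep -qF -- "$LOCK_ERR"
}

# Example usage against a live cluster (requires oc and cluster access):
#   oc logs observability-thanos-rule-0 -c thanos-rule -p \
#     -n open-cluster-management-observability | has_lock_error \
#     && echo "lock error found"
```

The match uses `grep -F` so the path and colon are treated as a fixed string rather than a regular expression.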
Environment
- Red Hat Advanced Cluster Management for Kubernetes 2.8