Why is Heketi stuck with 8 ongoing operations?
Issue
-
In a converged OpenShift Container Storage deployment, persistent volume provisioning fails with this error message:
Error: Failed to provision volume with StorageClass "glusterfs-storage": failed to create volume: failed to create volume: Server busy. Retry the operation later.
-
Investigating the cause of this error from inside the Heketi pod shows 8 in-flight operations, which is the default Heketi limit:
sh-4.2# heketi-cli server operations list
Id:XXXXXXXXXXXXXXXXXXXXXXXX Type:delete-volume Status:New in-flight
sh-4.2# heketi-cli server operations info
Operation Counts:
Total: 1
In-Flight: 8
New: 1
Failed: 0
Stale: 0
-
Normally, as soon as any of these operations finishes, Heketi starts processing new operations. In this case that does not happen: the 8 operations above have been stuck for several hours.
-
How can dynamic provisioning be made to work again?
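To confirm that a deployment is in the state described above, the In-Flight count reported by `heketi-cli server operations info` can be compared against the limit. The following is a minimal sketch, not part of any Heketi tooling: the helper names `inflight_count` and `at_limit` are hypothetical, the limit of 8 is taken from the default mentioned above, and the parsing assumes the output contains an `In-Flight: <number>` field as shown in the listing.

```shell
#!/bin/sh
# Assumed limit: 8, the default in-flight limit mentioned in this article.
LIMIT=8

# Hypothetical helper: read the text printed by
# `heketi-cli server operations info` on stdin and print the In-Flight count.
inflight_count() {
    grep -oE 'In-Flight:[[:space:]]*[0-9]+' | grep -oE '[0-9]+$' | head -n 1
}

# Hypothetical helper: succeed (exit 0) when the In-Flight count read from
# stdin has reached LIMIT, fail otherwise or when no count is found.
at_limit() {
    count=$(inflight_count)
    [ -n "$count" ] && [ "$count" -ge "$LIMIT" ]
}
```

From inside the Heketi pod, this could be used as `heketi-cli server operations info | at_limit && echo "Heketi is at its in-flight limit"`. A single positive result is not conclusive; the condition described here applies only when the count stays at the limit for an extended period.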
Environment
- OpenShift Container Storage 3.11.3 and earlier