Why is Heketi failing to delete a device, showing a "not in failed state" error?
Issue
- When running the command heketi-cli device delete <device ID>, it fails with an unspecified error:
  Error: server did not provide a message (status 500: Internal Server Error)
- Reviewing the logs of the Heketi pod shows the message "not in failed state" for the device ID being removed:
  [heketi] ERROR 2019/01/03 12:33:05 /src/github.com/heketi/heketi/apps/glusterfs/device_entry.go:168: device: 0f24da7e2f8fb8a5fdf6ccdd8c7c7f57 is not in failed state
- Although this message is printed, the underlying volume group (VG) has already been deleted, so running the command a second time fails while trying to delete this VG again:
  [cmdexec] DEBUG 2019/01/03 12:33:04 /src/github.com/heketi/heketi/pkg/utils/ssh/ssh.go:174: Host: test1.test.com:22 Command: /bin/bash -c 'vgremove -qq vg_0f24da7e2f8fb8a5fdf6ccdd8c7c7f57' Result:
  [cmdexec] DEBUG 2019/01/03 12:33:05 /src/github.com/heketi/heketi/pkg/utils/ssh/ssh.go:174: Host: test1.test.com:22 Command: /bin/bash -c 'pvremove -qq '/dev/sdb5'' Result:
  [cmdexec] ERROR 2019/01/03 12:33:05 /src/github.com/heketi/heketi/pkg/utils/ssh/ssh.go:170: Failed to run command [/bin/bash -c 'ls /var/lib/heketi/mounts/vg_0f24da7e2f8fb8a5fdf6ccdd8c7c7f57'] on test1.test.com:22: Err[Process exited with status 2]: Stdout []: Stderr [ls: cannot access /var/lib/heketi/mounts/vg_0f24da7e2f8fb8a5fdf6ccdd8c7c7f57: No such file or directory]
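To confirm that a saved Heketi log contains this issue, the device IDs reported as "not in failed state" can be pulled out of the log text. The following is a minimal Python sketch, not part of Heketi: the function name devices_not_in_failed_state is hypothetical, and the pattern assumes the message format shown in the log line above.

```python
import re

# Assumed message format, taken from the Heketi log line quoted above.
# The device ID is matched as a lowercase hexadecimal string.
NOT_FAILED = re.compile(r"device: ([0-9a-f]+) is not in failed state")


def devices_not_in_failed_state(log_text):
    """Return device IDs (first-seen order, no duplicates) that Heketi
    reported as 'not in failed state' in the given log text."""
    seen = []
    for match in NOT_FAILED.finditer(log_text):
        device_id = match.group(1)
        if device_id not in seen:
            seen.append(device_id)
    return seen


# Sample input: the log line from this article.
sample = (
    "[heketi] ERROR 2019/01/03 12:33:05 "
    "/src/github.com/heketi/heketi/apps/glusterfs/device_entry.go:168: "
    "device: 0f24da7e2f8fb8a5fdf6ccdd8c7c7f57 is not in failed state"
)

print(devices_not_in_failed_state(sample))
```

The log text would typically come from the Heketi pod (for example, saved from oc logs on that pod, which is an assumption about how the environment is accessed). Each returned ID corresponds to a volume group named vg_<device ID>, as seen in the vgremove command in the log above.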
Environment
- OpenShift Container Storage 3.11.1 or below
- Heketi 7.x or below