Why is Heketi failing to delete a device, showing a "not in failed state" error?

Solution Verified - Updated -

Issue

  • Running the command heketi-cli device delete <device ID> fails with a generic error:

    Error: server did not provide a message (status 500: Internal Server Error)
    
  • The Heketi pod logs show the message "is not in failed state" for the device ID being removed:

    [heketi] ERROR 2019/01/03 12:33:05 /src/github.com/heketi/heketi/apps/glusterfs/device_entry.go:168: device: 0f24da7e2f8fb8a5fdf6ccdd8c7c7f57 is not in failed state
    
  • Despite this message, the underlying volume group (VG) has already been deleted, so running the command a second time fails while trying to delete the same VG again:

    [cmdexec] DEBUG 2019/01/03 12:33:04 /src/github.com/heketi/heketi/pkg/utils/ssh/ssh.go:174: Host: test1.test.com:22 Command: /bin/bash -c 'vgremove -qq vg_0f24da7e2f8fb8a5fdf6ccdd8c7c7f57'
    Result: [cmdexec] DEBUG 2019/01/03 12:33:05 /src/github.com/heketi/heketi/pkg/utils/ssh/ssh.go:174: Host: test1.test.com:22 Command: /bin/bash -c 'pvremove -qq '/dev/sdb5'' 
    Result: [cmdexec] ERROR 2019/01/03 12:33:05 /src/github.com/heketi/heketi/pkg/utils/ssh/ssh.go:170: Failed to run command [/bin/bash -c 'ls /var/lib/heketi/mounts/vg_0f24da7e2f8fb8a5fdf6ccdd8c7c7f57'] on test1.test.com:22: Err[Process exited with status 2]: Stdout []: Stderr [ls: cannot access /var/lib/heketi/mounts/vg_0f24da7e2f8fb8a5fdf6ccdd8c7c7f57: No such file or directory]
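
The two log symptoms above can be checked for with a short script. This is a minimal sketch, not a Red Hat tool: the function name scan_heketi_log and the labels it prints are hypothetical, and it assumes the Heketi pod log has already been saved to a local file and that the messages match the wording of the excerpts above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Heketi): scan_heketi_log <device-id> <log-file>
# Prints one label per symptom found in the saved log for the given device ID.
scan_heketi_log() {
    device_id=$1
    log_file=$2

    # Symptom 1: Heketi rejects the delete because the device is not marked failed.
    if grep -qF "device: ${device_id} is not in failed state" "$log_file"; then
        echo "not-in-failed-state"
    fi

    # Symptom 2: the mount directory for the device's VG no longer exists,
    # which is what a repeated delete attempt reports in the excerpt above.
    if grep -qF "ls: cannot access /var/lib/heketi/mounts/vg_${device_id}: No such file or directory" "$log_file"; then
        echo "mount-dir-missing"
    fi
}
```

For example, after saving the log with oc logs <heketi pod> > heketi.log (the pod name is a placeholder), calling scan_heketi_log 0f24da7e2f8fb8a5fdf6ccdd8c7c7f57 heketi.log prints whichever of the two labels apply.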
    

Environment

  • OpenShift Container Storage 3.11.1 or earlier
  • Heketi 7.x or earlier
