How can I diagnose fence_vmware_soap failures in RHEL 5, 6, 7, 8 or 9?


Environment

  • Red Hat Enterprise Linux 5, 6, 7, 8 or 9 with the High Availability Add On
  • fence_vmware_soap from a version of cman or fence-agents that includes it

Issue

  • fence_vmware_soap fails to fence a node, but I only get "error from agent" in the logs. How can I determine why it's failing?

Resolution

The resolution and root cause of a fence_vmware_soap failure depend on the specific problem. Other solutions cover resolutions to particular problems with this agent; this solution covers diagnosing general failures in order to identify which of those solutions applies. See the Diagnostic Steps below.

As a general measure:

  • Ensure the username and password are correct. If the failure in question is happening when calling fence_vmware_soap from the command line, check if there are any special characters that can be interpreted by the shell being used, such as bash. If there are, make sure to escape these characters with '\' when calling the agent on the command line, but do not escape them in the configuration.

  • If any character in the username or password is a special XML character, then replace that character with the encoded version. For example, if the password were 'Password&vCenter', use 'Password&amp;vCenter'. See Wikipedia list of XML and HTML entities.
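
The two points above can be sketched in shell. This is only an illustration under stated assumptions: `xml_escape` is a hypothetical helper name (it is not shipped with fence-agents), and the sample password is made up. Single quotes keep bash from interpreting special characters on the command line, and the helper shows the entity replacements for the five predefined XML special characters.

```shell
#!/bin/bash
# Hypothetical helper (not part of fence-agents): replace the five
# XML special characters with their predefined entities.
xml_escape() {
    # '&' is replaced first so the entities added afterwards are not re-escaped.
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' \
                           -e 's/</\&lt;/g' \
                           -e 's/>/\&gt;/g' \
                           -e 's/"/\&quot;/g' \
                           -e "s/'/\&apos;/g"
}

# Made-up sample password containing characters special to bash and to XML.
# Single quotes stop bash from expanding '$' or treating '&' as a control operator.
password='Pass$word&vCenter'

# Print the XML-encoded form of the sample password.
xml_escape "$password"; echo
```

Quoting the whole value in single quotes is an alternative to escaping each character with '\'; it does not work for a password that itself contains a single quote.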

Diagnostic Steps

  • Make sure that the VMware environment meets the requirements of fence_vmware_soap.
  • ping the vCenter hostname or IP being passed to the fence_vmware_soap agent from each node and make sure they get responses.
  • The diagnostic commands below are prefixed with the time command to show how long each connection takes. This can be an important piece of information when investigating monitor timeouts.
  • List the virtual machines that the vCenter host is managing to verify that the fence_vmware_soap agent is able to communicate with the vCenter host. If this fails, then there may be a problem with the parameters passed (for example: hostname, username, password) or with the availability or connectivity of the vCenter host.
# time fence_vmware_soap -o list -a vcenter.example.com -l cluster-admin -p <password> -z
  • If fence_vmware_soap is able to list all the managed virtual machines, then try getting the status of one of them. There are two ways to do this: by virtual machine name or by UUID (which is case sensitive). If either command gives an error, search the Red Hat Customer Portal for that error to identify a specific resolution to that problem.
# By Virtual Machine Name
# time fence_vmware_soap -o status -a vcenter.example.com -l cluster-admin -p <password> -z -n <vm name>
# By Virtual Machine UUID
# time fence_vmware_soap -o status -a vcenter.example.com -l cluster-admin -p <password> -z -n <UUID>
  • If the fence_vmware_soap fencing agent is able to get the correct status of a virtual machine, then try rebooting that virtual machine with fence_vmware_soap to see if that works.
# time fence_vmware_soap -o reboot -a vcenter.example.com -l cluster-admin -p <password> -z -n <vm name>
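
The read-only steps above can be chained in a small script that stops at the first failure. This is a minimal sketch, not an official tool: the function names `run_step` and `diagnose` are hypothetical, and the only fence_vmware_soap options used are the ones shown in the commands above.

```shell
#!/bin/bash
# Sketch only: run the read-only diagnostic steps in order.
# The function names are hypothetical, not part of fence-agents.

# Run one step, then print its exit status and how long it took.
run_step() {
    local label=$1; shift
    echo "== ${label}"
    local start=$SECONDS
    "$@"
    local rc=$?
    echo "== ${label}: exit status ${rc} after $((SECONDS - start))s"
    return "$rc"
}

# Arguments: vCenter host, login, password, VM name or UUID.
diagnose() {
    local host=$1 login=$2 password=$3 plug=$4
    # Stop early if the vCenter host is unreachable or the listing fails.
    run_step "ping vCenter" ping -c 3 "$host" || return 1
    run_step "list virtual machines" \
        fence_vmware_soap -o list -a "$host" -l "$login" -p "$password" -z || return 1
    # Last step: its exit status becomes the exit status of diagnose.
    run_step "status of ${plug}" \
        fence_vmware_soap -o status -a "$host" -l "$login" -p "$password" -z -n "$plug"
}
```

After sourcing the script on a cluster node, it could be called as `diagnose vcenter.example.com cluster-admin '<password>' '<vm name>'`. The reboot step is deliberately left out because it power-cycles the virtual machine; run it by hand once the status check succeeds.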
