RHEV 3.x host keeps changing status from up to down, or SPM switches over to another host repeatedly while logs show "An invalid XML character"

Solution Unverified - Updated -

Environment

  • Red Hat Enterprise Virtualization 3.0 - 3.2
  • Guest agent running in at least some VMs
    • Observed so far with Windows VMs, but theoretically possible with any VM running a guest agent.

Issue

  • A RHEV host keeps cycling between "Up" and "Down" status
    OR
  • The SPM switches over to another host repeatedly

  • The engine logs "SAXParseException: An invalid XML character":

2013-03-12 13:25:38,100 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand] (QuartzScheduler_Worker-37) [2b9396f] XML RPC error in command GetAllVmStatsVDS ( HostName = rhevh.example.com ), the error was: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException, SAXParseException: An invalid XML character (Unicode: 0xffff) was found in the element content of the document. 
2013-03-12 13:25:38,176 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (QuartzScheduler_Worker-37) [2b9396f] ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = 571fb483-d19b-4364-8009-d59636167cd6 : rhevh.example.com, VDS Network Error, continuing. VDSNetworkException: 

Resolution

  • This issue is fixed by RHSA-2013:1155-2
  • Upgrade to vdsm-4.10.2-24.0 or later.

Root Cause

  • The guest agent passes data from the guest OS to vdsm verbatim
    • This data may contain non-ASCII strings, e.g. internationalized application or user names
    • Under certain circumstances (e.g. a Windows application writes data to the registry using the wrong code page) it may also contain garbled or incorrectly encoded data
  • vdsm did not correctly sanitize the (untrusted) information provided by the guest agent
    • This could lead to vdsm using characters not permitted in XML in its XML-RPC communication with RHEV-M
    • RHEV-M treats the invalid XML-RPC messages from vdsm as a communication / network error, which leads to the host status changes and SPM switchovers described under Issue
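
To illustrate the kind of sanitization that was missing, here is a minimal sketch of filtering guest-supplied strings before they are placed in an XML-RPC response. It is not the actual vdsm fix shipped in the erratum; the function name `sanitize_for_xml` and the choice of U+FFFD as the replacement character are assumptions made for illustration, and the sketch targets Python 3 rather than the Python 2.6 found on these hosts.

```python
import re

# XML 1.0 only permits these code points in document content:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated character class below matches everything else, including
# U+FFFF (the character reported in the engine log) and NUL.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def sanitize_for_xml(text, replacement="\uFFFD"):
    """Return text with every XML-invalid character replaced.

    Hypothetical helper: the name and the default replacement
    (U+FFFD REPLACEMENT CHARACTER) are illustrative choices only.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)
```

Legitimate non-ASCII data such as internationalized user names passes through unchanged; only code points outside the XML 1.0 character ranges are replaced.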

Diagnostic Steps

  • Check engine.log for "invalid XML character" messages:
2013-03-12 13:25:38,100 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand] (QuartzScheduler_Worker-37) [2b9396f] XML RPC error in command GetAllVmStatsVDS ( HostName = rhevh.example.com ), the error was: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException, SAXParseException: An invalid XML character (Unicode: 0xffff) was found in the element content of the document. 
2013-03-12 13:25:38,176 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (QuartzScheduler_Worker-37) [2b9396f] ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = 571fb483-d19b-4364-8009-d59636167cd6 : rhevh.example.com, VDS Network Error, continuing. VDSNetworkException: 
  • Check server.log for messages similar to:
2013-03-12 13:33:46,392 ERROR [stderr] (pool-3-thread-50) [Fatal Error] :1492:22: An invalid XML character (Unicode: 0xffff) was found in the element content of the document.
2013-03-12 13:33:58,417 ERROR [stderr] (pool-3-thread-49) [Fatal Error] :1492:22: An invalid XML character (Unicode: 0xffff) was found in the element content of the document.
2013-03-12 13:34:13,778 ERROR [stderr] (pool-3-thread-34) [Fatal Error] :1492:22: An invalid XML character (Unicode: 0xffff) was found in the element content of the document.
  • If this error is occurring, the following command fails as shown when run on a host that is in the bad state:
[root@rhevh ~]# vdsClient -s 0 list table
Traceback (most recent call last):
  File "/usr/share/vdsm/vdsClient.py", line 2379, in <module>
    code, message = commands[command][0](commandArgs)
  File "/usr/share/vdsm/vdsClient.py", line 273, in do_list
    response = self.s.getAllVmStats()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1387, in _parse_response
    p.feed(response)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 601, in feed
    self._parser.Parse(data, 0)
ExpatError: not well-formed (invalid token): line 30531, column 21
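
The engine.log check above can be scripted when several hosts are involved. The following is a rough sketch that counts "invalid XML character" errors per host, command and code point. The regular expression is derived only from the sample log lines in this article, and the function name `count_invalid_xml_errors` is hypothetical; other engine versions may format the message differently.

```python
import re
from collections import Counter

# Pattern based solely on the sample engine.log line shown in this article.
_PATTERN = re.compile(
    r"XML RPC error in command (?P<command>\w+) "
    r"\( HostName = (?P<host>[^ )]+) \)"
    r".*An invalid XML character \(Unicode: (?P<codepoint>0x[0-9a-fA-F]+)\)"
)


def count_invalid_xml_errors(lines):
    """Count matching errors keyed by (host, command, code point)."""
    counts = Counter()
    for line in lines:
        match = _PATTERN.search(line)
        if match:
            key = (
                match.group("host"),
                match.group("command"),
                match.group("codepoint"),
            )
            counts[key] += 1
    return counts
```

To use it, pass an open file object for engine.log (assumed here to be /var/log/ovirt-engine/engine.log on the RHEV-M machine) and print the resulting counter. A host that appears repeatedly is a candidate for the `vdsClient` check shown above.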
