Engine is not communicating with hosts
Hi everyone, in my setup there are 4 hosts and 3 storage domains: 2 are FC-connected and one is NFS for ISO images. My vdsm certificates have expired.
In the engine, the status of 3 hosts is "connecting", and the status of the fourth host is "non responsive".
All the storage domains are down, and the datacenter is down.
vdsm service output: "vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: sslv3 alert certificate expired"
libvirtd service output: "virNetSASLSessionServerStep:586 : authentication failed: Failed to start SASL negotiation: -5 (SASL(-5): bad protocol / cancel: StoredKey mismatch)"
ovirt-ha-agent service output: "ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
return action(he)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
return he.start_monitoring()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 410, in start_monitoring
self._initialize_broker()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 525, in _initialize_broker
m.get('options', {}))
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 139, in start_monitor
.format(type, options, e))
RequestError: Failed to start monitor ping, options {'addr': '10.x.x.x'}: Connection timed out"
ovirt-ha-broker service output: "ovirt-ha-broker ovirt_hosted_engine_ha.broker.submonitor_base.SubmonitorBase ERROR Error executing submonitor mgmt-bridge, args {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitor_base.py", line 115, in _worker
self.action(self._options)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors/mgmt_bridge.py", line 42, in action
logger=self._log
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 187, in connect_vdsm_json_rpc
requestQueue=requestQueue,
File "/usr/lib/python2.7/site-packages/vdsm/jsonrpcvdscli.py", line 248, in connect
responseQueue)
File "/usr/lib/python2.7/site-packages/vdsm/jsonrpcvdscli.py", line 234, in _create
lazy_start=False)
File "/usr/lib/python2.7/site-packages/yajsonrpc/stompreactor.py", line 622, in StandAloneRpcClient
reactor = Reactor()
File "/usr/lib/python2.7/site-packages/yajsonrpc/betterAsyncore.py", line 204, in __init__
self._wakeupEvent = AsyncoreEvent(self._map)
File "/usr/lib/python2.7/site-packages/yajsonrpc/betterAsyncore.py", line 160, in __init__
self._eventfd = EventFD()
File "/usr/lib/python2.7/site-packages/vdsm/common/eventfd.py", line 61, in __init__
self._verify_code(fd)
File "/usr/lib/python2.7/site-packages/vdsm/common/eventfd.py", line 111, in _verify_code
raise OSError(err, msg)
OSError: [Errno 24] Too many open files"
hosted-engine --vm-status output: the first host shows "up", the second "unexpectedlydown", the third "starting", and the fourth "down".
I'm unable to manage vm4 from the admin portal. I tried to turn it off but got no response, and I can't find the VM details with "virsh list --all": I ran the command on all four hosts and vm4 is not shown in the output on any of them. Please help me.
I can confirm that my vdsm certificates are expired.
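For anyone who wants to check the same thing on their own hosts, here is a minimal sketch of how the expiry can be confirmed. It assumes Python 3, that the openssl command is installed, and that the vdsm certificate sits at /etc/pki/vdsm/certs/vdsmcert.pem (the path and the helper names are assumptions for illustration, adjust them to your install). It parses the "notAfter=..." line printed by "openssl x509 -noout -enddate" and compares it with the current time.

```python
import subprocess
from datetime import datetime, timezone

# Assumed location of the vdsm host certificate; adjust if yours differs.
VDSM_CERT = "/etc/pki/vdsm/certs/vdsmcert.pem"


def parse_enddate(line):
    """Turn a line like 'notAfter=Mar  5 12:00:00 2023 GMT' into a UTC datetime."""
    value = line.strip().split("=", 1)[1]
    # openssl pads single-digit days with an extra space; collapse whitespace.
    value = " ".join(value.split())
    parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y GMT")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(enddate_line, now=None):
    """Return True if the certificate end date is earlier than 'now'."""
    if now is None:
        now = datetime.now(timezone.utc)
    return parse_enddate(enddate_line) < now


def read_enddate(cert_path=VDSM_CERT):
    """Ask openssl for the certificate end date (requires openssl on the host)."""
    out = subprocess.check_output(
        ["openssl", "x509", "-noout", "-enddate", "-in", cert_path]
    )
    return out.decode()


# Example use on a host (not run here, since it needs the certificate file):
#   line = read_enddate()
#   print(line.strip(), "expired" if is_expired(line) else "still valid")
```

Running the commented example on each host should show whether that host's certificate is past its end date. This only confirms the expiry, it does not renew anything.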