VMs in unknown state and RHEV-H hosts Non Responsive
Running latest RHEV-H and latest RHEV-M on RHEL 6.6.
Just did a yum update, which changed the following:
Transaction performed with:
    Installed rpm-4.8.0-38.el6_6.x86_64 @rhel-x86_64-server-6
    Updated subscription-manager-1.12.14-7.el6.x86_64 @rhel-x86_64-server-6
    Installed yum-3.2.29-60.el6.noarch @rhel-x86_64-server-6
    Installed yum-metadata-parser-1.1.2-16.el6.x86_64 @anaconda-RedHatEnterpriseLinux-201311111358.x86_64/6.5
    Installed yum-plugin-versionlock-1.1.30-30.el6.noarch @rhel-x86_64-server-6
Packages Altered:
    Updated java-1.7.0-openjdk-1:1.7.0.71-2.5.3.2.el6_6.x86_64 @rhel-x86_64-server-6
    Update 1:1.7.0.75-2.5.4.0.el6_6.x86_64 @rhel-x86_64-server-6
    Updated openssl-1.0.1e-30.el6_6.4.x86_64 @rhel-x86_64-server-6
    Update 1.0.1e-30.el6_6.5.x86_64 @rhel-x86_64-server-6
    Updated selinux-policy-3.7.19-260.el6_6.1.noarch @rhel-x86_64-server-6
    Update 3.7.19-260.el6_6.2.noarch @rhel-x86_64-server-6
    Updated selinux-policy-targeted-3.7.19-260.el6_6.1.noarch @rhel-x86_64-server-6
    Update 3.7.19-260.el6_6.2.noarch @rhel-x86_64-server-6
    Updated subscription-manager-1.12.14-7.el6.x86_64 @rhel-x86_64-server-6
    Update 1.12.14-9.el6_6.x86_64 @rhel-x86_64-server-6
After that and a RHEV-M reboot, all VMs went into unknown state, and all hosts (RHEV-H), storage domains, clusters, and datacenters went Non Responsive.
There are lots of errors in engine.log, like:
2015-01-21 21:55:25,414 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-31) Command GetCapabilitiesVDSCommand(HostName = rhevbohnw01.unix.regionh.top.local, HostId = 4256f307-fbab-4ae7-bdf0-7025d1ecf007, vds=Host[rhevbohnw01.unix.regionh.top.local,4256f307-fbab-4ae7-bdf0-7025d1ecf007]) execution failed. Exception: VDSNetworkException: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
One per Host.
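To see how many hosts are affected, a rough sketch like the following could tally the handshake failures per host. It assumes the log entries follow the format of the line quoted above and that the log lives at /var/log/ovirt-engine/engine.log (the usual location, but not stated in this thread). The function name count_handshake_failures is made up for illustration.

```shell
# Hypothetical helper: count SSLHandshakeException entries per host in an
# engine.log-style file. Assumes each entry contains "HostName = <name>,"
# as in the log line quoted in this thread.
count_handshake_failures() {
    grep 'SSLHandshakeException' "$1" |
        sed -n 's/.*HostName = \([^,]*\),.*/\1/p' |
        sort | uniq -c |
        awk '{print $2, $1}'   # print "<host> <count>" without padding
}

# Example (path is an assumption):
#   count_handshake_failures /var/log/ovirt-engine/engine.log
```

Each output line is a host name followed by the number of matching errors, so a list covering every host suggests a problem on the manager side rather than on a single hypervisor.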
Decided to do a yum history undo, and without a reboot everything is fine again.
Did the same yum update on another environment, also with the same versions of RHEV-M and RHEV-H. Exact same issue there, and the undo also corrected the problem.
Has anybody experienced the same?
Responses
This is not a problem in openssl. In the latest critical security update for openjdk 1.7, SSLv3 has been disabled by default as part of the fix for one of the critical vulnerabilities; see https://rhn.redhat.com/errata/RHSA-2015-0067.html
This is a good thing; however, the VDSM daemon seems to have the TLS protocol disabled (see http://www.ovirt.org/Features/PKI), hence the handshake fails. A temporary workaround, until vdsm is fixed to work with TLS, is to comment out
jdk.tls.disabledAlgorithms=SSLv3
from /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre/lib/security/java.security
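As a minimal sketch of that workaround, the edit could be scripted roughly as below. The function name is hypothetical, and the sketch assumes the property sits on a single line that starts at column one, as in the line quoted above. It keeps a .bak copy so the change is easy to revert once vdsm is fixed.

```shell
# Hypothetical helper: comment out the jdk.tls.disabledAlgorithms line in a
# java.security file, keeping a backup next to it as <file>.bak.
comment_out_disabled_algorithms() {
    file="$1"
    # Keep the original so the workaround can be reverted later.
    cp -p "$file" "$file.bak" || return 1
    # Rewrite the file from the backup, prefixing the property with '#'.
    sed 's/^jdk\.tls\.disabledAlgorithms=/#&/' "$file.bak" > "$file"
}

# Example, using the path given in this thread:
#   comment_out_disabled_algorithms \
#       /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre/lib/security/java.security
```

The JVM reads java.security at startup, so the engine presumably needs a restart (service ovirt-engine restart) before the change takes effect. Note that this re-enables SSLv3, which the security update disabled on purpose.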
Enrico Tagliavini
Hi Sven,
That is a good analysis! But I think Red Hat must be careful with this sort of update.
My entire virtual environment went down because of this issue.
I have opened case #01338370, and I will add your comment so that a Red Hat engineer can investigate it.
Thanks for your investigation!
I've opened a case with Red Hat, and it turned out I had to downgrade java-1.7.0-openjdk with the command:
yum downgrade java-1.7.0-openjdk
This removes java-1.7.0-openjdk.x86_64 1:1.7.0.75-2.5.4.0.el6_6 and installs java-1.7.0-openjdk.x86_64 1:1.7.0.71-2.5.3.2.el6_6.
Then I had to restart ovirt-engine.
Problem solved, but it would have been better if it had never happened....
On the RHEV-M machine:
yum downgrade java-1.7.0-openjdk
service ovirt-engine restart
Thanks. That fixed my problem also.
We had the same problem between Red Hat Storage Console and its nodes.
Support didn't have a clue at all...
The support engineer mentioned old log files and network communication, while it was clear from the log I pasted in the bug report that this was an SSLException....
My colleague had the same problem with RHEV, and that was the only reason we found out about the RHS Console problem...
Fix your QA, RH!
Oh, we had this fun too today!
"On the RHEV-M machine:
yum downgrade java-1.7.0-openjdk
service ovirt-engine restart"
This resolved it for us as well. RHEV QA is a nightmare...
I installed an oVirt 3.5.1 virtualization host and manager machine on RHEL 6.6. I did not experience the Java issue when upgrading to java-1.7.0-openjdk-1.7.0.75-2.5.4.0.el6_6.x86_64.
Paul
"On the RHEV-M machine:
yum downgrade java-1.7.0-openjdk
service ovirt-engine restart"
Did the trick! QA much?
I had this issue as well last night, and followed the same steps to fix it:
On the RHEV-M machine:
yum downgrade java-1.7.0-openjdk
service ovirt-engine restart
Fixed it immediately. Thanks guys.
You guys are lifesavers. I panicked and brought down my whole cluster preemptively and had no way of bringing it back up. No way to activate any hosts. I tried reaching support, but I found this post before they contacted me. Downgraded and everything is working.
Confirmed: Red Hat Enterprise Virtualization Manager Version 3.4.5-0.3.el6ev works fine with SSLv3 turned off.
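For anyone wanting to check whether SSLv3 is actually turned off on their manager before comparing results, a small check along these lines might help. It is only a sketch: the function name is made up, and it assumes the property is written on a single line (no backslash continuations) in the java.security file mentioned earlier in the thread.

```shell
# Hypothetical helper: succeed (exit 0) if an active, uncommented
# jdk.tls.disabledAlgorithms line in the given file lists SSLv3.
sslv3_disabled() {
    grep -E '^[[:space:]]*jdk\.tls\.disabledAlgorithms=' "$1" | grep -q 'SSLv3'
}

# Example, using the path given in this thread:
#   f=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre/lib/security/java.security
#   if sslv3_disabled "$f"; then echo "SSLv3 is disabled"; else echo "SSLv3 is allowed"; fi
```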