RHEV HA


Hi,

I'm looking at doing a pilot of RHEV in my company, Pitney Bowes. I've had it set up in my lab for a few weeks and really like it. We are a global VMware site, but in general we use VMs for engineering, not a lot of customer-facing, mission-critical services. That said, we do use HA, and even though VMware says it is bulletproof because it relies on elected nodes rather than on the vCenter server, I have had problems with it restarting VMs.

From what I have read, it sounds like RHEV relies on the RHEV Manager to restart VMs. That makes it a single point of failure: if you ran the RHEV Manager as a VM on your cluster and the node it was on failed, it wouldn't get restarted. Is this still the case? And if so, are there plans to make the HA process more resilient?

As I said, I have been reading, but I'm not sure whether what I am reading is totally up to date.

Thanks for your input!

Bill

Responses

Did you have a look at the self-hosted engine?
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.4/html/Installation_Guide/chap-The_Self-Hosted_Engine.html

Ahh, very interesting. I had heard of it, but expected it would have the very same problem: if a self-hosted engine went down, there would be nothing to restart it. But this sounds excellent, thanks! I'm liking RHEV more and more.

Hi Bill - you have touched on a hotly debated topic of comparing RHEV to "other" solutions ;-) I'm not currently as fluent with RHEV's capabilities as I once was... but I can still talk to these topics (I think).

Q: Is there DRS for RHEV? Correction: SRM for RHEV... not DRS. (it's been a while and I'm getting old I guess)
A: Sort of. Here is a white paper discussing the approach for site-to-site replication (and recovery) of RHEV
https://access.redhat.com/sites/default/files/attachments/2012_rhev_3_dr_0.pdf

Q: Can you make the RHEV-Manager Highly-Available?
A: Absolutely - You will find numerous posts about RHEV and clustering (which is not possible with VMware - I believe)
https://access.redhat.com/articles/216973

Q: Can VMs be managed while RHEV Manager is not available?
A: Certainly not as easily as with a native client (like another solution), but I believe it is possible. Some folks had asked whether RHEV-Manager could be set up as a guest on one of the hypervisors (which leads me to believe that it is possible)
https://access.redhat.com/discussions/680473

I was previously much better at comparing the 2 products and I am likely not doing RHEV justice, as they have made significant improvements since 3.0/3.1. If your shop is engineering focused and relies on process, I believe you could match the capabilities of the products fairly well (i.e. you will need time to test DR/incident strategies and develop an SOP for them - as opposed to setting up a product and relying on it to respond for you).

Thanks James, very interesting and lots of good reading I can see. Thanks! I really want to get the Pilot up and running so I can get people using it, break it and fix it - same as I did with VMware so that when something happens, I know what to do about it. I used to be a bit of a Linux guru about 10 years ago and then my small company decided to go all Windows, but now after being acquired by a company that is far more agnostic when it comes to operating systems, I am really enjoying myself again.

So - a few bits of advice:
* Be sure that when you present RHEV, you accurately convey what it can do. I have been at a shop where it was sold as something it wasn't; then when it came time to actually use some features that were promised (by our own folks) and RHEV could not deliver, the reputation of RHEV was a bit tarnished (and quite unfairly).
* I would try to avoid the discussion "product X can do this, what can RHEV do?" - even as part of the sales cycle, the discussion was generally useless. I instead tried to focus on what the customer needed to see whether RHEV could meet that need.

RHEV is a great product and can "hold its own" - and I'm glad that it competes with the other vendors in that space; it makes for a better product. It's not always the right solution for every customer/situation.

EDIT: I wasn't implying that you would do any of the things I had mentioned. It's tough to respond in a hypothetical way without using "you" or "me/I" ;-)

Hello William

Q: Is there DRS for RHEV?
If you are looking for a full failover solution for your RHEV environment, things are a little simpler now than this document suggests. We have successfully deployed RHEV on replicated FC-based storage using IBM MetroMirror and designed it such that we can "fail over" to our standby datacentre in the event of a catastrophic failure in our primary. We had the help of RH services to design and test it, and gained support approval for our environment.
There were a couple of changes required upstream to account for the change of LUN WWPN (which are now in mainline 3.4). Our RHEV Managers sit on physical tin as opposed to VMs, and it's an "all or nothing" failover, but it fits our requirement.
I'd be happy to pass on any detail.

Sorry, been busy - beta testing vSphere 6... all good fun.

I think I misled you all a bit on the DRS. I was talking about Distributed Resource Scheduler (the VMware name), the process that moves VMs to the best host in a cluster and places them on the best host at startup. I think RHEV does that already. As far as disaster recovery goes, SRM (site replication) is probably what the comments here refer to in VMware terms. I don't currently need any form of disaster recovery, other than a VM backup solution. It looks like Acronis and others can provide that.

Actually - I think it was I who derailed the conversation with DRS in my response. So many TLAs!!!

Richard,

Care to elaborate or send details directly? We're working on a similar solution with EMC MirrorView.

Of course David. Drop me a line at richard.davis@uk.pgds.com and I'll get some stuff together.
I did promise my TAM that I'd write up a KB article on this topic, so this will force the issue ;)

As stated above, I'm not really looking for disaster recovery at the moment. I did work a bit with Metro Clusters and the EMC VPlex when I was in the SaaS team working on production, and I'm not sure if RHEV is going to offer anything like that. But now that I am in Engineering, we don't have nearly the same pressure to provide uptime on our VMs to customers, so as long as I can get VMs to restart if a host goes down, I'm very happy.

The self-hosted engine would do the trick for me if I have to move to another environment running only RHEV. At the moment, I have the luxury of putting my RHEV-M on a VMware cluster, and HA on that cluster restarts it if it goes down :-) I was thinking of putting the VMware vCenter on my RHEV cluster so it could do the same!

I wonder how many folks have done what you are proposing. I imagine the vCenter server running on Linux is GA now. I bet it would run just fine on RHEV. Good luck ;-)

Hi all

This is a very interesting discussion.
I'm in the middle of setting up 18 servers based on RHEV 3.4 and using mirrored IBM SAN (IBM SVC). Some of them also have local storage.
Not without a few moments of frustration, though. But it seems I now have it working so that it's acceptable (not perfect) HA-wise. I learned a lot, which is great.

Using FC is, in my understanding, a bit different from iSCSI HA-wise. It seems that iSCSI works right out of the box, while FC needs some manual work in case you lose all FC on one host. You have to account for all kinds of scenarios, right? So that's what I've been testing today.
This is why I'm also very interested in knowing if Red Hat is working on improving the HA feature when using FC.
I'm not using a primary site and a secondary standby site. For the FC setup I have 2 active sites. Is this what you have been testing?

From what I have seen so far using FC, I wouldn't run a self-hosted RHEV-M, so our RHEV-M is running on VMware. Also, we're using RHEV-H, and I remember once reading that a self-hosted RHEV-M is not possible on it.

Stig
We run a primary/standby RHEV environment as we have no "stretched" FC SAN option, only replicated (IBM MetroMirror), but I could find no technical reason why an active/active setup wouldn't work.

I should mention that we have our RHEV Manager on physical tin, and its storage (including the boot LUN) is also replicated between locations, allowing us to boot it in either one to manage the environment. Redundant tin is obviously required to enable this.

Hi Richard

Ok, thanks for info.
I figured out that losing the SAN in our environment caused the VMs to go 'paused' to protect themselves from data corruption. That was a bit of a bummer... or what do you call it :-)
However, I did a lot of tests and figured out that I can manually shut down a paused VM and migrate it afterwards.
So it's not as great as with iSCSI, where failover is completely automatic, but for us it's still an acceptable solution.
I'm looking forward very much to seeing what improvements come in future releases.
But thanks for the inspiration regarding the rhev-m.
