VM unreachable and unpingable after reboot
Hi,
We have a strange problem with several of our VMs. When the VMs start, they are pingable and users can log in. But when the machines are rebooted, users cannot log in anymore and the VMs are no longer pingable. When the network is disabled and re-enabled within the VM, it functions again. The same happens when the NIC is deactivated and reactivated in the manager.
At first we thought the DHCP server was the problem, so we looked at the DHCP server, but our 200 fat clients do not have this problem. We have also installed a machine with a fixed IP address, and the same thing happens.
When the VM is shut down and started up again, the VM is reachable.
Responses
I think to be able to help you we need more information:
* What is the guest operating system?
* What version of RHEV are you using?
* Did you try to ping the guest from the hypervisor where it runs?
* Are you rebooting the machines using the RHEV-M site, or are you rebooting them from inside the VM (init 6)?
Unfortunately, I do not have any more ideas. We are not using Windows guests :(
Did you open a ticket with support already?
And are you able to access the consoles of these VMs? If so, take a look at the Windows Device Manager to see what the virtual NIC on one of them looks like. Are you using the RHEV virtual NIC or one of the generic ones? And if you can get to the VM's console, can that VM do outbound pings?
Oh yes - Windows 7 - what about personal firewalls inside the VM? You can ping them for a while, then they go silent - what if the VMs are running some kind of scheduled task that sets up a personal firewall or something like that?
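One way to tell a firewall that only drops ping from a NIC that is really off the network is to try a TCP connection as well. The following is a minimal sketch, not a definitive tool: the helper name `tcp_port_open`, the address `192.0.2.10` and the choice of port 3389 (the standard RDP port) are assumptions for illustration, and the thread does not say which service the users log in through.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (hypothetical VM address, RDP port assumed):
# tcp_port_open("192.0.2.10", 3389)
```

If the TCP check succeeds while ping fails, an ICMP filter inside the guest is a more likely suspect than the virtual NIC. If both fail, the NIC or its driver stays the main suspect.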
Or on another track - what if the NIC driver you're using in these VMs is messed up or corrupted somehow? Did they all come from the same template?
Also take a look at the Event Viewer inside one of these VMs. Maybe some footprints will present themselves here.
- Greg
Can you try with a Linux guest? That would eliminate the Windows network + virtio stack as the culprit.
BTW, does this happen on all hosts, or just one?
Make sure you're connected to the rhevm network and not your storage network.
Do you see anything in the Event Viewer on your VMs? (Edited - I see you already looked)
Try downloading another copy of the virtio-win ISO just in case the copy you have is messed up somehow. Then update the virtio NIC driver in one of your Win7 VMs.
Oh yes - there are 32 and 64 bit flavors of the virtio NIC driver for XP, Vista, and 7. Make sure you have the right one.
Are your Win7 VMs 32 or 64 bit? Do your VMs have any 3rd party network software? Cisco VPN stuff or anything like that?
Does ipconfig/all on one of the VMs make sense?
(one more edit) - Windows Firewall is disabled, what about 3rd party personal firewalls? Or maybe an over-eager antivirus with a built-in personal firewall?
For what it's worth, I have Win7 virtual machines that work fine. I think mine are 64 bits. The virtio drivers do work. That's why all my troubleshooting suggestions look for external or environmental factors.
(And a 3rd edit) - what if you have a rogue DHCP server? Your 200 physical Win 7 machines are OK, but your virtual ones have the problem. What if a rogue DHCP server is handling your virtual machines? Maybe the VMs are "closer" to it than the physical machines. I've seen bizarre behavior like this before with physical machines. Look closely at your ipconfig /all report. Look for when the lease was obtained and make sure it's getting service from the DHCP server you expect.
Or for that matter - forget DHCP. Take one sample Win 7 VM and just give it a static IP Address. Run it for a while that way and see how it behaves.
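To check the lease details on more than one VM, the ipconfig /all report can be saved to a text file and scanned with a small script. The sketch below is a rough illustration that assumes English-locale output, where fields look like `DHCP Server . . . : <address>`. The function names `dhcp_fields` and `unexpected_dhcp_servers` are made up for this example.

```python
import re

# Matches "   Field Name . . . . : value"; the dot leaders are discarded.
FIELD_RE = re.compile(r"^\s*(.+?)[\s.]*:\s*(.*)$")
WANTED = ("DHCP Enabled", "DHCP Server", "Lease Obtained", "Lease Expires")


def dhcp_fields(ipconfig_text):
    """Return one dict per section of the report with its DHCP-related fields."""
    sections = []
    current = None
    for line in ipconfig_text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented lines are section headers, e.g. an adapter name.
            current = {"adapter": line.strip().rstrip(":")}
            sections.append(current)
            continue
        match = FIELD_RE.match(line)
        if match and current is not None and match.group(1) in WANTED:
            current[match.group(1)] = match.group(2).strip()
    return sections


def unexpected_dhcp_servers(ipconfig_text, expected_server):
    """List (adapter, server) pairs whose DHCP server is not the expected one."""
    return [
        (section["adapter"], section["DHCP Server"])
        for section in dhcp_fields(ipconfig_text)
        if "DHCP Server" in section and section["DHCP Server"] != expected_server
    ]
```

Any adapter returned by `unexpected_dhcp_servers` got its lease from somewhere other than the server you named, which would point toward the rogue DHCP theory.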
And VLAN issues - do you have VLANs in your network?
Do you have any Windows Server guests or are they all Windows 7?
- Greg
So the sequence of events is -
1. Cold boot a Win 7 VM and it works fine.
2. Reboot the VM, it's no good on the network. But not always, only most of the time.
3. Connect to its console, disable/enable its virtio NIC, now it's OK until the next reboot.
If so, this does not feel like a host issue. From RHEV-M, you can manage your hosts and you can get at the consoles of your VMs - right? This suggests your hosts and storage and so forth should all be good. Although just to be thorough, live-migrate a VM from one host to another, try a VM reboot and see what happens. I think it will behave the same way no matter which host it's on, but it's easy enough to test to find out for sure.
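To compare reboots on different hosts, it can help to record exactly when a VM drops off the network and whether it comes back. Here is a minimal sketch meant to be run from a Linux machine such as the hypervisor. It assumes the iputils `ping` command with its `-c` and `-W` options, and the names `ping_once` and `watch` are invented for this example.

```python
import subprocess
import time


def ping_once(host):
    """Send one ICMP echo with Linux ping (-c 1: one packet, -W 1: wait 1 s)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(host, samples, interval=1.0, probe=ping_once, sleep=time.sleep):
    """Probe host `samples` times; return (sample_index, state) for each change."""
    transitions = []
    last = None
    for i in range(samples):
        up = probe(host)
        if up != last:
            # Record only changes, so a long outage shows up as a single entry.
            transitions.append((i, "up" if up else "down"))
            last = up
        sleep(interval)
    return transitions


# Example call (hypothetical VM address), started just before the reboot:
# print(watch("192.0.2.10", samples=300))
```

A healthy reboot should show up, down, up. A VM with the problem described here would show up, then down, and never come back until its NIC is disabled and re-enabled.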
What is different between a cold boot and a reboot? Some virtual BIOS thing that doesn't get reset properly maybe? I've seen physical PCs that behave differently with a cold boot versus a reboot.
Here are a few things to try that might characterize the problem further.
Pick a sample VM, try a reboot, but tap that F8 key in a console window as it's coming up. Just like with a physical machine, this should get you to the Windows startup menu. From here, boot into safe mode with networking and see if the problem persists.
If the problem goes away when you boot into safe mode with networking, perhaps some other driver is conflicting with your virtio NIC driver.
If the problem persists, try updating that VM's virtio NIC driver from a newly downloaded virtio-win ISO image. Make sure you use the Windows 7 32 bit driver and not the XP or Vista driver. Just for good measure, after you update that NIC driver, shut the VM down. From RHEV-M, go into its settings, get rid of its virtio NIC and add a new one. Boot the VM back up. Plug and Play should find the new virtio NIC and install the driver cleanly.
What is different between your Win 7 VMs and your physical Win 7 systems? What are we missing? Are you starting up an app or other driver in the VMs that works fine with physical machines but somehow messes up virtual machines? Safe mode with networking might help with this. Maybe - unless it is a trusted driver that also starts in safe mode.
Another difference - the VMs use the virtio NIC driver, the physical machines of course do not. Hmmmm - another diagnostic test - what happens if you remove the virtio NIC from one of your VMs and add one of the generic virtual NICs? Does the problem persist?
Another thought might be to build a Win7 32 bit test VM from scratch. Just build it up - I think you can run those for 30 days without a product key and without activation. Just build up a VM, don't even apply any patches, add the RHEVM tools, get into its console and start pinging. This test seems easy to do and it should not disrupt any of your users. Let's see what happens with a fresh-baked VM with nothing added.
- Greg
