VM unreachable and unpingable after reboot

Hi,

We have a strange problem with several VMs. When the VMs start, they are pingable and users can log in. But when we reboot the machines, users cannot log in anymore and the VMs are no longer pingable. When the network is disabled and re-enabled within the VM, it functions again. The same happens when the NIC is deactivated and reactivated in the manager.

At first we thought the DHCP server was the problem and looked at it, but our 200 fat clients do not have this problem. We also installed a machine with a fixed IP address, and the same thing happens.

When the VM is shut down and started up again, the VM is reachable.

Responses

I think to be able to help you we need more information:

* What is the guest operating system?

* What version of RHEV are you using?

* Did you try to ping the guest from the hypervisor where it runs?

* Are you rebooting the machines from the RHEV-M site, or are you rebooting them from inside the VM (init 6)?

Thanks for the quick response.

 

but the

OS = Win7 Pro

RHEV  3.1.0-43.el6ev

From the hypervisor I get 100% packet loss.

And it happens after rebooting from within the VM. When the VM is shut down and started again, there is no problem.

 

Do you have the rhev guest tools installed on your guests?

Yes version 3.1.9

Unfortunately, I do not have any more ideas. We are not using Windows guests :(

Did you open a ticket with support already?

No, not yet.

But I will open one right away.

 

Thanks for the response.

 

 

------------------

Made a support call

And are you able to access the consoles of these VMs?  If so, take a look at the Windows Device Manager to see what the virtual NIC on one of them looks like.  Are you using the RHEV virtual NIC or one of the generic ones?  And if you can get to the VM's console, can that VM do outbound pings?

Oh yes - Windows 7 - what about personal firewalls inside the VM?  You can ping them for a while, then they go silent - what if the VMs are running some kind of scheduled task that sets up a personal firewall or something like that?

Or on another track - what if the NIC driver you're using in these VMs is messed up or corrupted somehow?  Did they all come from the same template?

Also take a look at the Event Viewer inside one of these VMs.  Maybe some footprints will present themselves here. 

- Greg

Greg

 

I am able to connect through the admin portal and the user portal.

The NIC is the Red Hat VirtIO Ethernet Adapter, version 60.63.103.2700.

The firewall on the Win7 VM is disabled. Pings from the VM to a server return "host unreachable".

I have some machines built from a template and one built from scratch. Both have the problem.

 

The event log shows no errors.

 

thanks

 

Bart

Do you see the same issue with a RHEL guest?

We only have Windows 7 guests.

Can you try with a Linux guest? That would eliminate the windows network+virtio stack being the culprit.

BTW, does this happen on all hosts, or just one?

Make sure you're connected to the rhevm network and not your storage network. 

Do you see anything in the Event Viewer on your VMs? (Edited - I see you already looked)

Try downloading another copy of the virtio-win ISO just in case the copy you have is messed up somehow.  Then update the virtio NIC driver in one of your Win7 VMs. 

Oh yes - there are 32 and 64 bit flavors of the virtio NIC driver for XP, Vista, and 7.  Make sure you have the right one. 

Are your Win7 VMs 32 or 64 bit?  Do your VMs have any 3rd party network software?  Cisco VPN stuff or anything like that?

Does ipconfig/all on one of the VMs make sense?

(one more edit) - Windows Firewall is disabled, what about 3rd party personal firewalls?  Or maybe an over-eager antivirus with a built-in personal firewall?

For what it's worth, I have Win7 virtual machines that work fine.  I think mine are 64 bits.  The virtio drivers do work.  That's why all my troubleshooting suggestions look for external or environmental factors.

(And a 3rd edit) - what if you have a rogue DHCP server?  Your 200 physical Win 7 machines are OK, but your virtual ones have the problem.  What if a rogue DHCP server is handling your virtual machines?  Maybe the VMs are "closer" to it than the physical machines.  I've seen bizarre behavior like this before with physical machines.  Look closely at your ipconfig/all report.  Look for when the lease was assigned and make sure it's getting service from the DHCP server you expect.
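If there are many VMs to check, the DHCP server and lease lines can be pulled out of a saved ipconfig/all report with a small script. This is only a rough sketch: the sample text below is a made-up excerpt in the usual ipconfig/all layout, the addresses are placeholders, and `parse_ipconfig` and `dhcp_server_is_expected` are hypothetical helper names, not part of any tool.

```python
import re

# Hypothetical excerpt in the usual `ipconfig /all` layout; all values are made up.
SAMPLE = """\
   DHCP Enabled. . . . . . . . . . . : Yes
   IPv4 Address. . . . . . . . . . . : 192.0.2.57(Preferred)
   Lease Obtained. . . . . . . . . . : Monday, March 04, 2013 9:00:00 AM
   DHCP Server . . . . . . . . . . . : 192.0.2.1
"""

# "Label . . . . : value" -> label is the text before the dot leader,
# value is everything after the first colon that follows the leader.
LINE = re.compile(r"^\s*([A-Za-z0-9 ]+?)[\s.]*:\s*(.*)$")


def parse_ipconfig(text):
    """Return a dict of label -> value for every 'label : value' line.

    Assumption: the report covers a single adapter; with several adapters,
    later values overwrite earlier ones.
    """
    fields = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def dhcp_server_is_expected(text, expected_server):
    """True when the report names the DHCP server we expect."""
    return parse_ipconfig(text).get("DHCP Server") == expected_server


if __name__ == "__main__":
    report = parse_ipconfig(SAMPLE)
    print(report.get("DHCP Server"), "|", report.get("Lease Obtained"))
```

On a real VM the input would come from something like `ipconfig /all > report.txt`; a DHCP server address other than the one you expect would point to the rogue-server theory.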

Or for that matter - forget DHCP.  Take one sample Win 7 VM and just give it a static IP Address.  Run it for a while that way and see how it behaves.

And VLAN issues - do you have VLANs in your network?

Do you have any Windows Server guests or are they all Windows 7?

- Greg

I can test it. I can install it on Monday.

 

And it is occurring on all the VMs.

Most users start a VM and later shut it down, so they don't notice the problem. They only see it when they reboot.

 

 

 

The network is working correctly.

I have one machine installed with the Windows firewall disabled and no virus scanner or other third-party software.

The virtio drivers work fine after a cold start. After a reboot, in 8 of 10 reboots the NIC doesn't connect correctly (so about two times in ten it works, and on the next reboot it fails again). When I log in to the VM locally and disable and re-enable the NIC, it works again until we reboot the machine. When we shut down the VM and run it again, it works fine.
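The "8 of 10 reboots" figure could be tracked more systematically by pinging the VM after each reboot and recording the result. A minimal sketch, assuming Python is available on a Linux machine in the same VLAN (for example the hypervisor); `ping_once` and `failure_rate` are hypothetical helper names and the address is a placeholder.

```python
import platform
import subprocess


def ping_once(host, timeout_s=2):
    """Send one ICMP echo request with the system ping command.

    Assumption: Linux-style flags (-c count, -W timeout in seconds) or
    Windows-style flags (-n count, -w timeout in milliseconds).
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def failure_rate(results):
    """Fraction of reboot trials after which the VM was unreachable."""
    if not results:
        return 0.0
    return results.count(False) / len(results)


if __name__ == "__main__":
    # One entry per reboot: True if the VM answered afterwards.
    # Append ping_once("192.0.2.57") after each manual reboot of the test VM.
    trials = [True, False, False, False, False, False, False, False, False, True]
    print(f"unreachable after {failure_rate(trials):.0%} of reboots")
```

Running the same count before and after a driver change gives a like-for-like comparison instead of an impression. Note that on Windows the ping exit code can be 0 even for a "Destination host unreachable" reply, so running the check from the Linux side is the safer choice.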

The VMs are all Win7 Pro 32-bit.

I don't think it's a DHCP problem because we have the same error with a static IP address. When a VM gets its IP address from a DHCP server, the lease is 365 days, so it is almost static.

We have VLANs active, but not between the servers and the VMs; they are located in the same VLAN.

So the sequence of events is -

1. Cold boot a Win 7 VM and it works fine.

2. Reboot the VM, it's no good on the network. But not always, only most of the time.

3. Connect to its console, disable/enable its virtio NIC, now it's OK until the next reboot.

If so, this does not feel like a host issue.  From RHEV-M, you can manage your hosts and you can get at the consoles of your VMs - right?  This suggests your hosts and storage and so forth should all be good.  Although just to be thorough, live-migrate a VM from one host to another, try a VM reboot and see what happens.  I think it will behave the same way no matter which host it's on, but it's easy enough to test to find out for sure. 

What is different between a cold boot and a reboot?  Some virtual BIOS thing that doesn't get reset properly maybe?  I've seen physical PCs that behave differently with a cold boot versus a reboot. 

Here are a few things to try that might characterize the problem further. 

Pick a sample VM, try a reboot, but tap that F8 key in a console window as it's coming up.  Just like with a physical machine, this should get you to the Windows startup menu.  From here, boot into safe mode with networking and see if the problem persists. 

If the problem goes away when you boot into safe mode with networking, perhaps some other driver is conflicting with your virtio NIC driver.

If the problem persists, try updating that VM's virtio NIC driver from a newly downloaded virtio-win ISO image.  Make sure you use the Windows 7 32 bit driver and not the XP or Vista driver.  Just for good measure, after you update that NIC driver, shut the VM down.  From RHEV-M, go into its settings, get rid of its virtio NIC and add a new one.  Boot the VM back up.  Plug and Play should find the new virtio NIC and install the driver cleanly. 

What is different between your Win 7 VMs and your physical Win7 systems?  What are we missing?  Are you starting up an app or other driver in the VMs that works fine with physical machines but somehow messes up virtual machines?  Safe mode with networking might help with this. Maybe - unless it's a trusted driver that also starts in safe mode.

Another difference - the VMs use the virtio NIC driver, the physical machines of course do not.  Hmmmm - another diagnostic test - what happens if you remove the virtio NIC from one of your VMs and add one of the generic virtual NICs?  Does the problem persist?

Another thought might be to build a Win7 32 bit test VM from scratch.  Just build it up - I think you can run those for 30 days without a product key and without activation.  Just build up a VM, don't even apply any patches, add the RHEVM tools, get into its console and start pinging.  This test seems easy to do and it should not disrupt any of your users.  Let's see what happens with a fresh-baked VM with nothing added.

- Greg

Greg,

Sorry for the late response.
Your three points are correct.

After a live migration the VM isn't pingable.
A reboot doesn't change anything.

When I start a VM in safe mode, the VM isn't pingable until I log in to the VM and the NIC is visible in the taskbar.

On the two test machines I've downgraded the virtio NIC driver to version 60.62.102.3000 from virtio-win-1.4.0.iso, and it is functioning well.
I've rebooted the VMs several times. So it looks like it is a driver problem.

I created a support call last Friday, so I will add this finding to the call.

I will keep you posted.

 

Bart

Hi Bart,

What IP address can be seen in the manager when a vm isn't reachable? Is it a 169.x.x.x IP address?

-- Vincent
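As a side note on the question above: the range Windows assigns itself when it gets no answer from DHCP (APIPA) is specifically 169.254.0.0/16, not every 169.x.x.x address. A small sketch of that check using Python's standard `ipaddress` module; `is_apipa` is a hypothetical helper name and the addresses are only examples.

```python
import ipaddress

# IPv4 link-local range that Windows uses for automatic private addressing.
APIPA_NET = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(addr):
    """True when addr falls inside 169.254.0.0/16."""
    return ipaddress.ip_address(addr) in APIPA_NET


if __name__ == "__main__":
    for candidate in ("169.254.10.20", "169.1.2.3", "192.168.1.10"):
        print(candidate, is_apipa(candidate))
```

An address in that range on an unreachable VM would suggest the NIC came up without reaching DHCP. It would not explain the VMs with a static address, which show the same problem.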