eth0 silliness after cloning RHEL 6 guest in VMware

I cloned a RHEL 6 guest in VMware and the boot process would hang at the "starting eth0" step. After a few googles, I found numerous posts saying that I simply have to regenerate the /etc/udev/rules.d/70-persistent-net.rules file and provide the new (correct) MAC address in my /etc/sysconfig/networking/devices/ifcfg-eth0 file. Well, this didn't resolve my issue 100% - I still hang (sometimes) during the boot process. If I do the grub single-user trick, I can successfully type "service network start" and it comes up fine. However, after a reboot, I still have a 50/50 chance of the standard boot hanging.

 

After removing the 70-persistent-net.rules file and rebooting, I tried updating BOTH the /etc/sysconfig/networking/devices/ifcfg-eth0 AND the /etc/sysconfig/network-scripts/ifcfg-eth0 file. I have tried removing the HWADDR entries entirely, and also updating them with the correct MAC that is in the newly generated 70-persistent-net.rules file.
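
As a rough illustration of the HWADDR check described above, the sketch below compares the HWADDR entry in an ifcfg file with the MAC address the kernel reports for the interface. The function names (ifcfg_hwaddr, hwaddr_matches) are made up for this example, and both paths are passed in as arguments so it can be tried against copies of the files first.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any RHEL tool.

# Print the HWADDR value from an ifcfg file, unquoted and in lower case.
ifcfg_hwaddr() {
    sed -n 's/^HWADDR=//p' "$1" | tr -d "\"'" | tr 'A-F' 'a-f'
}

# Return 0 if the HWADDR in the ifcfg file ($1) equals the MAC stored in
# the sysfs address file ($2), ignoring case and quoting.
hwaddr_matches() {
    cfg_mac=$(ifcfg_hwaddr "$1")
    sys_mac=$(tr 'A-F' 'a-f' < "$2")
    [ -n "$cfg_mac" ] && [ "$cfg_mac" = "$sys_mac" ]
}

# Example use on the guest:
#   hwaddr_matches /etc/sysconfig/network-scripts/ifcfg-eth0 \
#       /sys/class/net/eth0/address && echo "HWADDR matches"
```

If the interface came up under a different name (eth1, for example, because of a stale rules file), the sysfs path would need to be adjusted accordingly.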

 

One post I read said to remove the UUID entry as well. 

 

Am I missing a step?

 

Thanks,

 

...steve...

Responses

UUID is only used by NetworkManager. If you're running headless, or the eth0 config never changes, there's no need for that interface to be controlled by NM.

 

Try disabling NM (chkconfig NetworkManager off; chkconfig network on) if you don't otherwise use it. If you boot into a graphical runlevel, set "NM_CONTROLLED=no" in your ifcfg-eth0 file, because the graphical NM applet can still load and start interacting with the interfaces.
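
A minimal sketch of the ifcfg change suggested above, assuming GNU sed (as shipped with RHEL). The helper name set_nm_controlled_no is invented for this example. It rewrites an existing NM_CONTROLLED line or appends one if the file has none.

```shell
#!/bin/sh
# Sketch only: set_nm_controlled_no is a hypothetical helper name.

# Ensure the given ifcfg file contains exactly NM_CONTROLLED=no.
set_nm_controlled_no() {
    cfg=$1
    if grep -q '^NM_CONTROLLED=' "$cfg"; then
        # Replace whatever value is currently set (GNU sed in-place edit).
        sed -i 's/^NM_CONTROLLED=.*/NM_CONTROLLED=no/' "$cfg"
    else
        # No entry yet, so append one.
        echo 'NM_CONTROLLED=no' >> "$cfg"
    fi
}

# Example use on the guest (as root), together with the commands above:
#   chkconfig NetworkManager off
#   chkconfig network on
#   set_nm_controlled_no /etc/sysconfig/network-scripts/ifcfg-eth0
```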

 

Make sure you've got the correct VMWare Tools for your ESX/vSphere installed. I believe if you've updated the kernel since Tools installation, you'll need to do a reinstall of Tools.

 

We also have some previous cases where this behaviour was seen with the emulated "e1000" NIC type, but not with the "Flexible" or "vmxnet" NIC types.

Jamie - 

 

Thanks for the tips; I am improving. After turning off NetworkManager and making the change to my ifcfg-eth0 script, I successfully boot exactly 50% of the time. Hence, something is being set or created during a successful boot that prevents the next boot from succeeding. When the boot fails (waiting for eth0 to initialize) I can hit the red VMware "Shut Down Guest" button (not graceful, but it's the only tool I have at that point!). Then when I restart (by hitting the green "Power On" button), it will successfully boot. I don't even have to go into single user mode and remove the 70-persistent-net.rules file.

 

I also did a "yum update", rebooted, and reloaded vmware-tools.

 

Question: My server isn't in my DNS yet - could that be an issue? I'm just using a static IP, no DHCP or other preboot proto.

 

Here is my eth0 boot script if that helps (PS: The HWADDR DOES MATCH the MAC shown in vmware):

DEVICE=eth0
NM_CONTROLLED=no
ONBOOT=yes
TYPE=Ethernet
BOOTPROTO=none
IPADDR=10.10.12.124
PREFIX=8
GATEWAY=10.10.10.1
DNS1=10.10.12.15
DOMAIN=mycompany.com
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
NAME="System eth0"
NETMASK=255.0.0.0
USERCTL=no
LAST_CONNECT=1354027044
HWADDR="00:0C:29:90:94:A8"

It's good that we at least have consistent behaviour now.

 

I'm wondering if something in the initial RAM filesystem (what init uses before it passes boot over to the permanent storage) is causing the problem. Try rebuilding the initramfs with dracut. Make sure you create a backup copy first, in case the rebuild doesn't work:

 

 How do I rebuild the initial ramdisk image in Red Hat Enterprise Linux?

 https://access.redhat.com/knowledge/solutions/1958
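
For reference, a hedged sketch of the backup-and-rebuild step. The helper initramfs_rebuild_cmds is a made-up name and only prints the commands rather than running them, so they can be reviewed before being executed as root. The image path /boot/initramfs-<kernel version>.img is the usual RHEL 6 location and is an assumption here.

```shell
#!/bin/sh
# Sketch only: initramfs_rebuild_cmds is a hypothetical helper that prints
# the commands (dry run); it does not touch /boot itself.

initramfs_rebuild_cmds() {
    # Default to the running kernel if no version is given.
    kver=${1:-$(uname -r)}
    img="/boot/initramfs-${kver}.img"
    # 1. Keep a copy of the current image in case the rebuild fails.
    echo "cp -p ${img} ${img}.bak"
    # 2. Rebuild the image for that kernel, overwriting the existing file.
    echo "dracut -f ${img} ${kver}"
}

# Example use: print the commands for the running kernel, review them,
# then run them as root.
#   initramfs_rebuild_cmds
```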

 

If we have no luck here, you would probably be best to open a support case. Please refer to this thread in the case description, and feel free to ask the case owner to add me to the case team.

Oh, and the presence of the server in DNS or not shouldn't make a difference.

 

All we're doing here is bringing up an interface and giving it an IP address. This is a function of the network service (if you're keen, you can start reading at /etc/init.d/network and follow the logic, it's just a bash script) and there's no DNS query involved.

Okay - So I have another interesting datapoint: I simply built a brand new (RHEL 6) VM to see if my problem went away - it did not! Hence it's not related to the cloning process. Looking further, I now see that (for some strange reason) I'm using the "VMXNET3" NIC in my new (and cloned) VMs. My tried-and-true VMs (that have been running fine for 8-12 months) are all "E1000". I have no explanation for why VMXNET3 is now my default - I never even played with these options when creating new VMs, I just take the default value.
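
One way to confirm from inside the guest which driver an interface is bound to is to follow the driver link in sysfs. This is a sketch: the helper name nic_driver is invented, and the optional second argument (an alternate sysfs root) exists only so the function can be tried against a fake tree.

```shell
#!/bin/sh
# Sketch only: nic_driver is a hypothetical helper name.

# Print the kernel driver bound to a network interface, e.g. "vmxnet3"
# or "e1000". Returns 1 if the interface has no driver link.
nic_driver() {
    iface=$1
    sysfs=${2:-/sys}
    link="$sysfs/class/net/$iface/device/driver"
    [ -e "$link" ] || return 1
    # The link points at the driver's directory; its name is the driver.
    basename "$(readlink -f "$link")"
}

# Example use on the guest:
#   nic_driver eth0
# "ethtool -i eth0" reports the same driver name, plus its version.
```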

 

So, how can I change my NIC back to an E1000? Should I simply add a new NIC for the VM, and then delete the old one? Will RHEL then see this new device and get the appropriate driver? Or, do I have to go through some song-and-dance such as:

 

  1. Remove all the references to my MAC in my eth0 boot script
  2. Remove the "70-persistent-net.rules"
  3. Shutdown
  4. Go into vmware and add the new (E1000) NIC - and document the MAC
  5. Go into vmware and remove the old (VMXNET3) NIC
  6. Reboot and make sure that the new MAC appears in the new 70-persistent-net.rules file

Thanks for all your help!

> So, how can I change my NIC back to an E1000?

 

Allow me to suggest that you really don't want to do that unless it's a last resort. VMXNET3 is far more efficient, faster, and more stable, but you do have to ensure the proper VMware Tools are installed so the current VMXNET3 drivers are included in your boot files.

 

I've added a block like this to my rc.local files on VMware guests so I don't have to remember to manually reload Tools when the kernel packages get updated:

 

# Re-run the VMware Tools configuration once per kernel version, then reboot.
marker="/lib/modules/$(/bin/uname -r)/misc/.vmware_config_has_been_run"
if [ ! -e "$marker" ]; then
    /usr/bin/vmware-config-tools.pl --default
    # Create the marker so this block is skipped on the next boot of this kernel.
    /bin/touch "$marker"
    /usr/bin/logger -t rc.local "vmware-config-tools run for $(/bin/uname -r), restarting"
    /sbin/telinit 6
fi

 

Otherwise what you have done so far is much like we do here for clones.  I've only changed the ifcfg-ethx files and not bothered with the rules file and I have not had problems like those you describe.  Have you checked for error messages in the VMware host logs?  Perhaps your guest is having trouble connecting to the virtual switch on the host or other such nonsense.

Ah, I'm glad you have found the source of the problem!

 

I'd assume VMWare picks vmxnet3 as the default interface when creating a new VM. Depending on the OS type you specify, KVM and RHEV will do the same with virtio interfaces.

 

The steps you've outlined are the correct way to change the NIC type. You might be able to just shut down the VM and alter the NIC type from vmxnet3 to e1000, keeping the same MAC address, instead of adding a new interface and deleting the old one. I'm not sure, though (I haven't played with VMware for a few major versions).

 

That being said, a new VM with a vmxnet3 interface really should work. As this will be using VMWare's driver (from VMWare Tools) you might want to raise this up with VMWare. If we can be of assistance, do open a case for us to discuss via yourself, or we can talk directly to VMWare on your case with them via TSANet.
