A fix for the udev eth1 issue, take back your eth0.
I don't know if this has been stated before (I searched => nothing)
I have been working on fixing the dreaded udev eth1 issue, and after some thought and discussion with my team, we figured out how to circumvent udev and get eth0 on clones of VMs. I searched high and low for a way to fix this issue (Google => no, startup/shutdown script => no), so I finally stopped thinking like an administrator and looked at the problem differently. The problem was not with the cloned machines but with the template itself.
My solution:
To fix the problem, all you have to do is "change the template's network interface".
In my case, I changed the interface on the template to eth5 (because we won't have more than 4 interfaces on the system)
How to change:
rename your /etc/sysconfig/network-scripts/ifcfg-eth0 file to /etc/sysconfig/network-scripts/ifcfg-eth5 (or whatever number you want, just not eth0)
make sure you change the contents of the ifcfg-eth5 file (the DEVICE= line) to reflect eth5
then modify your /etc/udev/rules.d/70-persistent-net.rules file to reflect that you now want to rename the interface with that HW address to eth5
reboot your template and when it comes back up you should see that the interface is now eth5.
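For reference, here is a minimal sketch of the steps above as shell commands, run on the template (assuming a single NIC; adjust names to taste):

cd /etc/sysconfig/network-scripts
mv ifcfg-eth0 ifcfg-eth5
sed -i 's/^DEVICE=.*/DEVICE=eth5/' ifcfg-eth5
# point the udev rule for this NIC's MAC at eth5 instead of eth0
sed -i 's/NAME="eth0"/NAME="eth5"/' /etc/udev/rules.d/70-persistent-net.rules
reboot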
You are now ready to clone VMs while they are up or down; it doesn't matter, the interface on the cloned server will come back as eth0.
Responses
I think I know what you are referring to, but could you provide some additional detail (or history) to the "issue" section? It will help others find this thread. I like the format of this post though (issue - solution (summary) - solution (details))
This thread also made me wonder... what happens if you simply remove 70-persistent-net.rules (from the template) and let it rebuild itself on the first boot after it is cloned? (My assumption is based on another assumption that the file is rebuilt by udev triggers - but I could be wrong.)
Thanks for the post.
James, I've removed the 70-persistent-net.rules (in a previous case with Red Hat) and it reappears on the next boot (on RHEL 6)... you probably knew that.
I did that with a system that was moved/imported to another hypervisor under virtualization. On occasion I've had to play games with these files on KVM systems that will not pick the proper NIC when I reintroduce them to the host system after a reload.
I feel your pain with having the NIC show up as eth1 when introduced to the host for the first time.
I know this is likely a completely human take on the problem, but it bothers me when the host does not use eth0.. probably to a degree that is not normal ;-) So, even if I didn't have something like Puppet managing that aspect of my environment, I would still tirelessly find a way to resolve the issue! ;-)
So - regarding the udev rules: If you were to create your template with 70-persistent-net.rules missing (and comment out HWADDR and UUID in ifcfg-eth0), then deploy a new host - does it still create the device as eth1?
I wonder if the steps can be made solid enough to include in this doc:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.0/html/Evaluation_Guide/Evaluation_Guide-Create_RHEL_Template.html
When we spin VM templates, we have a template prep-script that we run against them before converting the source VM to a template. One of the things that script does is nuke out any persistent device bindings (e.g., it nulls out the 70-persistent-net.rules). That way, on first boot of VMs made from the template, any device nodes get mapped appropriately. If you either have a standard process or script out those standard processes, you won't run into the eth1 problem.
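The actual prep-script isn't shown in the thread, but the relevant fragment might look something like this sketch:

: > /etc/udev/rules.d/70-persistent-net.rules   # null out persistent NIC bindings
sed -i '/^HWADDR/d' /etc/sysconfig/network-scripts/ifcfg-eth0   # drop the MAC pin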
We took this approach to Linux as it was a simple cognate to what had to be done for Windows (actually, a lot less of a pain in the ass than cleaning up Windows templates' dev-trees).
In general, we use a dated "platform" release cycle for our Linux systems. Due to our tenants' security requirements, our tenants have to test deployments against a given patch-cluster and then deploy exactly that certified cluster. Thus, we tend to retain 2-3 standard builds from which tenants' systems can be patched up to a date-targeted patch-state. All of this is automated by our (soon to be retired) automated CM/build/patching framework. For Kickstarted systems, an OS sequence is run and a date-targeted patch-bundle is applied via a post-Anaconda job. For VMs, once the VM's "run-once" script executes and the VM is joined to the CM/build/patching framework, it requests a date-targeted patch-bundle and the framework complies. Specifically, the VM's script lets the framework know it is "ready to be patched" and the framework applies the patch bundle associated with that tenant's systems.
Basically, we've created orchestrated build workflows that de-emphasize the need for maintaining patched-up builds or templates. Given the nature of our tenants' security directives, we really didn't have any choice but to do so (having dozens of templates distributed across dozens of vCenter installations scattered around the globe and across multiple, non-interconnected networks would have been too onerous).
This is just how RHEL6 makes network device names persistent. You don't need HWADDR at all in RHEL6, as the device is already properly named by udev. In my own templates I just remove the udev rules, and leave eth0 getting its address from DHCP with no HWADDR.
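For illustration, a minimal DHCP-based ifcfg-eth0 along those lines might look like this (a sketch, using the usual RHEL 6 network-scripts options):

DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
# note: no HWADDR line, and no udev rule pinning the MAC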
We supply the command sys-unconfig for use in sealing VM images for templating.
It's just a script, though not a very complex one :)
$ rpm -qf $(which sys-unconfig)
initscripts-9.03.40-2.el6_5.1.x86_64
$ cat $(which sys-unconfig)
#!/bin/sh
. /etc/init.d/functions
if [ $# -ne 0 ]; then
    echo $"Usage: sys-unconfig" >&2
    exit 1
fi
touch /.unconfigured
rm -f /etc/udev/rules.d/*-persistent-*.rules
halt
Heh... Solaris has a similar "tool". I'd assumed our build just didn't have an equivalent tool installed (a lot of RPMs are removed as part of the hardening procedures), which is why they weren't using such a tool. If the guys that originally prescribed our template process were still there, I'd ask them why they reinvented the wheel. =)
Actually, I just read the one link. Looks like the folks who originally defined our template build-process essentially prescribed all the same things in that article, but opted to manually nuke the udev rules as part of their larger prep-script.
One downside of just leaving the VM "sealed" according to the linked article is that you still end up with "leftover" log entries. Yes, the article says to optionally clear the logs, but the shutdown process will still tend to create more entries on the way down. So, the anal-retentive side of me tends to opt to nuke those log files outside of a normal boot-epoch.
Yeah. Part of our template creation process is to install a "run-once" file that automagically takes care of those annoyances.
We further opted to make the run-once file VERY basic. All it does is pull down the actual run-once tasks/scripts (the last of which notifies the CM server of the VM's "ready to patch" state) that are hosted on a web server. That way, rather than having to respin the template to push "run-once" changes into the template, the deployed VM simply grabs the most up-to-date collection of "run-once" scripts. It also makes it a lot easier to have one template-method that works against multiple EL releases (since our environment includes RHEL 5, RHEL 6 and CentOS 6 physicals and VMs).
Basically, it makes deploying Linux VMs a lot more like deploying Windows VMs (since the Windows deployment process allows you to specify a run-once operation as part of the deployment sysprep process).
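None of the actual run-once scripts are shown here, but a minimal bootstrap in that spirit might be as simple as this sketch (the server URL is a placeholder):

#!/bin/sh
# Hypothetical run-once bootstrap: fetch and run the current task scripts
# from an internal web server, rather than baking them into the template.
curl -s http://cm.example.internal/runonce/tasks.sh | sh
# the last fetched task would notify the CM server of "ready to patch"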
Ahhh, I see, I didn't know that about VMWare cloning.
Do the virtual NICs always end up in the same PCI bus location? If so, you may be able to name your network devices by PCI bus address instead.
Not always, no - particularly if you've multi-NIC VM.
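For what it's worth, pinning a name to a PCI bus location in a udev rule might look something like this (a sketch; the bus address is a placeholder, and as noted above it only helps if that address is stable):

# match the NIC by its PCI parent device instead of its MAC address
SUBSYSTEM=="net", ACTION=="add", KERNELS=="0000:02:01.0", NAME="eth0"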
We simply got into the habits we did because of the need to support multiple-platforms. Using a consistent (and sufficiently abstracted) methodology makes it so you don't really have to think about it, any more - and protects you against within-platform changes that can creep in with platform version updates.
Chances are, were we a pure-RHEL/Linux shop, our procedures (and scripts) would be less pedantic.
Thanks Everyone,
This is a useful thread. It gives me ideas for the clones we create. Tom's idea of time-stamping clones with a new date for the basesystem rpm is useful with cloned systems...
Sounds like it is the week for VM clones... I am going through the process of stripping down another base image for VMware template-based deployment, so the contributed info will be handy!
I think it would be of value to start a new thread discussing stripping an OS for template creation (or extending sys-unconfig, without the /.unconfigured creation).
Other things I am covering on top of the NIC (sorry, slightly off-topic) are removing SSH host keys, log files, and /etc/sysconfig/rhn/systemid... I am sure there are plenty more bits and pieces people have tucked away (resizing swap on first boot?).
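A sketch of that extra cleanup, assuming the usual RHEL paths (the exact list will vary per site):

rm -f /etc/ssh/ssh_host_*                 # host keys regenerate on first boot
rm -f /etc/sysconfig/rhn/systemid         # drop the RHN registration identity
for f in /var/log/messages /var/log/secure; do : > "$f"; done   # empty the logs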
Great to see I am not the only one fighting VMware and its NIC configuration/cloning process!
Pixel, you are certainly not alone in experiencing this
Yeah, I had that today under VMware: 70-persistent-net.rules vs. the ifcfg-eth1 created by the clone wars... (and I've seen the same thing on workstations running KVM systems that were moved/cloned/ingested)
I like your idea of a new thread to discuss that...
I just built/deployed a file today that resolves an authentication and time-sync issue we run into once in a while, but the script differs based on the subnet. I made a script that looks at what the gateway is, then downloads the proper script based on the gateway... I tested it on a subset of about 30 systems and it worked, mercifully.
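Something along these lines, presumably (a sketch; the server name and script layout are made up):

# pick the right fix-up script for this subnet, keyed on the default gateway
GW=$(ip route | awk '/^default/ {print $3}')
curl -sO "http://tools.example.internal/fixes/${GW}.sh" && sh "./${GW}.sh"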
Once you have some ideas together (possibly even patches) you're more than welcome to log a bug against initscripts to improve sys-unconfig. Please let me know if you do, I'd be interested to follow it.
You also might want to check out what Rich Jones has done with the libguestfs-tools like virt-sysprep. This does similar things to what sys-unconfig does, but does it against KVM virt images using Python, Perl, and C.
As it stands, the existing unconfig script makes sense from the standpoint of not making assumptions. That said, if you didn't want to assume that SSH host-keys needed to be nuked-out (etc.), you could always put logic-blocks in that check for specific vendor-RPMs. If the RPM's missing, skip the procedure; if it's there, do the cleanup steps appropriate to deconfiguring that RPM's associated service(s).
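In other words, something like this hypothetical guard (the package name is chosen purely for illustration):

# only clean up a service's artifacts if its package is actually installed
if rpm -q openssh-server >/dev/null 2>&1; then
    rm -f /etc/ssh/ssh_host_*
fi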
Long time ago, but still troublesome. I stumbled over this today when preparing some templates running RHEL 6.8 and creating machines from them (with a single NIC only).
I ended up doing two things within the template:
a) removed the file /etc/udev/rules.d/70-persistent-net.rules
b) removed the HWADDR.... line from /etc/sysconfig/network-scripts/ifcfg-eth0.
When I then created a machine from that template, the NIC was configured properly and got eth0 as its identifier.
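As commands inside the template, those two steps would presumably be just:

rm -f /etc/udev/rules.d/70-persistent-net.rules
sed -i '/^HWADDR/d' /etc/sysconfig/network-scripts/ifcfg-eth0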
