Can't PXE boot VM with a network interface using a macvtap direct connection

Solution Verified - Updated -

Environment

  • RHN Satellite with provisioning and PXE capabilities
  • RHEL6.3 hypervisor with RHEL6.3 VM guests

Issue

  • We configured Satellite with cobblerd and tftp for provisioning, but our VMs cannot PXE boot: they time out while downloading "pxelinux.0".
  • The VMs obtain a DHCP address successfully, but cannot download the pxelinux.0 file.
  • tftp works fine for physical machines, for example with "tftp tftp-server -c get pxelinux.0".
  • We tried virtio, e1000 and rtl8139 network interfaces; all give the same error.
  • We have verified that DNS and DHCP are configured correctly.

  • Error as seen when booting the VM:

Waiting for link-up on net0... ok
DHCP (net0 52:54:00:93:76:bb).... ok
net0: 10.20.2.250/255.255.255.0 gw 10.20.2.1
Booting from filename "pxelinux.0"
tftp://10.20.2.220/pxelinux.0.............. Connection timed out (0x4c126035)
Could not load tftp://10.20.2.220/pxelinux.0: Connection timed out (0x4c126035)
No more network devices

Resolution

Until an official fix is released, a workaround is to disable TX checksum offloading on the Satellite.

If eth0 is the outgoing interface on the Satellite, the command to turn off TX checksum offloading is:

ethtool -K eth0 tx off
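
To confirm that the workaround took effect, the offload state can be read back with "ethtool -k" (lowercase k). The sketch below is a small, hypothetical helper (the name tx_checksum_state is not part of ethtool) that extracts the tx-checksumming state from that output. It assumes ethtool prints the state as a line of the form "tx-checksumming: on" or "tx-checksumming: off".

```shell
# Hypothetical helper: reads `ethtool -k <iface>` output on stdin and prints
# the tx-checksumming state ("on" or "off").
# Assumption: the state appears on a line like "tx-checksumming: off".
tx_checksum_state() {
    awk -F': *' '
        $1 ~ /^[ \t]*tx-checksumming$/ {
            # Keep only the first word, in case a suffix such as "[fixed]" follows.
            split($2, words, " ")
            print words[1]
            exit
        }
    '
}
```

For example, "ethtool -k eth0 | tx_checksum_state" should print "off" once the workaround has been applied. Settings changed with "ethtool -K" are runtime settings and do not survive a reboot, so the command has to be reapplied until a fix is installed.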

Root Cause

https://bugzilla.redhat.com/show_bug.cgi?id=816101
"PXE boot issue with macvtap"

Diagnostic Steps

  • Manually downloading pxelinux.0 from the gPXE shell of the VM BIOS (accessed by pressing Ctrl-B) fails with the following error:
gPXE> dhcp net0
DHCP (net0 52:54:00:56:4e:1f).... ok
gPXE> kernel tftp://10.20.2.220/pxelinux.0
tftp://10.20.2.220/pxelinux.0............. Connection timed out (0x4c126035)
Could not fetch tftp://10.20.2.220/pxelinux.0: Connection timed out (0x4c126035) 

gPXE>
  • Gather a network trace on the Satellite showing the traffic between the Satellite and the VM guest:
# tcpdump -i eth0 -s0 -w network-trace.pcap

No.     Time        Source                Destination           Protocol Info
     80 9.263243    10.20.2.220           10.20.2.250           DHCP     DHCP Offer    - Transaction ID 0x9376bb
    118 10.250118   10.20.2.220           10.20.2.250           DHCP     DHCP Offer    - Transaction ID 0x9376bb
    125 12.227488   10.20.2.220           10.20.2.250           DHCP     DHCP ACK      - Transaction ID 0x9376bb
    131 13.380343   10.20.2.250           10.20.2.220           TFTP     Read Request, File: pxelinux.0\000, Transfer type: octet\000
    132 13.384756   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    135 14.386168   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    136 14.918302   10.20.2.250           10.20.2.220           TFTP     Read Request, File: pxelinux.0\000, Transfer type: octet\000
    137 14.919102   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    142 15.920294   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    155 16.388359   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    159 17.922434   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    160 17.994094   10.20.2.250           10.20.2.220           TFTP     Read Request, File: pxelinux.0\000, Transfer type: octet\000
    161 17.994958   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    165 18.996212   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    168 20.392477   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    204 21.000954   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    607 21.926674   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    612 25.004966   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    631 28.400775   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement
    636 29.934933   10.20.2.220           10.20.2.250           TFTP     Option Acknowledgement

The trace shows that the Satellite (10.20.2.220) receives the TFTP Read Request for pxelinux.0 from the VM (10.20.2.250, the address assigned by DHCP) and responds with an Option Acknowledgement. However, the bug causes the response to carry an incorrect checksum. As a result, the VM keeps retransmitting the Read Request, which indicates that it is not receiving the Satellite's responses.
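
One way to look for the checksum problem in the saved trace is to let tcpdump verify UDP checksums while reading the capture back. The following is a minimal sketch, assuming the network-trace.pcap file gathered above and tcpdump's verbose output, which marks a failed verification with the text "bad udp cksum". The helper name count_bad_udp_cksum is hypothetical.

```shell
# Hypothetical helper: counts the lines on stdin that tcpdump flagged with
# "bad udp cksum" (printed in verbose mode when UDP checksum verification fails).
count_bad_udp_cksum() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function's exit status clean while still printing "0".
    grep -c 'bad udp cksum' || true
}
```

Possible usage: "tcpdump -nn -vv -r network-trace.pcap udp and src host 10.20.2.220 | count_bad_udp_cksum". Note that a capture taken on the sending host while TX checksum offloading is enabled usually shows bad checksums on outgoing packets regardless, because the NIC fills them in after the capture point. Comparing the count before and after the workaround, or capturing on the hypervisor, is therefore likely to be more conclusive.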

The /etc/libvirt/qemu/VM.xml file shows that the VM uses a macvtap direct network connection:

    <interface type='direct'>
      <mac address='52:54:00:93:76:af'/>
      <source dev='bond2' mode='bridge'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
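
To check whether a given guest uses this kind of connection, its domain XML can be searched for an interface element of type "direct". The following is a minimal sketch, assuming the XML is supplied on stdin (for example from "virsh dumpxml"). The helper name uses_macvtap is hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) if the libvirt domain XML on
# stdin contains an <interface> element of type 'direct' (macvtap).
uses_macvtap() {
    grep -q "<interface type=['\"]direct['\"]"
}
```

Possible usage on the hypervisor, with VM standing for the guest name as in the file path above: "virsh dumpxml VM | uses_macvtap && echo macvtap in use".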

