RHEL 6 KVM guest experiences intermittent network packet drops

Solution Unverified - Updated -

Environment

  • Red Hat Enterprise Linux 6.3

Issue

  • Ping tests against KVM guests show intermittent packet loss. The guests running on the host:
[root@kvmhost123 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 7     guest1001                      running
 8     guest1002                      running
 11    guest1003                      running
 12    guest1004                      running
 15    guest1005                      running
 16    guest1006                      running

[root@kvmhost123 ~]#
  • Ping statistics from a test showing the packet loss:
Ping statistics for XX.XX.XX.X:
    Packets: Sent = 291, Received = 265, Lost = 26 (8% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 2522ms, Average = 17ms

Resolution

  • Upgrade to the RHEL 6.4.z kernel 2.6.32-358.23.1.el6 or later, or
  • Upgrade to a RHEL 6.5 kernel.
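
To check whether a host already runs a kernel at or above the fixed release, the comparison can be sketched roughly as follows. This is a minimal illustration, not a full RPM version comparison: the helper names `release_key` and `has_fix` are hypothetical, and the parsing assumes release strings of the form `2.6.32-358.23.1.el6.x86_64`.

```python
import platform
import re

# Fixed kernel release named in the Resolution section.
FIXED_RELEASE = "2.6.32-358.23.1.el6"


def release_key(release):
    """Turn '2.6.32-358.23.1.el6.x86_64' into a tuple of ints for comparison."""
    # Keep only the numeric part before '.el', then split on '.' and '-'.
    numeric = release.split(".el", 1)[0]
    return tuple(int(part) for part in re.split(r"[.-]", numeric))


def has_fix(release, fixed=FIXED_RELEASE):
    """Return True if the release is at or above the fixed release."""
    return release_key(release) >= release_key(fixed)


if __name__ == "__main__":
    running = platform.release()
    try:
        print(running, "has fix:", has_fix(running))
    except ValueError:
        # Release strings in another format are outside this sketch's assumptions.
        print(running, "does not match the expected el6 release format")
```

Calling `has_fix(platform.release())` on the KVM host gives a quick yes/no answer; the result should still be confirmed against the installed kernel package.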

Root Cause

A previous change in the bridge multicast code allowed general multicast queries to be sent in order to achieve faster convergence on startup. To prevent interference with multicast routers, the sent packets contained a zero source IP address. However, these packets interfered with certain multicast-aware switches, which resulted in the system being flooded with IGMP membership queries that had a zero source IP address. A series of patches addresses this problem by disabling multicast queries by default and implementing a multicast querier option that allows sending of general multicast queries to be toggled on if needed.
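
The querier toggle described above can be inspected from user space. The sketch below assumes the fixed RHEL 6 kernel exposes the same sysfs attribute names as the upstream bridge code (`multicast_snooping` and `multicast_querier` under `/sys/class/net/<bridge>/bridge/`); this article does not confirm that. The function name `bridge_multicast_settings` and the bridge name `br0` are hypothetical.

```python
from pathlib import Path


def bridge_multicast_settings(bridge, sysfs_root="/sys/class/net"):
    """Return the bridge's multicast sysfs values, or None for a missing attribute."""
    bridge_dir = Path(sysfs_root) / bridge / "bridge"
    settings = {}
    for name in ("multicast_snooping", "multicast_querier"):
        attr = bridge_dir / name
        # A kernel without the querier patches would lack multicast_querier.
        settings[name] = int(attr.read_text().strip()) if attr.exists() else None
    return settings
```

For example, `bridge_multicast_settings("br0")` returning `multicast_querier` as `0` would match the patched default of not sending general queries. If the querier is needed, and the attribute exists as assumed, writing `1` to that file as root should enable it.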

Diagnostic Steps

We ran two ping tests of 30 packets each; the analysis below focuses on the second run.

The reply for sequence 12 is missing (the output stops at icmp_seq=11):

64 bytes from guest1001.example.com (10.XX.XX.XX): icmp_seq=10 ttl=59 time=1.83 ms
64 bytes from guest1001.example.com (10.XX.XX.XX): icmp_seq=11 ttl=59 time=1.61 ms

Sequences 19 through 24 are then missing as well:

64 bytes from guest1001.example.com (10.XX.XX.XX): icmp_seq=18 ttl=59 time=1.82 ms
64 bytes from guest1001.example.com (10.XX.XX.XX): icmp_seq=25 ttl=59 time=1.72 ms
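
Gaps like these are easy to overlook in a longer run. A small sketch for listing the missing sequence numbers from saved ping output is shown here; the function name `missing_sequences` is hypothetical, and it assumes ping numbers replies from icmp_seq=1, which may differ between ping versions.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def missing_sequences(ping_output, count):
    """Return the icmp_seq numbers in 1..count that have no reply line."""
    seen = {int(match.group(1)) for match in SEQ_RE.finditer(ping_output)}
    return [seq for seq in range(1, count + 1) if seq not in seen]
```

Passing the full output of a 30-packet run with `count=30` returns the list of sequence numbers that never received a reply.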

A tcpdump on the KVM host bridge shows all of the ping requests.

However, the missing ping requests do not appear on the vnet interface or inside the KVM guest.

What stands out is that each run of missed pings has an IGMPv2 general membership query close by:

674 49.982981   10.XX.XX.XX 10.XX.XX.XX ICMP    98  Echo (ping) reply    id=0xd26b, seq=11/2816, ttl=64
675 50.020287   0.0.0.0 all-systems.mcast.net   IGMPv2  46  Membership Query, general
790 54.705071   0.0.0.0 all-systems.mcast.net   IGMPv2  56  Membership Query, general
816 56.991950   10.XX.XX.XX 10.XX.XX.XX ICMP    98  Echo (ping) reply    id=0xd26b, seq=18/4608, ttl=64
817 57.122483   10.XX.XX.XX 232.XX.XXX.XX   IGMPv2  46  Membership Report group 232.XX.XXX.XX

The IGMPv2 general queries with a 0.0.0.0 source address therefore appear to be what triggers the drops.
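
To pull the suspicious queries out of a larger capture, the text export can be filtered for IGMP membership queries whose source is 0.0.0.0. The sketch below assumes whitespace-separated columns in the order shown in the excerpt above (frame, time, source, destination, protocol, length, info); the function name `zero_source_igmp_queries` is hypothetical.

```python
def zero_source_igmp_queries(capture_lines):
    """Return (frame_number, timestamp) for IGMP queries sent from 0.0.0.0."""
    hits = []
    for line in capture_lines:
        fields = line.split()
        # Skip blank lines, headers, and anything without the expected columns.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        frame, time, source, _dest, protocol = fields[:5]
        info = " ".join(fields[6:])
        if (
            source == "0.0.0.0"
            and protocol.startswith("IGMP")
            and "Membership Query" in info
        ):
            hits.append((int(frame), float(time)))
    return hits
```

On a live bridge, a tcpdump capture filter such as `igmp and src host 0.0.0.0` should narrow the capture to the same packets without post-processing.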

The latest RHEL 6.4.z changelog contains an entry that closely matches this issue:

RHEL6.4.z changelog

It is documented in the following Bugzilla entries:

bridge: sending IGMP membership query with zero source address causes the switch to flood

bridge: sending IGMP membership query with zero source address causes the switch to flood. [rhel-6.4.z]

Update the kernel to 2.6.32-358.23.1.el6 (6.4.z) or later, or to a 6.5 kernel, which should contain the fix as well.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
