ixgbe vs ixgbevf Performance Differences
When deploying into AWS and wishing to make use of optimized networking (10Gbps), the AWS documentation states that the ixgbevf drivers must be used. However, the drivers bundled with the EL6 and EL7 kernels are not compatible with the AWS SR-IOV implementation. One can use third-party ixgbevf drivers, but it's kind of painful. The other option is to ignore the AWS guidance and use the ixgbe drivers instead.
I'm just wondering if anyone's benchmarked the performance - and associated instance overhead - of the Red Hat bundled ixgbe drivers versus the AWS-recommended ixgbevf drivers.
AWS also offers Elastic Network Adapters, but those depend on running a 3.2+ kernel. That's fine for RHEL 7, but pretty much all of my customers are on or deploying on RHEL 6.
Responses
That's rather unexpected: the Intel physical function (PF) and virtual function (VF) use different PCI device IDs. This is from EL7:
drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
#define IXGBE_DEV_ID_X550T 0x1563
#define IXGBE_DEV_ID_X550T1 0x15D1
...
/* VF Device IDs */
#define IXGBE_DEV_ID_82599_VF 0x10ED
#define IXGBE_DEV_ID_X540_VF 0x1515
#define IXGBE_DEV_ID_X550_VF 0x1565
I didn't think it would be possible to use the PF driver (ixgbe) on a VF (ixgbevf) or vice versa.
What IDs does lspci -nn | egrep Ether show?
If you can load the PF driver to drive the VF, that doesn't sound like a particularly good idea. There are many code paths in the physical driver which would probably fail on a virtual function.
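To make the PF/VF distinction concrete, here is a minimal Python sketch that classifies one line of `lspci -nn` output using only the device IDs quoted from ixgbe_type.h above. The function name and the sample line are illustrative assumptions, not output captured from a real instance, and the ID tables are deliberately incomplete (the header lists many more PF IDs).

```python
import re

# Device IDs copied from the ixgbe_type.h excerpts quoted in this thread.
# The PF table is deliberately incomplete; the real header lists many more.
VF_DEVICE_IDS = {0x10ED: "82599_VF", 0x1515: "X540_VF", 0x1565: "X550_VF"}
PF_DEVICE_IDS = {0x1563: "X550T", 0x15D1: "X550T1"}
INTEL_VENDOR_ID = 0x8086

# Matches the "[vendor:device]" pair that `lspci -nn` appends to each line.
ID_PAIR = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def classify_lspci_line(line):
    """Return 'VF', 'PF', or 'unknown' for one line of `lspci -nn` output."""
    matches = ID_PAIR.findall(line)
    if not matches:
        return "unknown"
    # Use the last pair on the line; the class code "[0200]" has no colon
    # and therefore never matches the pattern.
    vendor, device = (int(part, 16) for part in matches[-1])
    if vendor != INTEL_VENDOR_ID:
        return "unknown"
    if device in VF_DEVICE_IDS:
        return "VF"
    if device in PF_DEVICE_IDS:
        return "PF"
    return "unknown"


# Illustrative line only (hypothetical, not taken from an AWS instance).
sample = ("00:03.0 Ethernet controller [0200]: Intel Corporation "
          "82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)")
print(classify_lspci_line(sample))
```

If the instance reports 8086:10ed, the device is an 82599 VF, and the ID tables suggest ixgbevf rather than ixgbe is the driver that should claim it.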
Why don't the EL6 drivers work? The 82599 is an early 10GbE chipset and the device ID for the VF is in the EL6 driver:
drivers/net/ixgbe/ixgbe_type.h
/* VF Device IDs */
#define IXGBE_DEV_ID_82599_VF 0x10ED
This device ID and the function ixgbe_enable_sriov() have been present since RHEL 6.2.
I assume there's some earlier bug they consider the platform vulnerable to hitting. Here's the changelog on Linus' tree from 2.2.0 to 2.6.0 (there was never a 2.4.x here):
$ git log --oneline c1a7e1e^..9cd9130 drivers/net/ethernet/intel/ixgbevf
9cd9130 ixgbevf: Update version string
795180d ixgbevf: Make sure jumbo frames are set correctly after PF reset
31a1b37 ixgbevf: Add support to recognize 100mb link speed
b3f4d59 intel: make wired ethernet driver message level consistent (rev2)
f794e7e ixgbevf: print MAC via printk format specifier
1a0d6ae rename dev_hw_addr_random and remove redundant second
dd48dc3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
5c47a2b ixgbevf: Update copyright notices
3a2c403 ixgbevf: Fix mailbox interrupt ack bug
e404dec drivers/net: Remove unnecessary k.alloc/v.alloc OOM messages
3d8fe98 ixgbevf: make operations tables const
b5417bf ixgbevf: fix sparse warnings
b47aca1 ixgbevf: make ethtool ops and strings const
375b27c ixgbevf: Prevent possible race condition by checking for message
f131a6c ixgbevf: Fix register defines to correctly handle complex expressions
8e58613 net: make vlan ndo_vlan_rx_[add/kill]_vid return error value
1f2149c net: remove netdev_alloc_page and use __GFP_COLD
84b4050 Sweep away N/A fw_version dustbunnies from the .get_drvinfo routine of a number of drivers
f85fa27 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
c8f44af net: introduce and use netdev_features_t for device features sets
ea99d83 intel: Convert <FOO>_LENGTH_OF_ADDRESS to ETH_ALEN
dbd9636 ixgbevf: Convert printks to pr_<level>
c1a7e1e ixgbevf: Update release version
At least the above result looks like it gives you a workaround.
Maybe Amazon could explain why they recommend a later ixgbevf? If it's to work around a bug, the chances of getting any patches into RHEL 6 now are probably low, so the solution would be to run the Sourceforge driver or use a different interface type which has a good driver in the EL6 kernel.
I checked ELRepo but their kmod-ixgbevf is 2.16. They may accept a request on their mailing list to update it if you prefer their packaged module over Sourceforge.
I can't offer much except some version information. Upstream has had ena since v4.8 though maybe the driver will run on 3.2 or later. Upstream has had ixgbevf since 2.6.34 though I didn't look into exactly which models were supported when.
I also dislike knowing how to get something working but not really knowing the underlying reason why it works.
Tom, FYI I've started a knowledgebase solution about this, "AWS Enhanced Networking requires ixgbevf driver version 2.14", and I'm trying to find out exactly which upstream patches are required so we can determine whether our modules can actually do this despite being based on 2.12.
I suspect that EL7 already includes the commit needed, maybe EL6 does as well, or maybe there's some corner-case bug resolved in 2.14 which most customers won't hit and they're just covering themselves against that.
I also just realised I missed a version above. ELRepo has 2.16 which is greater than 2.14 so that's a better option for those unable/unwilling to have a compiler installed.
I have seen customer environments which don't even have gcc/g++ in their Satellite repo, in order to comply with security requirements.
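The version numbers traded in this thread (bundled modules based on 2.12, a stated requirement of 2.14, ELRepo at 2.16) can be compared mechanically. The following is a rough Python sketch, assuming the driver reports a dotted numeric version that may carry a non-numeric suffix; the helper names and the "2.12.1-k" example string are hypothetical. On a real system the string could come from `modinfo -F version ixgbevf`. A version match alone does not prove the required upstream patches are present or absent, since a distribution kernel may backport fixes.

```python
import re


def parse_version(text):
    """Turn a string such as '2.12.1-k' into (2, 12, 1), dropping any suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no leading version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version, minimum="2.14"):
    """Return True if `version` is at least `minimum` (default from this thread)."""
    have = parse_version(version)
    need = parse_version(minimum)
    # Pad the shorter tuple with zeros so that '2.14' equals '2.14.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


# "2.12.1-k" is a hypothetical version string used only for illustration.
for candidate in ("2.12.1-k", "2.14", "2.16"):
    print(candidate, meets_minimum(candidate))
```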
