Network traffic grinds to a halt under heavy NFS traffic after upgrading from RHEL 5.9 to RHEL 5.10
Issue
- This is a Netbackup server and it effectively hung. We could no longer log in over the network or on the console.
- After upgrading from RHEL 5.9 to RHEL 5.10, system throughput on an interface used by NFS was badly degraded.
- Netbackup runs a lot of parallel jobs. When this happens, the NICs run at full speed for a while, then seem to freeze, and few packets get through any more. It is as if the NICs go offline, although ssh logins eventually succeed. My backup admin then kills off a few jobs, which gets it running at full speed (120MB/s) again. So it looks like NIC usage goes really high, but instead of staying high and queueing up the traffic, it stops all traffic until someone clears out a few of the parallel jobs. This is all NFS traffic.
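The stall described above (throughput dropping from about 120MB/s to almost nothing) could be confirmed by sampling the per-interface byte counters in /proc/net/dev. The following is only a diagnostic sketch and not part of the reported issue or its resolution: the function names `parse_proc_net_dev`, `rate_mb_per_s`, `read_counters` and `watch` are hypothetical, it assumes a Python interpreter is available on the host, and it ignores counter wraparound.

```python
import time


def parse_proc_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        # Split on the colon first: older kernels print "eth0:1234" with no space.
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # not a regular counter line
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rate_mb_per_s(before, after, seconds):
    """Return {interface: (rx_MB_per_s, tx_MB_per_s)} between two samples."""
    rates = {}
    for name in after:
        if name not in before:
            continue
        rx = (after[name][0] - before[name][0]) / (seconds * 1e6)
        tx = (after[name][1] - before[name][1]) / (seconds * 1e6)
        rates[name] = (rx, tx)
    return rates


def read_counters(path="/proc/net/dev"):
    """Read and parse the current counters from the given path."""
    f = open(path)
    try:
        return parse_proc_net_dev(f.read())
    finally:
        f.close()


def watch(interval, samples):
    """Print per-interface throughput every `interval` seconds, `samples` times."""
    previous = read_counters()
    for _ in range(samples):
        time.sleep(interval)
        current = read_counters()
        rates = rate_mb_per_s(previous, current, float(interval))
        for name in sorted(rates):
            print("%s rx=%.1f MB/s tx=%.1f MB/s" % (name, rates[name][0], rates[name][1]))
        previous = current
```

Calling `watch(5, 12)` while the backup jobs are running would print one line per interface every five seconds for a minute. A sudden drop on the interface carrying the NFS traffic, with the jobs still active, would match the behaviour reported here.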
Environment
- RHEL 5.10
- kernel-2.6.18-371.1.2.el5
- Netbackup
- Dual CPU system with 6GB RAM
- NFS
- Ethernet driver bnx2
eth0 link=up 1000Mb/s full (autoneg=Y) drv bnx2 v2.1.11 / fw bc 1.9.6
eth1 link=up 1000Mb/s full (autoneg=Y) drv bnx2 v2.1.11 / fw bc 1.9.6
- BIOS:
HP, version P56, 05/02/2011
System:
Mfr: HP
Prod: ProLiant DL380 G5
Vers: Not Specified
Ser: XXXXXXX
UUID: 34313734-XXXX-YYYY-ZZZZ-333030334A4D
- Network cards
NC373i Integrated Multifunction Gigabit Server Adapter
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
05:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
+-1c.0-[02-03]----00.0-[03]----00.0 Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet
+-1c.1-[04-05]----00.0-[05]----00.0 Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet