OpenSM doesn't balance routing properly
Issue
During a system shutdown and reboot for building maintenance, two IB switches (of 582 total) needed to be rebooted to get fully configured. Routing load-balancing issues seem to be causing sub-optimal performance. Bisection bandwidth for eight 16-node subclusters (referred to as IRUs) is between 5MB/s and 47MB/s, where the expected rate is 57MB/s. Re-cycling the fabric does not seem to resolve the issue.
Environment
Red Hat Enterprise Linux 6.2
The system is a 2304-node enhanced hypercube SGI ICE 8400.
opensm-libs-3.3.13-1.el6.x86_64
opensm-3.3.13-1.el6.x86_64