ping and ibnetdiscover not working with realtime kernel after opensm server is restarted
Issue
We experienced several issues with our InfiniBand infrastructure after restarting our primary opensm server:
- Some servers were not able to `ping` each other over the IPoIB network, while the `ibping` command worked fine.
- The same servers were not able to run the `ibnetdiscover` command because it hung.
- The problem did not affect all servers, and we resolved it by rebooting the affected nodes.
- Sometimes, while switching between different opensm masters, a random opensm client crashes and dumps a vmcore.
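
To tell a command that hangs apart from one that simply fails, it can help to run it under a time limit. The following is a minimal sketch, not part of the original report: it assumes the GNU coreutils `timeout` utility is installed, and `check_cmd` is a hypothetical helper name chosen for illustration.

```shell
# check_cmd SECONDS COMMAND [ARGS...]
# Prints "ok" if the command exits 0 within the limit, "hung" if
# timeout had to stop it (exit status 124), and "failed" otherwise.
check_cmd() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    case $status in
        0)   echo ok ;;
        124) echo hung ;;
        *)   echo failed ;;
    esac
}

# Example use on an affected node (the peer address is a placeholder):
#   check_cmd 10 ibnetdiscover
#   check_cmd 10 ping -c 3 192.0.2.10
```

On a node showing the symptoms above, the `ibnetdiscover` check would be expected to report `hung` rather than `failed`. If the process is blocked uninterruptibly inside the kernel, `timeout` itself may not return, so treat a stalled check as a hang as well.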
Environment
- Red Hat Enterprise Linux 6
- kernel-3.0.36-rt57.66.el6rt.x86_64
