JGroups RouterStubManager race condition causes one or more gossip routers to never be reconnected

Solution Verified - Updated -

Issue

After a planned network activity, the RH-SSO pods were unable to reconnect. Only restarting all RH-SSO pods resolved the issue.
The RH-SSO nodes seem to lose the connection to one of the gossip routers and never try to reconnect to it.
When a network issue disconnects 2 or more gossip routers, a race condition may prevent the affected RouterStub entries from being removed from the list (the instance variable 'stubs'). The problem appears to exist up to JGroups 4.2.22 (the latest 4.x version).
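
As an illustration only, the sketch below replays one way a removal from a shared list can be lost when two disconnect handlers run at the same time. It is not the JGroups source: the class StubListRaceSketch, its methods, and the router names are hypothetical, and the copy-then-publish update pattern is an assumption chosen to make the lost update easy to see. The interleaving is replayed step by step in a single thread so the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a manager holding a 'stubs' list; NOT the JGroups class.
class StubListRaceSketch {
    // Assumed shape: a volatile reference that is replaced by a modified copy.
    private volatile List<String> stubs =
            new ArrayList<>(Arrays.asList("router-a", "router-b", "router-c"));

    // Step 1 of an unsynchronized removal: copy the current list and drop one entry.
    List<String> snapshotWithout(String stub) {
        List<String> copy = new ArrayList<>(stubs);
        copy.remove(stub);
        return copy;
    }

    // Step 2 of an unsynchronized removal: publish the modified copy.
    void publish(List<String> newStubs) {
        stubs = newStubs;
    }

    // Both steps under one lock, so no other removal can interleave between them.
    synchronized void removeAtomically(String stub) {
        publish(snapshotWithout(stub));
    }

    List<String> current() {
        return stubs;
    }

    // Replays the problematic interleaving: both handlers copy before either publishes.
    static List<String> replayRace() {
        StubListRaceSketch manager = new StubListRaceSketch();
        List<String> handler1 = manager.snapshotWithout("router-a");
        List<String> handler2 = manager.snapshotWithout("router-b");
        manager.publish(handler1);
        manager.publish(handler2); // overwrites handler1's result; router-a is back
        return manager.current();
    }

    // Same two removals, each performed as one atomic step.
    static List<String> replayAtomic() {
        StubListRaceSketch manager = new StubListRaceSketch();
        manager.removeAtomically("router-a");
        manager.removeAtomically("router-b");
        return manager.current();
    }

    public static void main(String[] args) {
        System.out.println("racy:   " + replayRace());   // prints "racy:   [router-a, router-c]"
        System.out.println("atomic: " + replayAtomic()); // prints "atomic: [router-c]"
    }
}
```

In the racy replay, the entry for router-a survives even though its removal was requested, which is the kind of stale entry the issue describes. Whether the actual defect follows this exact pattern is not confirmed by this article.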

Environment

  • Red Hat Single Sign-On
    • 7.6.2
  • JBoss Enterprise Application Platform
    • 7.2.12
