Received invalid rebalance confirmation from NodeX in a cluster.

Solution Unverified - Updated -

Environment

  • Red Hat JBoss Data Grid (JDG)
    • 6.x
    • 7.x
  • Red Hat JBoss Enterprise Application Platform (EAP)
    • 6.x
    • 7.x

Issue

  • When 6 JDG nodes are started in parallel, the following WARN message appears. What does it mean, and how can nodes be run in parallel?
(NodeA-root 2013/07/24-17:08:02)# 2013-07-24 17:08:02.405 [WARN] [OOB-6,shared=tcp] ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=CacheX, type=REBALANCE_CONFIRM, sender=NodeB/clustered, joinInfo=null, topologyId=3, currentCH=null, pendingCH=null, throwable=null, viewId=3}: org.infinispan.CacheException: Received invalid rebalance confirmation from NodeB/clustered for cache CacheX, we don't have a rebalance in progress
  • The above exception is logged when two nodes (EAP instances) of a cluster are started simultaneously.
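
When several nodes log this warning, it can help to pull the sender, cache name and topology ID out of each line. The following is a minimal sketch that assumes the log line has the same layout as the example above; the class name RebalanceWarningSketch and its regular expression are illustrative assumptions and are not part of JDG or Infinispan.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an Infinispan API: extracts fields from an ISPN000071
// "invalid rebalance confirmation" log line shaped like the example above.
final class RebalanceWarningSketch {

    // Assumed layout: "... ISPN000071: ... type=REBALANCE_CONFIRM, sender=<node>, ...
    // topologyId=<n>, ... for cache <name>, ..."
    private static final Pattern WARNING = Pattern.compile(
        "ISPN000071: .*?type=REBALANCE_CONFIRM, sender=([^,]+),"
            + ".*?topologyId=(\\d+),"
            + ".*?for cache ([^,\\s]+),");

    /** Returns {sender, cache, topologyId}, or empty if the line is a different message. */
    static Optional<String[]> parse(String line) {
        Matcher m = WARNING.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {m.group(1), m.group(3), m.group(2)});
    }

    public static void main(String[] args) {
        String line = "2013-07-24 17:08:02.405 [WARN] [OOB-6,shared=tcp] ISPN000071: Caught exception "
            + "when handling command CacheTopologyControlCommand{cache=CacheX, type=REBALANCE_CONFIRM, "
            + "sender=NodeB/clustered, joinInfo=null, topologyId=3, currentCH=null, pendingCH=null, "
            + "throwable=null, viewId=3}: org.infinispan.CacheException: Received invalid rebalance "
            + "confirmation from NodeB/clustered for cache CacheX, we don't have a rebalance in progress";
        parse(line).ifPresent(f ->
            System.out.println("sender=" + f[0] + " cache=" + f[1] + " topologyId=" + f[2]));
    }
}
```

Lines that do not match the pattern are returned as empty, so the same helper can be used to separate this known warning from any other errors in the log.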

Resolution

  • This message is harmless and can be ignored if there are no other errors.

  • This message can be avoided by not booting multiple instances simultaneously. Try to boot the nodes one by one.
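
Booting the nodes one by one can be scripted. The sketch below is only an outline of the idea: it starts each node, waits until that node reports that it has joined, and only then starts the next. The Node type, its start action and its joined check are hypothetical placeholders; how a node is actually launched and how cluster membership is detected depends on the environment and is not specified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Illustrative only: sequential start-up with a readiness wait between nodes.
final class StaggeredBootSketch {

    /** Hypothetical handle for one node: how to start it and how to tell it has joined. */
    static final class Node {
        final String name;
        final Runnable start;
        final BooleanSupplier joined;

        Node(String name, Runnable start, BooleanSupplier joined) {
            this.name = name;
            this.start = start;
            this.joined = joined;
        }
    }

    /** Starts the nodes in order; returns the names in the order they joined. */
    static List<String> bootOneByOne(List<Node> nodes, long pollMillis, long timeoutMillis) {
        List<String> started = new ArrayList<>();
        for (Node node : nodes) {
            node.start.run();
            long deadline = System.currentTimeMillis() + timeoutMillis;
            // Do not start the next node until this one has joined the cluster.
            while (!node.joined.getAsBoolean()) {
                if (System.currentTimeMillis() > deadline) {
                    throw new IllegalStateException(
                        node.name + " did not join within " + timeoutMillis + " ms");
                }
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for " + node.name, e);
                }
            }
            started.add(node.name);
        }
        return started;
    }

    public static void main(String[] args) {
        // Stand-in nodes that "join" as soon as they are started.
        List<Node> nodes = new ArrayList<>();
        for (String name : List.of("NodeA", "NodeB", "NodeC")) {
            boolean[] up = {false};
            nodes.add(new Node(name, () -> up[0] = true, () -> up[0]));
        }
        System.out.println(bootOneByOne(nodes, 10, 1000));
    }
}
```

In a real deployment the start action might launch the server process and the joined check might inspect the server log or a management interface; both are left as assumptions here.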

Root Cause

This message can appear when cluster members are booted simultaneously. The sequence of events is:

  • A node joins or leaves, which starts a rebalance.
  • Infinispan constructs the REBALANCE_START message and expects replies from the current members.
  • A new node joins before the REBALANCE_START message is sent.
  • The REBALANCE_START message is sent to all members (the current members plus the new member).
  • The new member receives the REBALANCE_START message and replies.
  • Infinispan does not know about this member, so it prints the WARN message.
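
The sequence above can be illustrated with a toy model. This is not Infinispan code: the class RebalanceRaceSketch and the method simulate are hypothetical, and the model assumes only what the list describes, namely that the set of expected repliers is fixed before the message is sent to the (possibly larger) current membership.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model of the race described above; all names are hypothetical.
final class RebalanceRaceSketch {

    /** Returns the warnings produced by one simulated rebalance. */
    static List<String> simulate(List<String> membersAtPlanning, List<String> lateJoiners) {
        // The rebalance is planned: replies are expected from the members known at this point.
        Set<String> expected = new LinkedHashSet<>(membersAtPlanning);

        // More nodes join before REBALANCE_START is sent.
        List<String> view = new ArrayList<>(membersAtPlanning);
        view.addAll(lateJoiners);

        // REBALANCE_START goes to everyone in the current view, and each member replies.
        List<String> warnings = new ArrayList<>();
        for (String node : view) {
            if (!expected.remove(node)) {
                // Reply from a member that was not part of the planned rebalance.
                warnings.add("Received invalid rebalance confirmation from " + node);
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        // NodeB joins after planning: prints [Received invalid rebalance confirmation from NodeB]
        System.out.println(simulate(List.of("NodeA"), List.of("NodeB")));
        // No late joiner: prints []
        System.out.println(simulate(List.of("NodeA", "NodeB"), List.of()));
    }
}
```

The second call shows why booting nodes one at a time avoids the message: when no node joins between planning and sending, every reply comes from an expected member.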

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.