Chapter 8. Configuring Cluster Discovery

Running Data Grid on hosted services requires discovery mechanisms that are adapted to the network constraints that individual cloud providers impose. For instance, Amazon EC2 does not allow UDP multicast.

Data Grid can use the following cloud discovery mechanisms:

  • Generic discovery protocols (TCPPING and TCPGOSSIP)
  • JGroups PING protocols (KUBE_PING and DNS_PING)
  • Cloud-specific PING protocols

Note

If you embed Data Grid in your application, you must add the dependencies for the cloud-specific discovery protocols to your project.

8.1. TCPPING

TCPPING is a generic JGroups discovery mechanism that uses a static list of IP addresses for cluster members.

To use TCPPING, add the list of static IP addresses to the JGroups configuration file for each Data Grid node. The drawback is that TCPPING does not allow nodes to join Data Grid clusters dynamically.

TCPPING configuration example

<config>
      <TCP bind_port="7800" />
      <TCPPING timeout="3000"
           initial_hosts="${jgroups.tcpping.initial_hosts:localhost[7800],localhost[7801]}"
           port_range="1"
           num_initial_members="3"/>
...
...
</config>
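
One way to point a Data Grid node at a JGroups file such as the one above is to declare it as an external stack in the Data Grid configuration. The following is a minimal sketch rather than a definitive setup. The stack name tcpping and the file name jgroups-tcpping.xml are hypothetical, and the sketch assumes a Data Grid version that supports the stack-file element.

```xml
<infinispan>
    <jgroups>
        <!-- Hypothetical name and path; the file holds the TCPPING configuration shown above -->
        <stack-file name="tcpping" path="jgroups-tcpping.xml" />
    </jgroups>
    <cache-container>
        <!-- Use the external stack for cluster transport -->
        <transport stack="tcpping" />
    </cache-container>
</infinispan>
```

Because the list of hosts is static, every node needs the same JGroups file or the same value for the jgroups.tcpping.initial_hosts system property.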

Reference

JGroups TCPPING

8.2. Gossip Router

Gossip routers provide a centralized location on the network from which your Data Grid cluster can retrieve addresses of other nodes.

You inject the address of the Gossip router, in the format IP[PORT], into Data Grid nodes as follows:

  1. Pass the address as a system property to the JVM; for example, -DGossipRouterAddress="10.10.2.4[12001]".
  2. Reference that system property in the JGroups configuration file.

Gossip router configuration example

<config>
    <TCP bind_port="7800" />
    <TCPGOSSIP timeout="3000" initial_hosts="${GossipRouterAddress}" num_initial_members="3" />
...
...
</config>

8.3. DNS_PING

JGroups DNS_PING queries DNS servers to discover Data Grid cluster members in Kubernetes environments such as OKD and Red Hat OpenShift.

DNS_PING configuration example

<stack name="dns-ping">
...
    <dns.DNS_PING
      dns_query="myservice.myproject.svc.cluster.local" />
...
</stack>

8.4. KUBE_PING

JGroups KUBE_PING uses the Kubernetes API to discover Data Grid cluster members in environments such as OKD and Red Hat OpenShift.

KUBE_PING configuration example

<config>
    <TCP bind_addr="${match-interface:eth.*}" />
    <kubernetes.KUBE_PING />
...
...
</config>

KUBE_PING configuration requirements

  • Your KUBE_PING configuration must bind the JGroups stack to the eth0 network interface. Docker containers use eth0 for communication.
  • KUBE_PING uses environment variables inside containers for configuration. The KUBERNETES_NAMESPACE environment variable must specify a valid namespace. You can either hardcode it or populate it via the Kubernetes Downward API.
  • KUBE_PING requires additional privileges on Red Hat OpenShift. Assuming that oc project -q returns the current namespace and default is the service account name, you can run:

    $ oc policy add-role-to-user view system:serviceaccount:$(oc project -q):default -n $(oc project -q)

8.5. NATIVE_S3_PING

On Amazon Web Services (AWS), use the NATIVE_S3_PING protocol for discovery.

You can configure JGroups to use shared storage to exchange the details of Data Grid nodes. NATIVE_S3_PING uses Amazon S3 as the shared storage and requires both Amazon S3 and EC2 subscriptions.

NATIVE_S3_PING configuration example

<config>
    <TCP bind_port="7800" />
    <org.jgroups.aws.s3.NATIVE_S3_PING
            region_name="replace this with your region (e.g. eu-west-1)"
            bucket_name="replace this with your bucket name"
            bucket_prefix="replace this with a prefix to use for entries in the bucket (optional)" />
</config>

NATIVE_S3_PING dependencies for embedded Data Grid

<dependency>
  <groupId>org.jgroups.aws.s3</groupId>
  <artifactId>native-s3-ping</artifactId>
  <version>${version.jgroups.native_s3_ping}</version>
</dependency>

8.6. JDBC_PING

JDBC_PING uses JDBC connections to shared databases, such as Amazon RDS on EC2, to store information about Data Grid nodes.
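
A JDBC_PING configuration might look like the following minimal sketch. It assumes a PostgreSQL database. The host db.example.com, the database name jgroups, and the jgroups.jdbc.user and jgroups.jdbc.password system properties are hypothetical placeholders.

```xml
<config>
    <TCP bind_port="7800" />
    <!-- Hypothetical connection details; replace with those of your shared database -->
    <JDBC_PING connection_url="jdbc:postgresql://db.example.com:5432/jgroups"
               connection_username="${jgroups.jdbc.user}"
               connection_password="${jgroups.jdbc.password}"
               connection_driver="org.postgresql.Driver" />
    <!-- Remaining protocols of the stack omitted -->
</config>
```

The JDBC driver for the database must be on the classpath of each Data Grid node.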

Reference

JDBC_PING Wiki

8.7. AZURE_PING

On Microsoft Azure, use a generic discovery protocol or AZURE_PING, which uses shared Azure Blob Storage to store discovery information.

AZURE_PING configuration example

<azure.AZURE_PING
    storage_account_name="replace this with your account name"
    storage_access_key="replace this with your access key"
    container="replace this with your container name"
/>

AZURE_PING dependencies for embedded Data Grid

<dependency>
  <groupId>org.jgroups.azure</groupId>
  <artifactId>jgroups-azure</artifactId>
  <version>${version.jgroups.azure}</version>
</dependency>

8.8. GOOGLE_PING2

On Google Compute Engine (GCE), use a generic discovery protocol or GOOGLE_PING2, which uses Google Cloud Storage (GCS) to store information about the cluster members.

GOOGLE_PING2 configuration example

<org.jgroups.protocols.google.GOOGLE_PING2 location="${jgroups.google.bucket_name}" />

GOOGLE_PING2 dependencies for embedded Data Grid

<dependency>
  <groupId>org.jgroups.google</groupId>
  <artifactId>jgroups-google</artifactId>
  <version>${version.jgroups.google}</version>
</dependency>