Chapter 3. Creating the Environment
3.1. Prerequisites
Prerequisites for creating this reference architecture include a supported Operating System and JDK. Refer to Red Hat documentation for JBoss Data Grid 7.0 Supported Configurations.
3.2. Downloads
Download the attachments to this document. The application code and configuration files in these attachments are used to set up the reference architecture environment:
https://access.redhat.com/node/2640031/40/0
If you do not have access to the Red Hat Customer Portal, see the Comments and Feedback section to contact us for alternative ways to access these files.
Download the Red Hat JBoss Data Grid 7.0.0 Server from the Red Hat Customer Portal.
Download Apache Spark 1.6 from the Apache Spark website download page. This reference architecture uses the spark-1.6.0-bin-hadoop2.6 build.
3.3. Installation
3.3.1. Apache Spark
Installing Apache Spark is very simple and mainly involves extracting the downloaded archive file on each node.
# tar xvf spark-1.6.0-bin-hadoop2.6.tgz
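Optionally, assuming the archive was extracted under /opt as in the rest of this chapter, the build can be sanity-checked by printing the Spark version banner:
# /opt/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --version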
3.3.2. JBoss Data Grid 7
JBoss Data Grid 7 does not require an installer; the downloaded archive simply needs to be extracted. This reference architecture requires the JBoss Data Grid 7.0.0 Server archive to be extracted on each node.
# unzip jboss-datagrid-7.0.0-server.zip -d /opt/
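A quick listing of the extracted directory confirms the expected layout, including the bin directory used for add-user.sh and domain.sh and the domain directory that holds the configuration files edited later in this chapter:
# ls /opt/jboss-datagrid-7.0.0-server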
3.4. Configuration
3.4.1. Overview
Various other types of configuration may be required for UDP and TCP communication. For example, Linux operating systems typically ship with a maximum socket buffer size that is lower than the default JGroups buffer size used by the cluster. It may be important to correct any such warnings observed in the JDG logs. For more information, please refer to the Administration and Configuration Guide for JDG 7.
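As a hedged example of such an operating system adjustment, the kernel's maximum socket buffer sizes can be raised with sysctl; the values below are illustrative only, and the appropriate sizes depend on the JGroups configuration in use. Add the same keys to /etc/sysctl.conf or a file under /etc/sysctl.d/ to make the change persistent.
# sysctl -w net.core.rmem_max=26214400
# sysctl -w net.core.wmem_max=1048576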
3.4.2. JDG 7 configuration
This reference architecture installs and configures a three-node cluster on separate machines. The names node1, node2 and node3 are used in this paper to refer to both the machines and the JDG 7 nodes on them.
Figure 3.1. Deployment Clusters

3.4.2.1. Adding Users
The first important step in configuring the JDG 7 clusters is to add the required users. They are Admin Users and Node Users.
1) Admin User
An administrator user is required for each domain. Assuming a user ID of admin and a password of password1! for this admin user:
On node1:
# /opt/jboss-datagrid-7.0.0-server/bin/add-user.sh admin password1!
This uses the non-interactive mode of the add-user script to add a management user with the given username and password.
2) Node Users
The next step is to add a user for each node that will connect to the cluster. That means creating two users called node2 and node3 (since node1 hosts the domain controller and does not need to use a password to authenticate against itself). This time, provide no argument to the add-user script and instead follow the interactive setup.
The first step is to specify that a management user is being added. The interactive process is as follows:
# /opt/jboss-datagrid-7.0.0-server/bin/add-user.sh
What type of user do you wish to add?
a) Management User (mgmt-users.properties)
b) Application User (application-users.properties)
(a): a
- Simply press enter to accept the default selection of a
Enter the details of the new user to add.
Realm (ManagementRealm) :
- Once again simply press enter to continue
Username : node2
- Enter the username (node2 or node3) and press enter
Password : password1!
- Enter password1! as the password, and press enter
Re-enter Password : password1!
- Enter password1! again to confirm, and press enter
About to add user 'node2' for realm 'ManagementRealm'
Is this correct yes/no? yes
- Type yes and press enter
Then continue:
Is this new user going to be used for one AS process to connect to another AS process?
e.g. for a slave host controller connecting to the master or for a Remoting connection for server to server EJB calls.
yes/no?
- Type yes and press enter
To represent the user add the following to the server-identities definition <secret value="cGFzc3dvcmQxIQ==" />
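The secret printed by add-user.sh is simply the Base64 encoding of the node user's password, which can be reproduced or verified from the command line:
# echo -n 'password1!' | base64
cGFzc3dvcmQxIQ==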
This concludes the setup of required management users to administer the domains and connect the slave machines.
3.4.3. JDG 7 Cache configuration
On node1, execute the following CLI commands to add two new JDG 7 distributed caches, sensor-data and sensor-avg-data. These two caches are used by the sample IoT sensor application.
# /opt/jboss-datagrid-7.0.0-server/bin/cli.sh
# embed-host-controller --domain-config=domain.xml --host-config=host.xml --std-out=echo
# /profile=clustered/subsystem=datagrid-infinispan/cache-container=clustered/configurations=CONFIGURATIONS/distributed-cache-configuration=sensor-data:add(start=EAGER,template=false,mode=SYNC)
# /profile=clustered/subsystem=datagrid-infinispan/cache-container=clustered/distributed-cache=sensor-data:add(configuration=sensor-data)
# /profile=clustered/subsystem=datagrid-infinispan/cache-container=clustered/configurations=CONFIGURATIONS/distributed-cache-configuration=sensor-avg-data:add(start=EAGER,template=false,mode=SYNC)
# /profile=clustered/subsystem=datagrid-infinispan/cache-container=clustered/distributed-cache=sensor-avg-data:add(configuration=sensor-avg-data)
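Still in the same CLI session, the new cache definitions can be checked with the generic read-resource operation, for example:
# /profile=clustered/subsystem=datagrid-infinispan/cache-container=clustered/distributed-cache=sensor-data:read-resource
# /profile=clustered/subsystem=datagrid-infinispan/cache-container=clustered/distributed-cache=sensor-avg-data:read-resource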
3.4.4. JDG 7 Cluster configuration
1) Update /opt/jboss-datagrid-7.0.0-server/domain/configuration/host-slave.xml on both node2 and node3, so that these two nodes can form a JDG cluster with node1.
Update the first line for node2, adding the host name.
<host name="node2" xmlns="urn:jboss:domain:4.0">
Update the first line for node3, adding the host name.
<host name="node3" xmlns="urn:jboss:domain:4.0">
2) In the host-slave.xml of node2 and node3, change the server-identities secret from the default sample value to the value below, which is the Base64-encoded password of the node user created in the last section.
<server-identities>
<secret value="cGFzc3dvcmQxIQ=="/>
</server-identities>
3) In node1's host.xml, delete the following server-two entry so that node1 runs only server-one.
<server name="server-two" group="cluster" auto-start="true">
Update the server name to server-two for node2 in host-slave.xml.
<server name="server-two" group="cluster"/>
Update the server name to server-three for node3 in host-slave.xml.
<server name="server-three" group="cluster"/>
After these changes, the cluster has three members: server-one on node1, server-two on node2 and server-three on node3.
3.5. Startup
To start the active domain, assuming that 10.19.137.34 is the IP address for the node1 machine, 10.19.137.35 for node2 and 10.19.137.36 for node3:
3.5.1. Start JDG 7.0 cluster
Log on to the three machines where JDG 7 is installed and navigate to the bin directory:
# cd /opt/jboss-datagrid-7.0.0-server/bin
To start the first node:
# ./domain.sh -bmanagement=10.19.137.34 -b=10.19.137.34
To start the second node:
# ./domain.sh -b=10.19.137.35 -bprivate=10.19.137.35 --master-address=10.19.137.34 --host-config=host-slave.xml
To start the third node:
# ./domain.sh -b=10.19.137.36 -bprivate=10.19.137.36 --master-address=10.19.137.34 --host-config=host-slave.xml
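Before moving on, it is worth confirming that node2 and node3 registered with the domain controller and that the servers formed a cluster. As a simple spot check (the exact log message text varies between versions), the Infinispan cluster view logged on node1 should list all three servers:
# grep -i 'cluster view' /opt/jboss-datagrid-7.0.0-server/domain/servers/server-one/log/server.log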
3.5.2. Stop JDG 7.0 cluster
To stop the JDG 7 cluster, press ctrl-c in the terminal running domain.sh on each node, or use "kill -9 PID" to stop the process.
3.5.3. Start Apache Spark cluster
Apache Spark currently supports three types of cluster managers:
- Standalone – a simple cluster manager included with Spark that makes it easy to set up a cluster.
- Apache Mesos – a general cluster manager that can schedule short-lived tasks and long-running services on shared compute resources.
- Hadoop YARN – the resource manager in Hadoop 2.
This reference architecture uses the standalone cluster mode.
Each streaming receiver will use a CPU core / thread from the processors allocated to Apache Spark. Ensure that the Spark application always has a higher number of CPU cores than receivers. Failure to allocate at least one extra processing core can result in receivers running but no data being processed by Spark.
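As an illustration only, assuming an application with two streaming receivers, the total number of cores requested from the standalone cluster can be capped with the standard --total-executor-cores option so that at least one core remains for processing. This flag is not part of the reference spark-submit command used later in Section 3.5.5.1; it is shown here only to make the sizing rule concrete:
# /opt/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master spark://10.19.137.34:7077 --total-executor-cores 3 --class org.Analyzer target/ref-analysis-jar-with-dependencies.jar "10.19.137.34:11222;10.19.137.35:11222;10.19.137.36:11222"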
3.5.3.1. Start Apache Spark standalone cluster
By default, Apache Spark uses port 8080 for its Web UI, which is coincidentally the same port used by JBoss Data Grid, as configured in its domain.xml:
<socket-binding name="rest" port="8080"/>
Therefore, an attempt to start Apache Spark on the same host as JDG 7 may result in the following exception due to a port conflict:
ERROR [org.jboss.msc.service.fail] (MSC service thread 1-3) MSC000001: Failed to start service jboss.datagrid-infinispan-endpoint.rest.rest-connector: org.jboss.msc.service.StartException in service jboss.datagrid-infinispan-endpoint.rest.rest-connector: DGENDPT10016: Could not start the web context for the REST Server [Server:server-one] at org.infinispan.server.endpoint.subsystem.RestService.start(RestService.java:110)
To avoid such a port conflict, please start Apache Spark with the --webui-port argument to use a different port.
On Node 1 (10.19.137.34), start both master and worker.
# cd /opt/spark-1.6.0-bin-hadoop2.6/sbin
# ./start-master.sh --webui-port 9080 -h 10.19.137.34
# ./start-slave.sh spark://10.19.137.34:7077 --webui-port 9081
On Node 2 (10.19.137.35), start one worker.
# cd /opt/spark-1.6.0-bin-hadoop2.6/sbin
# ./start-slave.sh spark://10.19.137.34:7077 --webui-port 9081
On Node 3 (10.19.137.36), start one worker.
# cd /opt/spark-1.6.0-bin-hadoop2.6/sbin
# ./start-slave.sh spark://10.19.137.34:7077 --webui-port 9081
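To check that the daemons came up, the jps utility from the JDK lists the running Java processes on each node (the Spark master appears as Master and each worker as Worker), and the master's web UI on the port chosen above can also be inspected:
# jps
# curl -s http://10.19.137.34:9080/ | grep -i worker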
3.5.4. Stop Apache Spark cluster
On Node 1, stop both the worker and the master.
# cd /opt/spark-1.6.0-bin-hadoop2.6/sbin
# ./stop-slave.sh
# ./stop-master.sh
On Node 2 and Node 3, only the worker needs to be stopped.
# cd /opt/spark-1.6.0-bin-hadoop2.6/sbin
# ./stop-slave.sh
The whole Apache Spark cluster can also be started and stopped with the launch scripts sbin/start-all.sh and sbin/stop-all.sh, which require additional configuration. For details, please refer to Cluster Launch Scripts.
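As a minimal sketch of that additional configuration (assuming password-less SSH from node1 to the worker machines), the launch scripts read the worker host names from conf/slaves on the master node, one per line:
# cat /opt/spark-1.6.0-bin-hadoop2.6/conf/slaves
10.19.137.34
10.19.137.35
10.19.137.36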
3.5.5. Start IoT sensor application
3.5.5.1. Start Spark analysis application
# /opt/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master spark://10.19.137.34:7077 --deploy-mode cluster --supervise --class org.Analyzer target/ref-analysis-jar-with-dependencies.jar "10.19.137.34:11222;10.19.137.35:11222;10.19.137.36:11222"
The arguments provided to spark-submit are as follows:
- master: the master URL for the cluster
- deploy-mode cluster: deploy the driver on the worker nodes (cluster mode)
- supervise: makes sure that the driver is automatically restarted if it fails with a non-zero exit code
- class: the entry point of the application
- The last argument is the JDG 7 cluster address list, which is used by the Spark connector.
For more information on how to use spark-submit, please refer to the Apache Spark documentation on submitting applications.
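In standalone cluster mode, spark-submit also accepts --kill and --status for managing a previously submitted driver. The driver ID below is only a placeholder for the submission ID reported in the spark-submit output, not a real value from this environment:
# /opt/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master spark://10.19.137.34:7077 --status driver-20160101000000-0000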
3.5.5.2. Start Client application
# java -jar target/temperature-client-jar-with-dependencies.jar 10.19.137.34 shipment1 shipment5 shipment9
The arguments to this application include:
- The first argument is the address the Hot Rod Java client uses to connect to JDG 7, in this example 10.19.137.34. Since the JDG cluster has three nodes (10.19.137.34, 10.19.137.35 and 10.19.137.36), any one of the three IP addresses will work.
- The remaining arguments are the shipment ID prefixes the client listens for. The match does not have to be exact; for example, "shipment1" brings back all shipments with an ID starting with "shipment1", such as shipment101 or shipment18 (see the example below).
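As a usage illustration of the argument handling described above, the client could equally be pointed at another cluster member and given a single prefix; the shipment2 prefix here is arbitrary:
# java -jar target/temperature-client-jar-with-dependencies.jar 10.19.137.35 shipment2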
3.5.5.3. Start Sensor application
# java -jar target/temperature-sensor-jar-with-dependencies.jar 10.19.137.34
The first argument is the address the Hot Rod Java client uses to connect to JDG 7, in this example 10.19.137.34. Since the JDG cluster has three nodes (10.19.137.34, 10.19.137.35 and 10.19.137.36), any one of the three IP addresses is fine.
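While the sensor application is running, the JDG REST endpoint on port 8080 (the rest socket binding shown earlier) can offer a quick, independent spot check of the sensor-data cache. Whether keys are listed, in what format, and whether authentication is required depends on the JDG REST endpoint configuration, so treat this strictly as an optional check:
# curl -s -H 'Accept: text/plain' http://10.19.137.34:8080/rest/sensor-data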
3.5.6. Stop IoT sensor application
To stop the IoT sensor applications, press ctrl-c or use "kill -9 PID" to stop the process. Otherwise, all three applications are set up to run for 24 hours.
