Chapter 26. Integration with Apache Spark

JBoss Data Grid includes a Spark connector that integrates tightly with Apache Spark and allows applications written in Java or Scala to use JBoss Data Grid as a backing data store. The connector supports the following operations:
  • Create an RDD from any cache
  • Write a key/value RDD to a cache
  • Create a DStream from cache-level events
  • Write a key/value DStream to a cache
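
Each of these operations needs to know which JBoss Data Grid server to contact and which cache to use. The following is a minimal sketch of how an application might prepare those settings as a java.util.Properties object before handing it to the connector. The property keys, the class name SparkConnectorConfig, the helper connectorProperties, the server address, and the cache name exampleCache are all assumptions made for illustration and are not taken from this chapter; check the keys against the documentation for the connector version in use.

```java
import java.util.Properties;

public class SparkConnectorConfig {

    // Assumed property keys (not stated in this chapter); verify them
    // against the connector version you are using.
    static final String SERVER_LIST_KEY = "infinispan.client.hotrod.server_list";
    static final String CACHE_NAME_KEY = "infinispan.rdd.cacheName";

    // Hypothetical helper: collects the server list and cache name
    // into a single Properties object.
    static Properties connectorProperties(String serverList, String cacheName) {
        Properties props = new Properties();
        props.setProperty(SERVER_LIST_KEY, serverList);
        props.setProperty(CACHE_NAME_KEY, cacheName);
        return props;
    }

    public static void main(String[] args) {
        // 11222 is the default Hot Rod port; the host and cache name
        // are placeholders.
        Properties props = connectorProperties("127.0.0.1:11222", "exampleCache");
        System.out.println(props.getProperty(SERVER_LIST_KEY));
        System.out.println(props.getProperty(CACHE_NAME_KEY));
    }
}
```

The sketch addresses the server by host and port rather than through an embedded cache manager, since the connector works against a running server.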

Note

Support for Apache Spark is only available in Remote Client-Server Mode.

26.1. Spark Dependencies

JBoss Data Grid uses Apache Spark 1.6 and supports Scala 2.10. The connector's Maven coordinates are:
<dependency>
    <groupId>org.infinispan</groupId>
    <artifactId>infinispan-spark_2.10</artifactId>
    <version>0.3.0.Final-redhat-1</version>
</dependency>