Chapter 21. Administering the Hortonworks Data Platform on Red Hat Storage
The following are the advantages of Hadoop Compatible Storage with Red Hat Storage:
- Provides Hadoop with file-based access to Red Hat Storage volumes while simultaneously supporting POSIX features for the volumes, such as NFS mounts, FUSE mounts, snapshots, and geo-replication.
- Eliminates the need for a centralized metadata server (HDFS primary and redundant NameNodes) by replacing HDFS with Red Hat Storage.
- Provides compatibility with MapReduce and Hadoop ecosystem applications with no code rewrite required.
- Provides a fault-tolerant file system.
- Allows compute and data to be co-located, and allows Hadoop jobs to run across multiple namespaces by using multiple Red Hat Storage volumes.
21.1. Deployment Scenarios
Table 21.1. Component Overview
|Component||Description|
|Ambari||Management console for the Hortonworks Data Platform|
|Red Hat Storage Console||(Optional) Management console for Red Hat Storage|
|YARN Resource Manager||Scheduler for the YARN cluster|
|YARN Node Manager||Worker for the YARN cluster on a specific server|
|Job History Server||Logs the history of submitted YARN jobs|
|glusterd||The Red Hat Storage process on a given server|
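To see which of the components in Table 21.1 are running on a given server, a small check along the following lines may help. This is a minimal sketch and not part of the product: the function names `components_from_jps` and `local_components` are hypothetical, and it assumes that the JDK `jps` tool and `pgrep` are available on the server. The daemon class names it looks for (`ResourceManager`, `NodeManager`, `JobHistoryServer`) are the standard Hadoop YARN names. Note that `jps` normally lists only the JVMs owned by the user who runs it, so the result can be incomplete.

```python
import shutil
import subprocess

# Map the JVM main-class names printed by `jps` to the component names
# used in Table 21.1. These are the standard Hadoop YARN daemon class names.
JPS_TO_COMPONENT = {
    "ResourceManager": "YARN Resource Manager",
    "NodeManager": "YARN Node Manager",
    "JobHistoryServer": "Job History Server",
}


def components_from_jps(jps_output):
    """Return the Table 21.1 component names found in `jps` output, sorted."""
    found = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line is "<pid> <main class>"; the class name may be absent.
        if len(parts) >= 2 and parts[1] in JPS_TO_COMPONENT:
            found.add(JPS_TO_COMPONENT[parts[1]])
    return sorted(found)


def local_components():
    """Best-effort check of the local server (hypothetical helper).

    Tools that are not installed are skipped, so an empty list does not
    prove that no component is running.
    """
    found = []
    if shutil.which("jps"):
        result = subprocess.run(
            ["jps"], capture_output=True, text=True, check=False
        )
        found.extend(components_from_jps(result.stdout))
    if shutil.which("pgrep"):
        # `pgrep -x` matches the exact process name and exits 0 on a match.
        result = subprocess.run(
            ["pgrep", "-x", "glusterd"], capture_output=True, check=False
        )
        if result.returncode == 0:
            found.append("glusterd")
    return found


if __name__ == "__main__":
    print(local_components())
```

Keeping the parsing in `components_from_jps` separate from the process calls means the mapping can be checked against captured `jps` output without access to a live cluster.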
21.1.1. Red Hat Storage Trusted Storage Pool with Two Additional Servers
Figure 21.1. Recommended Deployment Topology for Large Clusters
21.1.2. Red Hat Storage Trusted Storage Pool with One Additional Server
Figure 21.2. Recommended Deployment Topology for Smaller Clusters
21.1.3. Red Hat Storage Trusted Storage Pool Only
Figure 21.3. Evaluation Deployment Topology Using the Minimum Number of Servers
21.1.4. Deploying Hadoop on an Existing Red Hat Storage Trusted Storage Pool
21.1.5. Deploying Hadoop on a New Red Hat Storage Trusted Storage Pool
The setup_cluster.sh script can build the storage pool for you. The rest of the installation instructions describe how to create and enable volumes for use with Hadoop.