Open a terminal session on the YARN Master Server and run the following commands:
# chown -R yarn:hadoop /mnt/brick1/hadoop/yarn/
# chmod -R 0755 /mnt/brick1/hadoop/yarn/
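To confirm that the change took effect, the ownership and mode of the directory can be checked with stat. The following is a minimal sketch, assuming GNU coreutils stat (the -c option); the helper name check_owner_mode is hypothetical and is not part of Hadoop or Ambari.

```shell
# Hypothetical helper: succeeds only if the path has the expected
# owner:group and octal mode. Assumes GNU coreutils `stat -c`.
check_owner_mode() {
    path=$1
    expected_owner=$2   # for example yarn:hadoop
    expected_mode=$3    # for example 755 (stat prints no leading zero)
    actual=$(stat -c '%U:%G %a' "$path") || return 2
    if [ "$actual" = "$expected_owner $expected_mode" ]; then
        echo "OK: $path is $actual"
    else
        echo "MISMATCH: $path is $actual, expected $expected_owner $expected_mode" >&2
        return 1
    fi
}

# Example:
#   check_owner_mode /mnt/brick1/hadoop/yarn yarn:hadoop 755
```

The helper checks only the path it is given; because chown and chmod were run with -R, a spot check of the top-level directory is usually enough to catch a mistyped command.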
Prior to submitting any jobs, ensure that the trusted storage pool is running. Launch the Ambari Dashboard (http://ambari-server-hostname) and select the YARN service and then click the button.
Stopping and starting the services takes some time. If one of the services fails to start, selecting that service and restarting it will often bring it up.
The default volume (usually HadoopVol) must always be running when you are running Hadoop Jobs on other volumes. This is because the user directories for all the deployed Hadoop processes are stored on this volume. For example, if you have created and enabled three volumes for use with Hadoop (HadoopVol, MyVolume1, MyVolume2) and you are running a Hadoop Job that reads from MyVolume1 and writes to MyVolume2, then HadoopVol must still be running.
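Before submitting a job against another volume, it can help to confirm that the default volume is started. The following is a minimal sketch, assuming the volumes are Gluster volumes managed with the gluster command-line tool and that `gluster volume info` prints a `Status:` line; the helper name volume_is_started is hypothetical.

```shell
# Hypothetical helper: reads the output of `gluster volume info <VOLNAME>`
# on standard input and succeeds only if it reports "Status: Started".
volume_is_started() {
    grep -q '^Status:[[:space:]]*Started[[:space:]]*$'
}

# Example (run on a storage node; HadoopVol is the usual default volume name):
#   gluster volume info HadoopVol | volume_is_started \
#       || echo "HadoopVol is not started" >&2
```

Reading from standard input keeps the check independent of how the volume information is obtained, so the same helper can be applied to saved output.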
To test your trusted storage pool, open a shell on the YARN Master Server, su to one of the users you have enabled for Hadoop (such as tom), change to the /usr/lib/hadoop/ directory, and submit a Hadoop Job:
# su tom
# cd /usr/lib/hadoop
# bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.4.0.2.1.7.0-784.jar teragen 1000 in
TeraGen only generates data; TeraSort reads and sorts the output of TeraGen. To fully test that the cluster is operational, run TeraSort as well:
# bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.4.0.2.1.7.0-784.jar terasort in out
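The name of the examples jar embeds the exact Hadoop and HDP build, so it differs between installations. The following is a minimal sketch for locating the jar instead of typing the version by hand; the helper name find_examples_jar is hypothetical, and the default directory is simply the one used in the commands above.

```shell
# Hypothetical helper: print the first hadoop-mapreduce-examples jar
# found in the given directory (default: /usr/lib/hadoop-mapreduce).
find_examples_jar() {
    dir=${1:-/usr/lib/hadoop-mapreduce}
    for jar in "$dir"/hadoop-mapreduce-examples-*.jar; do
        # If nothing matches, the unexpanded pattern is not a file and is skipped.
        if [ -f "$jar" ]; then
            echo "$jar"
            return 0
        fi
    done
    echo "no examples jar found in $dir" >&2
    return 1
}

# Example:
#   jar=$(find_examples_jar) && bin/hadoop jar "$jar" terasort in out
```

If more than one matching jar is installed, the helper returns the first match in glob order, so check the printed path before relying on it.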
For more information on using specific components within the Hadoop Ecosystem, see Chapter 2. Understanding the Hadoop Ecosystem in the Hortonworks Data Platform documentation.